Skip to main content

Extended Thinking

Extended thinking is a Claude feature that lets the model allocate additional reasoning tokens before producing its final answer, with a user-controllable thinking budget set per request. It is distinct from dedicated reasoning models, which always reason internally — extended thinking is a toggle you enable when the task benefits from slower, deeper thought.

It pays off on math, code, multi-step planning, and evaluations that reward careful steps, and is usually not worth the extra cost on creative writing, simple recall, or short chat turns.

Example

A developer debugging a tricky race condition sets a 16k-token thinking budget on a Claude request and asks the model to trace the execution order of two coroutines. The model uses most of its budget working through interleavings internally, then returns a short answer identifying the exact ordering that triggers the bug. For a follow-up "reword this to be friendlier," the same developer disables extended thinking — the task does not need it, and the latency and cost are not justified.

Frequently asked questions

What is Extended Thinking?

Extended thinking is a Claude feature that lets the model allocate additional reasoning tokens before producing its final answer, with a user-controllable thinking budget set per request.

How does Extended Thinking work?

It pays off on math, code, multi-step planning, and evaluations that reward careful steps, and is usually not worth the extra cost on creative writing, simple recall, or short chat turns.

Can you give an example of Extended Thinking?

A developer debugging a tricky race condition sets a 16k-token thinking budget on a Claude request and asks the model to trace the execution order of two coroutines. The model uses most of its budget working through interleavings internally, then returns a short answer identifying the exact ordering that triggers the bug. For a follow-up "reword this to be friendlier," the same developer disables extended thinking — the task does not need it, and the latency and cost are not justified.