Skip to main content

Extended Thinking

Extended thinking is a Claude feature that lets the model allocate additional reasoning tokens before producing its final answer, with a user-controllable thinking budget set per request. It is distinct from dedicated reasoning models, which always reason internally — extended thinking is a toggle you enable when the task benefits from slower, deeper thought. It pays off on math, code, multi-step planning, and evaluations that reward careful steps, and is usually not worth the extra cost on creative writing, simple recall, or short chat turns.

Example

A developer debugging a tricky race condition sets a 16k-token thinking budget on a Claude request and asks the model to trace the execution order of two coroutines. The model uses most of its budget working through interleavings internally, then returns a short answer identifying the exact ordering that triggers the bug. For a follow-up "reword this to be friendlier," the same developer disables extended thinking — the task does not need it, and the latency and cost are not justified.

Put this into practice

Build polished, copy-ready prompts in under 60 seconds with SurePrompts.

Try SurePrompts