Extended Thinking
Extended thinking is a Claude feature that lets the model allocate additional reasoning tokens before producing its final answer, with a user-controllable thinking budget set per request. It is distinct from dedicated reasoning models, which always reason internally — extended thinking is a toggle you enable when the task benefits from slower, deeper thought.
It pays off on math, code, multi-step planning, and evaluations that reward careful steps, and is usually not worth the extra cost on creative writing, simple recall, or short chat turns.
Example
A developer debugging a tricky race condition sets a 16k-token thinking budget on a Claude request and asks the model to trace the execution order of two coroutines. The model uses most of its budget working through interleavings internally, then returns a short answer identifying the exact ordering that triggers the bug. For a follow-up "reword this to be friendlier," the same developer disables extended thinking — the task does not need it, and the latency and cost are not justified.
Related Resources
AI Reasoning Models: The Complete 2026 Prompting Guide
The canonical 2026 guide to prompting reasoning models — what they actually are, the model landscape (o3, Claude extended thinking, Gemini Deep Think, DeepSeek R1), the universal anatomy of a strong reasoning prompt, per-model dialects, when not to use a reasoning model, and honest evaluation.
Extended Thinking Prompts for Claude (2026)
How to prompt Claude's extended thinking mode — when it helps, when it wastes budget, and how prompt structure shapes the reasoning process.
Claude 4 Prompting Guide: Adaptive Thinking, Extended Context, and Best Practices
Master Claude prompting with practical techniques for system prompts, XML formatting, extended thinking, and long-context workflows.