Tags: ReAct, prompt engineering, AI agents, tool use, reasoning, agentic AI

ReAct Prompting Guide: Reasoning Plus Acting for AI Agents (2026)

How the ReAct pattern works — interleaved reasoning, action, and observation. When ReAct beats chain-of-thought or pure tool use, and how to prompt for it.

SurePrompts Team
April 20, 2026
12 min read

TL;DR

ReAct interleaves reasoning and acting — the model thinks about what to do, takes an action, observes the result, then thinks again. It beats pure chain-of-thought on tool-requiring tasks and beats pure tool use on tasks needing reasoning.

ReAct is a prompting pattern where the model interleaves Reasoning and Acting — it thinks about what to do, takes an action (typically a tool call), observes the result, then reasons again. That loop repeats until the task is done. The pattern was introduced in a 2022 paper by Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models," and has since become the baseline for agentic systems. In 2026 most coding agents — Claude Code, Cursor, Aider — run a ReAct-style loop internally. Knowing the pattern helps you prompt those agents well.

What ReAct Is

The loop has three steps, and they repeat:

  • Thought — the model reasons about the current state and decides what to do next.
  • Action — the model calls a tool (search, read a file, run a query, hit an API).
  • Observation — the tool returns a result, which becomes part of the next thought.

Each cycle narrows the search space. The model is not planning the entire path up front and executing blindly — it reasons one step ahead, acts, looks at what came back, and adjusts. If a search returns nothing useful, the next thought says "that did not help — try a different query" rather than charging down a dead end.

The key word is interleaved. Pure chain-of-thought reasons but does not act. Pure tool use acts but reasons sparsely — it fires off calls without checking whether intermediate results make sense. ReAct alternates, which gives the model a chance to course-correct mid-task. See the ReAct prompting glossary entry for a compact definition.
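The loop above can be sketched as a small harness. This is an illustrative sketch, not any framework's API — `run_react`, the stub model, and the stub `search` tool are all invented for the example:

```python
# Minimal sketch of a ReAct harness. All names here (run_react, fake_model,
# the stub search tool) are illustrative, not a real framework's API.

def run_react(model, tools, question, max_steps=10):
    """Alternate Thought/Action/Observation until the model calls finish."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The model emits one Thought and one Action given the transcript so far.
        thought, action, args = model(transcript)
        transcript += f"Thought: {thought}\nAction: {action}({args})\n"
        if action == "finish":
            return args                      # the terminal action carries the answer
        observation = tools[action](args)    # run the tool, capture the result
        transcript += f"Observation: {observation}\n"
    return None                              # hit the step cap without finishing

# Tiny fake model and tool, just to show the loop shape end to end.
def fake_model(transcript):
    if "Observation:" not in transcript:
        return ("Look it up.", "search", "capital of France")
    return ("The observation has the answer.", "finish", "Paris")

answer = run_react(fake_model, {"search": lambda q: "Paris is the capital."},
                   "What is the capital of France?")
```

The point of the sketch is the shape: the transcript accumulates, every observation is visible to the next thought, and termination is an explicit action rather than the model falling silent.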

Why Interleaving Beats Planning-Then-Executing

The seductive alternative to ReAct is to plan the whole sequence up front, then execute it — the plan-and-execute pattern. It is cheaper, easier to audit, and works well when the environment is predictable.

But plans based on zero observations are often wrong. The model guesses a file is at src/lib/client.ts, writes a four-step plan to read it, and it is actually at lib/client.ts. Plan-and-execute either fails at step one or — worse — produces confident output from a bad assumption. ReAct fails earlier because step one is do the thing and see what comes back. If the first call errors, the next thought sees it and adapts.

The tradeoff: ReAct costs more per task (more calls, more tokens) and the trace is longer. Use it when the environment is uncertain or partial results matter. Use plan-and-execute when steps are well-defined and independent.

When ReAct Helps and When It Hurts

| Task shape | Best pattern | Why |
| --- | --- | --- |
| Math word problem, no tools | Chain-of-thought | No actions to take; reasoning alone solves it |
| Fact lookup, single API call | Pure tool use | One call, no intermediate reasoning needed |
| Multi-hop question with search | ReAct | Each search informs the next; partial results steer the path |
| Agentic coding task | ReAct (often inside a larger plan) | Environment is noisy — files, tests, shell output all require observation |
| Well-defined data pipeline | Plan-and-execute | Steps are independent; planning once is cheaper than per-step reasoning |
| Ambiguous, exploratory research | ReAct | The path is not knowable up front |

ReAct shines when partial results matter. Web search, document retrieval, code execution, API calls against real systems — all produce noisy output that changes what the next step should be. If an intermediate result can tell you "you are on the wrong track," ReAct lets the model notice and pivot.

ReAct hurts when there is nothing to observe. A prompt like "write a 500-word essay on honeybees" has no tool calls, no external state, no verification points. Wrapping it in a ReAct scaffold just adds ceremony. Plain chain-of-thought is lighter and produces the same output.

Prompting for ReAct in a Chat Model

Modern agent frameworks (LangChain, LangGraph, Anthropic's tool use, OpenAI's function calling) implement the ReAct loop for you. You define tools; the framework runs the loop. But you can also prompt a plain chat model to behave ReAct-style by scaffolding the format in the system prompt.

A minimal scaffold:

```text
You solve problems by interleaving reasoning and actions.

Available tools:
- search(query: str) -> list of snippets
- calc(expression: str) -> number
- finish(answer: str) -> terminate the loop

Format each step exactly as:

Thought: <one or two sentences of reasoning about what to do next>
Action: <tool name>(<arguments>)
Observation: <result of the action — you will be given this>

Repeat Thought/Action/Observation until the question is answered.
When you are confident in the answer, call finish(answer) and stop.
```

The user's question goes after this scaffold. The model emits Thought: and Action:. Your harness parses the action, runs it, and feeds the result back as Observation:. The model then emits the next Thought:. The loop terminates when the model calls finish.

This is an illustrative scaffold, not a production one. Real systems use structured tool calls (JSON) rather than free-form text to make parsing reliable and reduce hallucinated tools. But the shape is the same: scaffold the format, enumerate the tools, define a termination action.
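If you do run the text scaffold, the harness has to pull the tool name and argument out of each `Action:` line. A minimal sketch of that parsing (the regex and helper name are illustrative) also shows why structured calls are preferred: one malformed line and string matching silently breaks.

```python
import re

# Illustrative parser for the free-form scaffold above. Real systems should
# use structured tool calls (JSON / the framework's native format) instead.
ACTION_RE = re.compile(r'^Action:\s*(\w+)\((.*)\)\s*$')

def parse_action(line):
    """Return (tool_name, raw_args) from an 'Action: tool(args)' line, or None."""
    m = ACTION_RE.match(line.strip())
    if m is None:
        return None   # malformed line: re-prompt the model rather than guess
    return m.group(1), m.group(2).strip('"\'')

print(parse_action('Action: search("capital of France")'))
# → ('search', 'capital of France')
```

Returning `None` on a malformed line, instead of raising or guessing, gives the harness a clean place to re-prompt with a format reminder.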

ReAct in Modern Agents

Most agentic coding tools in 2026 run a ReAct-style loop by default. Claude Code plans a bit, calls Read, looks at the output, decides whether to Grep, calls it, then writes an Edit. Between each tool call there is a short reasoning step. You almost never write Thought: or Action: in those tools — the harness handles the scaffolding.

Your job with an agent that already runs ReAct is to give it good reasoning anchors. That means:

  • Naming the observations that matter. "Before editing, read lib/api/client.ts and confirm the function signature — do not guess." That tells the agent which observation to collect first.
  • Describing the stop condition. "Stop when pnpm test passes and git diff --name-only lists only the two files in scope." The loop has a clear exit.
  • Specifying what to do when an observation surprises the agent. "If the test file does not exist, stop and ask before creating one." Otherwise ReAct will keep improvising.

See the Claude Code prompting guide for what this looks like in a specific agent, and the broader pillar: The Complete Guide to Prompting AI Coding Agents. The tool-use prompting patterns guide covers the adjacent question of how to describe tools well so the agent picks the right one at each step.

Anatomy of a ReAct Prompt — Worked Example

Here is an illustrative trace for a multi-hop question. The question, scaffold, and trace are hypothetical and meant to show the shape of the pattern, not a real system.

```text
Question: Which of the two main characters in the novel
"The Remains of the Day" is older, and by how many years?

Thought: I need to identify the two main characters, find their ages
in the novel, and compare. Let me start by finding who they are.
Action: search("main characters The Remains of the Day novel Ishiguro")
Observation: The novel's two main characters are Stevens, the butler,
and Miss Kenton, the housekeeper.

Thought: Good. Now I need each character's age. Let me search for Stevens.
Action: search("Stevens age The Remains of the Day")
Observation: Stevens is described as having served for over 30 years;
his exact age is not stated, but he is generally understood to be in
his late 50s at the time of the 1956 road trip.

Thought: "Late 50s" is imprecise. Let me look for Miss Kenton first
and see if her age is more specific, then decide what to do.
Action: search("Miss Kenton age The Remains of the Day")
Observation: Miss Kenton is described as being a similar generation
to Stevens, with no specific age given in the novel.

Thought: Neither age is given precisely in the text. I should say so
rather than invent a number.
Action: finish("The novel does not state either character's exact age.
Both are portrayed as middle-aged to late-middle-aged adults of
roughly the same generation, so a numeric difference cannot be given
from the text alone.")
```

Three things to notice. First, each Thought refers back to the previous Observation — that is what makes it ReAct and not plan-and-execute. Second, when an observation was unhelpful (imprecise age), the next thought adapted rather than pretending the information was there. Third, the final action was finish with an honest answer — ReAct makes it easier to catch the moment when the available evidence will not support a confident answer, because the observations are visible.

A plan-and-execute run on the same question might have planned "search for each character's age, subtract, return the difference" and invented numbers when the searches came back vague, because the plan committed to producing a numeric answer before any evidence arrived.

ReAct vs. Plan-and-Execute

These are cousins, not competitors. The distinction is about where the reasoning lives.

  • ReAct: reason → act → observe → reason → act → observe. Reasoning is interleaved with every action. Good when the environment surprises you.
  • Plan-and-execute: reason once to produce a plan, execute the plan, optionally re-plan if it fails. Good when steps are independent and the environment is predictable.

Many production agents combine both. A high-level plan sketches the phases of the work ("find the bug, write a failing test, fix it, verify"). Within each phase, a ReAct loop handles the exploration and observation. The plan-and-execute prompting guide covers the outer loop; ReAct is how the inner loop usually runs. For longer-horizon or role-separated work, both often sit inside a multi-agent prompting setup where a planner agent dispatches ReAct-style worker agents.
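The combined shape can be sketched in a few lines. This is an illustrative skeleton, not a real framework: `react_phase` stands in for a full inner ReAct loop, and the phase names come from the example above.

```python
# Sketch of the combined pattern: a fixed outer plan (plan-and-execute),
# with a ReAct loop inside each phase. All names here are illustrative.

def react_phase(goal):
    """Stand-in for an inner ReAct loop: a real one would reason, act,
    and observe until the phase's goal is verifiably met."""
    return f"done: {goal}"

def run(plan):
    results = []
    for phase in plan:                      # outer loop: execute the plan in order
        results.append(react_phase(phase))  # inner loop: ReAct handles exploration
    return results

run(["find the bug", "write a failing test", "fix it", "verify"])
```

The outer loop is cheap and auditable; the expensive observation-driven reasoning is confined to the phases that actually need it.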

Common Anti-Patterns

  • Using ReAct for tasks with no tools. If there is nothing to observe, the loop collapses into chain-of-thought with extra ceremony. Fix: use plain chain-of-thought.
  • No termination action. The model keeps "reasoning" after the answer is in hand because nothing told it to stop. Fix: define an explicit finish or equivalent terminal tool, and require the model to call it.
  • Free-form action syntax. Asking the model to emit Action: search(...) as plain text makes parsing fragile and invites hallucinated tools. Fix: use structured tool calls (JSON or the framework's native format) rather than string matching.
  • Ignoring observation length. Each observation goes into the context window. Long tool outputs (full HTML pages, huge files) push the trace out of context after a few turns. Fix: truncate or summarize tool outputs before feeding them back.
  • Letting the loop run unbounded. ReAct can loop forever if the model keeps choosing actions. Fix: set a hard step limit (e.g. 15 iterations) and surface a timeout to the caller.
  • Dropping the scaffold mid-task. Some models start summarizing instead of emitting another Thought:/Action:. Fix: enforce the format with a lightweight validator; re-prompt if the output does not match.
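Two of the fixes above, the step cap and observation truncation, are mechanical enough to sketch. The constants and function names below are illustrative, not from any particular framework:

```python
MAX_STEPS = 15          # hard iteration cap: the loop cannot run unbounded
MAX_OBS_CHARS = 2000    # clip long tool outputs before they enter the context

def truncate(observation, limit=MAX_OBS_CHARS):
    """Clip a long observation, flagging the cut so the model knows data is missing."""
    if len(observation) <= limit:
        return observation
    return observation[:limit] + f"\n[truncated {len(observation) - limit} chars]"

def bounded_loop(step_fn):
    """Run step_fn until it reports done or the step cap is hit."""
    for i in range(MAX_STEPS):
        done, answer = step_fn(i)
        if done:
            return answer
    # Surface the timeout to the caller instead of looping silently forever.
    raise TimeoutError(f"ReAct loop hit the {MAX_STEPS}-step cap")
```

Flagging the truncation explicitly (rather than cutting silently) matters: a model that knows the observation was clipped can decide to fetch a narrower slice instead of reasoning from incomplete data.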

FAQ

How is ReAct different from chain-of-thought?

Chain-of-thought produces reasoning without taking actions — the model thinks step-by-step and outputs an answer. ReAct interleaves reasoning with tool calls, so each action's result can inform the next step. On a pure math problem they collapse to the same thing; on anything requiring external information, ReAct can check its work and chain-of-thought cannot.

Do I need to write Thought:/Action:/Observation: myself when using a modern agent?

Usually not. Claude Code, Cursor, Aider, LangGraph agents, and most 2026 frameworks run the loop internally using structured tool calls under the hood. You describe the tools and the task; the harness emits the reasoning steps and routes the tool calls. Knowing the pattern is still useful because it tells you how to prompt the agent — naming observations, defining stop conditions — and how to debug a trace that goes sideways.

When does ReAct actually underperform simpler patterns?

Two cases. First, tasks with no tools at all — ReAct adds overhead without adding capability, and plain chain-of-thought prompting is lighter. Second, well-defined workflows where every step is known in advance — a plan-and-execute run plans once and executes, which is cheaper than reasoning between every action. ReAct wins when the environment is noisy or uncertain, not when it is deterministic.

What research backs ReAct?

The pattern was introduced by Yao et al. in 2022 in a paper titled "ReAct: Synergizing Reasoning and Acting in Language Models." The paper reported that ReAct outperformed both chain-of-thought-only and act-only baselines on knowledge-intensive question-answering benchmarks, with the advantage coming largely from the model's ability to recover from bad tool calls by reasoning about the observation. Treat those as directional results rather than specific percentages; check the paper if you need the exact numbers.

How do I prevent the ReAct loop from running forever?

Three defenses. A hard iteration cap in the harness (stop after N steps regardless). An explicit finish tool the model is instructed to call when done, so termination is a first-class action. And a prompt-level stop condition tied to verifiable state ("stop when tests pass," "stop when the answer is in the last observation"). Layering all three beats relying on any single one.

Wrap-Up

ReAct is the pattern that made agents practical. It is a small idea — interleave thinking and acting, feed each observation back into the next thought — and it turns out to matter a lot when the world is noisy. In 2026 you rarely write the scaffold by hand, because the agents you use have it baked in, but understanding the loop is what lets you prompt those agents well: you are writing reasoning anchors, stop conditions, and tool descriptions that feed a loop you do not see. For the broader picture see the complete guide to prompting AI coding agents; for the nearest cousins in pattern space, see plan-and-execute prompting and multi-agent prompting.

Try it yourself

Build expert-level prompts from plain English with SurePrompts — 350+ templates with real-time preview.

Open Prompt Builder
