Self-refine is one of the simplest agentic patterns that works. The model produces an output, then — with no external feedback — critiques it against a rubric and revises. One model, three prompts, one loop. Where the model can recognize its own mistakes, it lifts quality. Where it cannot, it produces polished versions of the same error. This guide covers when to use it and how to tell which case you're in.
What Self-Refine Is
Self-refine is a three-step loop, all executed by the same model in the same session:
- Generate. Produce an initial output.
- Critique. Evaluate that output against a rubric — what's wrong, what's missing, what could be better. No outside feedback.
- Revise. Produce a new output addressing the critique.
The loop runs once or iterates. Each revision sees its predecessor and the critique of it. The key property: critique and revision happen without new information — the model uses its own judgment on its own output.
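The three steps reduce to a few lines of orchestration. A minimal sketch, assuming a `call_model` function that wraps whatever LLM API you use; here it is stubbed with canned responses so the shape of the loop is visible without an API key:

```python
def call_model(prompt: str) -> str:
    # Stub standing in for a real LLM API call; returns canned responses.
    if "Revise to address" in prompt:
        return "Draft v2 (deadline included)."
    if "Critique" in prompt:
        return "REVISE: the summary omits the deadline."
    return "Draft v1."

def self_refine(task: str) -> str:
    # 1. Generate an initial output.
    draft = call_model(f"Task: {task}")
    # 2. Critique it against a rubric, with no outside feedback.
    critique = call_model(f"Critique this output against the rubric:\n{draft}")
    if critique.startswith("PASS"):
        return draft
    # 3. Revise, with the draft and critique both in context.
    return call_model(
        f"Task: {task}\nPrevious output: {draft}\n"
        f"Critique: {critique}\nRevise to address every issue named."
    )

print(self_refine("Summarize the meeting notes."))
```

Note that the revise prompt carries the original task, the previous output, and the critique; dropping any of the three degrades the loop (see the anti-patterns below).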
It's related to chain-of-thought prompting: both ask the model to produce intermediate reasoning. The difference: chain-of-thought surfaces reasoning before the answer; self-refine produces the answer then reasons about it. The two compose.
Self-Refine vs Reflexion vs ReAct
Three patterns get confused because they all involve "try, evaluate, try again." They are not the same.
| Pattern | Feedback source | Memory | Loop structure |
|---|---|---|---|
| Self-refine | Model's own critique of its output | None across attempts | Generate → critique → revise, in one session |
| Reflexion | External signal (test failure, reward) plus self-reflection | Persists lessons across attempts | Attempt → reflect → retry with lessons in context |
| ReAct | Observations from tool calls | Trace of prior steps in context | Thought → action → observation, repeat |
Self-refine is the leanest. It needs no external signal and no memory store — just a well-formed critique prompt. When an external signal is available (tests, ground truth, reward), reflexion prompting is stronger because the signal grounds the critique. When external information is needed to answer at all, ReAct prompting is the right structure. Self-refine is what you reach for when the task is self-contained and the model plausibly has enough knowledge to grade its own work.
When Self-Refine Works
Self-refine helps when the model can recognize mistakes better than it can avoid making them. Generation and evaluation are asymmetric — spotting a bug in visible code is easier than writing bug-free code in one pass. Tasks where it tends to help:
- Code correctness on checkable criteria. The model writes a function, then reads it for off-by-one errors, null cases, wrong return types. Visible on the page; the second pass catches them.
- Format fidelity. JSON that almost matches a schema — the critique catches missing fields, wrong types, extra commentary outside the object.
- Instruction adherence. "Rewrite in under 100 words, no jargon, no bullets." The first pass often misses one constraint; a critique pass that checks each constraint catches the miss.
- Fact-matching-source. Given a source document, the model drafts a summary, then critiques it against the source for unsupported claims. The grounding is the source, not the model's knowledge.
- Style consistency. Drafting in a voice, then critiquing for tone slips and formatting inconsistencies.
- Math with shown work. Errors in a visible calculation are easier to spot than to avoid.
The common thread: an objectively checkable criterion, either in the prompt or in the output itself, that the critique step can apply. The model isn't relying on hidden knowledge to grade the output; it's reading the output and applying a rule.
When Self-Refine Fails
Self-refine fails — often silently — when the model can't evaluate its own output any better than it generated it:
- Novel factual knowledge. If the model doesn't know a fact, it can't recognize a wrong answer as wrong. It produces a confident critique of a confident wrong answer and revises into a polished wrong answer.
- Subjective writing quality. "Make this more engaging" leans on aesthetic judgment the model has no stable anchor for. Revisions drift toward a generic "polished" style rather than converging on something better, and are sometimes worse than the original.
- Creative divergence tasks. Brainstorming and alternate framings don't converge under critique — critique collapses divergent output toward a single "best" version, defeating the purpose.
- Self-referential creative work. Jokes, narrative surprise, and rhetorical risk all suffer under critique; the critique step over-explains and de-fangs.
- When the rubric is in the model's blind spot. If the critique criterion requires knowledge the model lacks, the loop does nothing. "Check against our internal style guide" fails unless the guide is in the prompt.
The tell for a bad self-refine target: the critique sounds vaguely positive, and each revision is structurally similar to the previous with minor wording changes. That's the model going through motions without new signal.
Designing the Critique Prompt
The critique prompt is where self-refine lives or dies. "Review this and suggest improvements" yields vague critique. A specific rubric with named criteria yields actionable critique. Three patterns that work:
- Enumerate criteria explicitly. List 3–7 checks. Each should be binary or near-binary: "Does the output match the schema? List fields that are missing, extra, or wrong-typed." Avoid "rate clarity 1–10" — scores aren't actionable.
- Demand specific citations. "For each issue, quote the exact line from the output and explain what's wrong." This stops hand-waving and forces the model to point at real problems.
- Require a verdict. "State either PASS (no significant issues) or REVISE (with a summary of what to fix)." Lets the loop terminate cleanly.
A code-task rubric might list: correctness against spec, edge cases, error handling, type safety, naming clarity. A JSON-task rubric: schema conformance, field completeness, type correctness, no extraneous text. The template changes by task; the pattern — named criteria, quoted evidence, explicit verdict — does not.
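Because only the criteria change between tasks, the critique prompt is worth templating. A sketch of a builder that enforces the three patterns above (named criteria, quoted evidence, explicit verdict); the criteria passed in are illustrative:

```python
def build_critique_prompt(output: str, criteria: list[str]) -> str:
    # Number the criteria so the critique can reference them by index.
    checks = "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, 1))
    return (
        "Below is an output to review. Critique it against these criteria.\n"
        "For each issue, quote the exact line from the output and explain "
        "what's wrong.\n\n"
        f"{checks}\n\n"
        "End with either:\n"
        "- PASS: no significant issues, or\n"
        "- REVISE: <one-sentence summary of what to fix>\n\n"
        f"Output to review:\n{output}"
    )

prompt = build_critique_prompt(
    '{"name": "Ada"}',
    [
        "Schema conformance: every required field present, types correct?",
        "Cleanliness: valid JSON, no extra commentary?",
    ],
)
print(prompt)
```

Swapping the criteria list is the only change needed to move from a JSON task to a code task.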
Termination Conditions
A self-refine loop needs a stop rule. Four options:
- Fixed iterations (k=1 or k=2). Simplest. Run one critique-revise cycle and stop. Past k=3, revisions usually drift without improving.
- PASS verdict. If the critique outputs PASS, stop. Requires a well-calibrated verdict — a lenient model passes too early, a harsh one never does. Combine with a max-iteration cap.
- No-improvement detector. Diff the revision against the previous output. If the changes are cosmetic, stop. Harder to implement but avoids wasted loops.
- Specific-criterion pass. Run the actual validator (schema, tests) after each revision. No longer pure self-refine — you're using external signal — but the most reliable stop when available.
Default to k=1 and only add a second iteration when you can show the second revision routinely improves things.
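The first three stop rules compose naturally in one loop. A sketch with stubbed `critique` and `revise` functions standing in for model calls: a fixed-iteration cap, a PASS-verdict short-circuit, and a crude no-improvement detector using `difflib` similarity (the 0.98 threshold is an illustrative choice, not a recommendation):

```python
import difflib

def critique(output: str) -> str:
    # Stub critic: a real loop would call the model with a rubric here.
    return "PASS" if "edge case" in output else "REVISE: missing edge-case handling."

def revise(output: str, feedback: str) -> str:
    # Stub reviser standing in for a model call.
    return output + "  # Handles the empty-input edge case."

def refine_loop(draft: str, max_iters: int = 2) -> str:
    current = draft
    for _ in range(max_iters):                      # fixed-iteration cap
        verdict = critique(current)
        if verdict.startswith("PASS"):              # PASS-verdict stop
            break
        revised = revise(current, verdict)
        # No-improvement detector: stop if the revision is near-identical.
        if difflib.SequenceMatcher(None, current, revised).ratio() > 0.98:
            break
        current = revised
    return current

print(refine_loop("def head(xs): return xs[0]"))
```

With the stubs above, the loop revises once, then the critic passes on the second check and the loop exits.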
Self-Refine Inside Agents
Modern agent loops often use self-refine internally without labeling it. An agent that writes code, reads its own output, and edits before committing is doing self-refine on the code. An agent that drafts a tool call, inspects it for missing fields, and revises before emitting is doing self-refine on arguments.
Making it explicit helps. Rather than hoping the agent self-checks, require it: "Before submitting your answer, review it against these criteria: [list]. If any criterion fails, revise and re-review before submitting." That turns emergent behavior into a guaranteed step. In multi-agent architectures, a "critic" role implementing self-refine as a dedicated turn is a common pattern — see ai code review: agents vs prompts for how this plays out specifically for code review.
A Self-Refine Loop — Hypothetical Example
An illustrative three-prompt sequence for a JSON-output task. Hypothetical, meant to show the shape of the loop.
```
# Step 1: Generate
Task: Extract structured contact info from the email below.
Schema:
{
  "name": string, "email": string,
  "company": string | null, "phone": string | null,
  "request": string
}
Email: <email content here>
Output the JSON only. No commentary.
```

```
# Step 2: Critique
Below is the JSON output for the task above. Critique it against these
criteria. Quote the exact part of the output for each issue found.
1. Schema conformance: every required field present, types correct?
2. Extraction accuracy: does each value match what's in the email?
3. Null handling: optional fields null only when absent?
4. Cleanliness: valid JSON, no extra commentary?
End with either:
- PASS: no significant issues, or
- REVISE: <one-sentence summary of what to fix>
```

```
# Step 3: Revise (only if REVISE)
Given the original task, your previous output, and the critique above,
produce a revised JSON output addressing every issue named. JSON only.
```
The structure is repeatable. The task and schema change; the criteria-with-quoted-evidence-and-verdict shape does not. A production pipeline wraps this in a harness that runs the critique once, checks for PASS, and either returns the original or runs the revise step.
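Such a harness is small. A sketch for this JSON task, with a stubbed `call_model` (the first call deliberately returns malformed output so the revise path runs) and an actual validator as the check, which makes this the "specific-criterion pass" variant rather than pure self-refine:

```python
import json

REQUIRED_FIELDS = {"name", "email", "request"}  # required fields from the schema

def conforms(text: str) -> bool:
    # External validator: parse as JSON and check required fields.
    try:
        data = json.loads(text)
    except ValueError:
        return False
    return isinstance(data, dict) and REQUIRED_FIELDS <= data.keys()

def call_model(prompt: str) -> str:
    # Stub LLM: the initial call returns a malformed answer, the revise call fixes it.
    if "Revise" in prompt:
        return '{"name": "Ada", "email": "ada@example.com", "request": "demo"}'
    return 'Here is the JSON: {"name": "Ada"}'

def harness(task: str) -> str:
    draft = call_model(task)
    if conforms(draft):          # validator short-circuits: return the original
        return draft
    return call_model(f"Revise this output into valid JSON matching the schema:\n{draft}")

result = harness("Extract contact info as JSON.")
print(result)
```

A real harness would run the critique prompt instead of (or alongside) the validator; the control flow is the same.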
Common Anti-Patterns
- Vague critique prompts. "Review this and suggest improvements" produces vague output. Fix: list named criteria with quoted-evidence requirements.
- Self-refine on subjective quality. Creative prose, jokes, brainstorming — critique collapses these toward generic output. Fix: skip it; use other techniques like prompting reasoning models for divergent thinking.
- No termination condition. Loops run to a turn limit, producing three near-identical revisions. Fix: k=1 by default; add a PASS verdict to short-circuit.
- Critiquing knowledge-gap output. Factual questions the model doesn't know produce polished wrong answers. Fix: use retrieval or tools, not self-refine.
- Hiding the original during revision. Without access to the original and the critique, the model can't target fixes. Fix: include both in the revise prompt.
- Too many iterations. k=5 rarely beats k=1 or k=2 and often drifts. Fix: cap at 2; verify the second pass actually helps.
FAQ
Does self-refine always improve output?
No. On tasks with objectively checkable criteria — code, schema conformance, instruction adherence — it usually does. On tasks where evaluation is as hard as generation — novel knowledge, subjective quality, creative work — it often doesn't, and can make things worse. Test on your specific task.
How is self-refine different from just writing a better prompt?
A better prompt pushes the first-pass output closer to the target. Self-refine gives the model a second pass to catch what the first missed. They compose — a clearer spec is almost always worth doing first, and self-refine picks up the residual errors a good prompt still leaves behind.
Does the critique have to come from the same model instance?
Not strictly. The "self" usually means same model in the same session. Using a different model for critique — sometimes called a "critic" or "judge" model — avoids the blind-spot problem where a model misses issues its own training would make it prone to. For high-stakes outputs, consider a different model for the critique stage.
When should I prefer reflexion over self-refine?
When you have an external signal that actually tells the model whether it succeeded — a test suite, a ground-truth check, a tool error — reflexion is stronger because the signal anchors the critique in reality rather than the model's judgment. Self-refine is the right choice when the task is self-contained.
Wrap-Up
Self-refine is cheap and general, worth trying on any task with checkable output. Cost: one or two extra model calls. Win: measurable on code correctness, schema conformance, instruction adherence. Risk: on subjective or knowledge-bound tasks, it polishes errors into confident-sounding errors. The critique prompt carries the weight — name criteria, require quoted evidence, demand an explicit verdict, cap iterations at one or two. For the bigger picture see the complete guide to prompting AI coding agents; for adjacent patterns see reflexion prompting when external signals exist and ReAct prompting when you need tool calls.