Self-RAG

Self-RAG is a pattern in which the language model emits special reflection tokens that control its own retrieval and generation decisions. At inference time, the model decides whether a retrieval call is needed for the current step, whether each retrieved passage is relevant and supportive, and whether the draft generation is faithful to the retrieved evidence. These reflection tokens are introduced during fine-tuning on a dataset labeled with retrieval and critique signals, so the behavior is learned rather than prompted. The result is a model that can skip retrieval on easy prompts, pull in multiple passages only when needed, and self-flag when its own output is under-supported. It differs from Corrective RAG in that the corrective logic is inside the model weights rather than implemented as an external control loop over a generic LLM.

Example

A fine-tuned Self-RAG model is given the prompt "write a haiku about autumn". It emits a reflection token indicating no retrieval is needed and produces the poem directly, saving the retrieval round-trip. Given "summarize the findings of the latest CDC report on flu hospitalizations", it emits a retrieve token, pulls three CDC passages, marks two as relevant and one as off-topic, drafts a summary, and emits support tokens marking each claim as supported or unsupported. The unsupported claim is flagged for a human reviewer instead of being silently shipped.
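The workflow in this example can be sketched as a small control loop. Everything below is an illustrative assumption: the token names, the `stub_model` that stands in for a fine-tuned Self-RAG model, and `fake_retriever` are hypothetical, not a real model's API. The point is the shape of the loop, with the retrieval, relevance, and support decisions coming from the model itself rather than external heuristics.

```python
# Minimal sketch of a Self-RAG inference loop. The reflection signals
# ("retrieve", "relevant", "supported") and the stubs are hypothetical
# stand-ins for a fine-tuned model's reflection tokens.

def stub_model(prompt, passages=None):
    """Stand-in for a fine-tuned Self-RAG model: returns a generation
    plus the reflection signals the model would emit."""
    if passages is None:
        # First pass: the model decides whether retrieval is needed.
        needs_facts = "report" in prompt or "latest" in prompt
        return {"retrieve": needs_facts,
                "text": None if needs_facts else "draft answer"}
    # Second pass: critique each passage, then judge each drafted claim.
    return {
        "relevant": [p["topic"] == "flu" for p in passages],
        "text": "summary",
        "supported": [True, False],  # per-claim support judgments
    }

def self_rag(prompt, retriever):
    first = stub_model(prompt)
    if not first["retrieve"]:
        # Easy prompt: skip retrieval entirely, answer directly.
        return {"answer": first["text"], "used": 0, "flags": []}
    passages = retriever(prompt)
    second = stub_model(prompt, passages=passages)
    # Keep only passages the model marked relevant.
    kept = [p for p, ok in zip(passages, second["relevant"]) if ok]
    # Flag under-supported claims for review instead of shipping them.
    flags = [i for i, ok in enumerate(second["supported"]) if not ok]
    return {"answer": second["text"], "used": len(kept), "flags": flags}

def fake_retriever(prompt):
    return [{"topic": "flu"}, {"topic": "flu"}, {"topic": "travel"}]

print(self_rag("write a haiku about autumn", fake_retriever))
print(self_rag("summarize the latest CDC report", fake_retriever))
```

Note that the corrective logic here (when to retrieve, what to keep, what to flag) lives in `stub_model`'s outputs, mirroring how Self-RAG bakes these decisions into the model weights; an external Corrective RAG loop would instead compute them with separate grader calls around a generic LLM.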
