
Parent-Document Retrieval

Parent-document retrieval is a chunking-and-retrieval pattern that separates the unit used for matching from the unit used for generation. Documents are split into small child chunks (a sentence or short paragraph) and indexed for high-precision vector matching, while the full parent document or parent section is kept accessible by id. At query time, retrieval matches against child chunks but returns the corresponding parents to the generator. The pattern addresses a core tension in RAG: small chunks give sharper retrieval scores but strip out the surrounding context the generator needs, while large chunks carry context but dilute the relevance signal. Parent-document retrieval gets precision from the small chunks and context from the parents. The cost is a slightly more complex index layout and duplicated storage, since each passage is stored once as a child and again inside its parent. In practice that trade is usually worth making.
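The match-on-children, return-parents flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the `parents` dictionary and its contents are made up for the example, and names such as `child_index` and `retrieve_parents` are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Parent store: full sections, kept accessible by id (illustrative content).
parents = {
    "install": "Install the CLI with pip. Then run the init command to create a config file. "
               "The config file stores your API key.",
    "deploy": "Deploy with the push command. Rollbacks use the revert command. "
              "Logs are streamed to the console during deploys.",
}

# Child index: one entry per sentence, each carrying the id of its parent.
child_index = []
for parent_id, text in parents.items():
    for sentence in re.split(r"(?<=\.)\s+", text):
        if sentence:
            child_index.append((embed(sentence), parent_id, sentence))

def retrieve_parents(query, k=3):
    """Score the query against children, then return the de-duplicated
    parents of the top-k child hits, in rank order."""
    q = embed(query)
    ranked = sorted(child_index, key=lambda c: cosine(q, c[0]), reverse=True)[:k]
    seen, results = set(), []
    for _, parent_id, _ in ranked:
        if parent_id not in seen:
            seen.add(parent_id)
            results.append(parents[parent_id])
    return results

# The sentence about rollbacks matches best, but the whole "deploy" section is returned.
context = retrieve_parents("revert command", k=1)
```

De-duplicating by parent id matters because several top-ranked children often share a parent; without it the generator would receive the same section more than once.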

Example

A technical-docs assistant initially chunks at 1,200 tokens. Retrieval is noisy: a chunk that covers the right topic also covers three unrelated ones. Switching to 200-token child chunks pushes top-5 recall up sharply, but the generator now sees tiny snippets and misses the surrounding setup. Adding parent-document retrieval, which matches on the 200-token children but returns their 1,200-token parent sections, yields both the sharper match and the full context. In this illustrative scenario, end-to-end answer accuracy rises from 0.73 to 0.86 without retraining the embedding model.
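The two-level layout in this example could be built roughly as follows. The sketch assumes whitespace-separated words as a stand-in for real tokenizer output, and the function names (`split_tokens`, `build_index`) and the `parent-N` id scheme are hypothetical. Only the 1,200 and 200 sizes come from the example.

```python
def split_tokens(tokens, size):
    """Split a token list into consecutive windows of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

def build_index(text, parent_size=1200, child_size=200):
    """Return (parent_store, child_records) for one document.

    parent_store maps parent id -> parent text (what the generator sees).
    child_records hold the small chunks to embed, each tagged with its parent id.
    """
    tokens = text.split()  # assumption: whitespace tokens approximate model tokens
    parent_store = {}
    child_records = []
    for p_idx, parent_tokens in enumerate(split_tokens(tokens, parent_size)):
        parent_id = f"parent-{p_idx}"
        parent_store[parent_id] = " ".join(parent_tokens)
        for child_tokens in split_tokens(parent_tokens, child_size):
            child_records.append({"parent_id": parent_id, "text": " ".join(child_tokens)})
    return parent_store, child_records

# A synthetic 3,000-token document: two full parents plus one 600-token remainder.
doc = " ".join(f"tok{i}" for i in range(3000))
parent_store, child_records = build_index(doc)
```

Children are cut from within each parent rather than from the raw document, so no child straddles a parent boundary and every child maps to exactly one parent id.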
