Question 1

What is Parent-Document Retrieval?

Accepted Answer

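Parent-document retrieval is a chunking-and-retrieval pattern that separates the unit used for matching from the unit used for generation: a small child chunk is matched against the query, and the larger parent passage it belongs to is what the generator receives.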

Question 2

How does Parent-Document Retrieval work?

Accepted Answer

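At index time, each document is split into larger parent sections, and each parent is split again into smaller child chunks. Only the children are embedded, and each one keeps a reference to its parent. At query time, retrieval matches against the child chunks but returns the corresponding parents to the generator. The pattern resolves a core tension in RAG: small chunks give sharper retrieval scores but strip away surrounding context the generator needs, while large chunks carry that context but dilute the relevance signal.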
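The flow above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the bag-of-words `embed` function stands in for a real embedding model, the 8-word child window is arbitrary, and the names `build_index` and `retrieve_parents` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    # Assumption: a bag-of-words count vector stands in for a real embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(parents, child_size=8):
    """Split each parent into fixed-size word windows (children) and
    store every child vector together with the id of its parent."""
    index = []
    for parent_id, text in parents.items():
        words = text.split()
        for start in range(0, len(words), child_size):
            child = " ".join(words[start:start + child_size])
            index.append((embed(child), parent_id))
    return index


def retrieve_parents(query, index, parents, top_k=2):
    """Rank children by similarity to the query, then return the
    distinct parents of the top_k children, best match first."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    seen, results = set(), []
    for _, parent_id in ranked[:top_k]:
        if parent_id not in seen:  # several children may share one parent
            seen.add(parent_id)
            results.append(parents[parent_id])
    return results


# Hypothetical two-section corpus.
parents = {
    "install": "Install the package with pip. Then set the API key in the "
               "environment before the first call. The client reads the key "
               "at import time.",
    "billing": "Invoices are issued monthly. Payment is due within thirty "
               "days. Contact support to change the billing address on file.",
}
index = build_index(parents)
hits = retrieve_parents("how do I set the API key", index, parents, top_k=1)
```

Here `hits` holds the full "install" section even though only one 8-word child matched the query. The deduplication step matters in practice: when several top-ranked children share a parent, the generator should receive that parent once rather than several times.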

Question 3

Can you give an example of Parent-Document Retrieval?

Accepted Answer

A technical-docs assistant initially chunks at 1,200 tokens. Retrieval is noisy: a chunk that covers the right topic also covers three unrelated ones. Switching to 200-token child chunks pushes top-5 recall up sharply, but the generator now sees only tiny snippets and misses the surrounding setup. Adding parent-document retrieval, which matches on the 200-token children but returns their 1,200-token parent sections, yields both the sharper match and the full context. End-to-end answer accuracy rises from 0.73 to 0.86 (illustrative figures) without retraining the embedding model.
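To make the sizing concrete, the sketch below lays out the child spans for the 1,200-token parents and 200-token children used in this example. It assumes the document is already tokenized and that children never cross a parent boundary. The function name `child_spans` and the 3,000-token document length are hypothetical.

```python
PARENT_TOKENS = 1200  # parent section size from the example
CHILD_TOKENS = 200    # child chunk size from the example


def child_spans(num_tokens, parent_tokens=PARENT_TOKENS, child_tokens=CHILD_TOKENS):
    """Return (parent_index, child_start, child_end) token spans for one
    document. Children are cut inside each parent, so none straddles two parents."""
    spans = []
    for parent_start in range(0, num_tokens, parent_tokens):
        parent_end = min(parent_start + parent_tokens, num_tokens)
        for child_start in range(parent_start, parent_end, child_tokens):
            child_end = min(child_start + child_tokens, parent_end)
            spans.append((parent_start // parent_tokens, child_start, child_end))
    return spans


# Hypothetical 3,000-token document: two full parents and one partial parent.
spans = child_spans(3000)
```

Each full parent yields six children, so the index holds six times as many vectors as it would with parent-sized chunks, while the generator still receives at most 1,200 tokens per retrieved parent.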
