Question 1

What is HyDE (Hypothetical Document Embeddings)?

Accepted Answer

HyDE is a retrieval technique in which the language model first generates a hypothetical answer to the user's query, and then that hypothetical answer — not the original query — is embedded and used to retrieve real documents by vector similarity.

Question 2

How does HyDE (Hypothetical Document Embeddings) work?

Accepted Answer

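HyDE works in three steps: the language model drafts a hypothetical answer to the query, that draft is embedded, and the resulting vector is used to retrieve real documents by similarity. The idea, introduced by Gao et al. in 2022, is that in embedding space a plausible-looking answer is often closer to the real supporting documents than a short, under-specified question. The draft does not have to be factually accurate, because it serves only as a search probe; the documents it retrieves are real.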
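The flow described here (draft a hypothetical answer, embed it, search with that vector) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `fake_generate` is a hypothetical placeholder for a language-model call, and `DOCS` is a made-up three-document corpus.

```python
import math
import re
from collections import Counter
from typing import Callable, List

# Made-up corpus for illustration only.
DOCS = [
    "Vitamin D supplementation and muscle protein synthesis after resistance exercise",
    "Quarterly earnings report for a regional airline",
    "Recovery timelines for muscle damage following eccentric exercise",
]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_generate(query: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a canned draft answer."""
    return ("Vitamin D may support muscle protein synthesis "
            "and shorten recovery after exercise.")


def hyde_retrieve(query: str, corpus: List[str],
                  generate: Callable[[str], str], k: int = 3) -> List[str]:
    """Rank corpus documents against the embedded hypothetical answer."""
    hypothetical = generate(query)   # step 1: draft a hypothetical answer
    probe = embed(hypothetical)      # step 2: embed the draft, not the query
    ranked = sorted(corpus, key=lambda doc: cosine(probe, embed(doc)), reverse=True)
    return ranked[:k]                # step 3: return the nearest real documents


if __name__ == "__main__":
    for doc in hyde_retrieve("effects of vitamin D on muscle recovery?", DOCS, fake_generate, k=2):
        print(doc)
```

In a real system the generator would be a language-model call and `embed` would be the same embedding model used to index the corpus, so that the hypothetical answer and the documents live in the same vector space.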

Question 3

Can you give an example of HyDE (Hypothetical Document Embeddings)?

Accepted Answer

A research assistant is asked "effects of vitamin D on muscle recovery?", a short noun phrase rather than a full question. Embedding the query directly returns mixed results. With HyDE, the model first drafts a paragraph-length hypothetical answer covering vitamin D, muscle-protein synthesis, and recovery timelines, and that paragraph is embedded in place of the query. Vector search with the hypothetical answer's embedding retrieves seven more on-topic studies in the top ten than the direct-query baseline.
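A comparison like this one comes down to counting on-topic hits in each top-ten list. The sketch below shows one way to compute that count and the difference against a baseline, assuming relevance judgments are available; the function names are hypothetical and no document IDs or results from the example are reproduced.

```python
from typing import Iterable, Sequence


def on_topic_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int = 10) -> int:
    """Count how many of the top-k ranked documents are judged on-topic."""
    relevant = set(relevant_ids)
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)


def gain_over_baseline(hyde_ranked: Sequence[str], baseline_ranked: Sequence[str],
                       relevant_ids: Iterable[str], k: int = 10) -> int:
    """Difference in on-topic hits at k between a HyDE ranking and a direct-query ranking."""
    relevant = set(relevant_ids)
    return on_topic_at_k(hyde_ranked, relevant, k) - on_topic_at_k(baseline_ranked, relevant, k)
```

A positive return value from `gain_over_baseline` means the HyDE ranking placed more on-topic documents in the top k than the direct-query ranking did.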
