Cross-Encoder
A cross-encoder is a transformer architecture that takes a query and a candidate document as a single joint input — typically concatenated with a separator token — and outputs one scalar relevance score. Because attention runs across both sides together, a cross-encoder can model fine-grained interactions between query terms and document terms — negation, exact matches inside a paraphrase, positional constraints — that a bi-encoder cannot. The cost is quadratic attention over the combined length and one full forward pass per (query, document) pair, which makes cross-encoders prohibitively slow to run over an entire corpus at query time. They are the standard reranker architecture: a fast first-stage retriever (BM25, bi-encoder, or both) narrows the corpus to the top 50-200 candidates, and a cross-encoder rescores only those to produce the final ordering.
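The interface described above can be sketched in a few lines. This is a toy stand-in, not a real model: `score_pair` substitutes simple term coverage for the transformer forward pass over the concatenated input, and all names here are hypothetical; a production system would load a fine-tuned cross-encoder instead.

```python
def score_pair(query: str, document: str) -> float:
    """Score one (query, document) pair jointly.

    Stand-in for a transformer forward pass over the concatenated
    input '[CLS] query [SEP] document [SEP]'. Here: the fraction of
    query terms that appear in the document (toy approximation).
    """
    q_terms = query.lower().split()
    d_terms = set(document.lower().split())
    if not q_terms:
        return 0.0
    return sum(t in d_terms for t in q_terms) / len(q_terms)

def rerank(query: str, candidates: list[str]) -> list[tuple[float, str]]:
    """One scoring pass per pair: this is the cross-encoder cost model."""
    scored = [(score_pair(query, doc), doc) for doc in candidates]
    return sorted(scored, reverse=True)

docs = [
    "reset your password from the login page",
    "billing questions and refunds",
    "password reset email not arriving",
]
print(rerank("password reset email", docs)[0][1])
# → password reset email not arriving
```

The point the sketch makes is structural: the scorer sees both texts at once, so it can react to term-level interactions, but it must be run once per candidate, which is why it is applied only to a shortlist.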
Example
A technical-support search system retrieves the top 100 candidates per query with a bi-encoder in around 30 ms, then reranks the top 50 with a small cross-encoder in around 120 ms, for roughly 180 ms end-to-end. A control experiment running the cross-encoder over the full corpus of 2M documents was projected at 40+ seconds per query, which is operationally unusable. On illustrative numbers, the two-stage setup ships nDCG@10 of 0.81 versus 0.74 for the bi-encoder alone, with per-query latency the user does not notice.
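The two-stage shape of this example can be sketched as: a cheap score over the whole corpus to build a shortlist, then the expensive scorer over the shortlist only. Both scoring functions below are toy stand-ins (a bi-encoder dot product and a cross-encoder forward pass in a real system); the corpus, query, and `k` are illustrative.

```python
def cheap_score(query: str, doc: str) -> int:
    # Stand-in for a bi-encoder similarity (e.g. a dot product of
    # precomputed embeddings): number of shared terms.
    return len(set(query.split()) & set(doc.split()))

def expensive_score(query: str, doc: str) -> float:
    # Stand-in for a cross-encoder forward pass: query-term coverage.
    q = query.split()
    return sum(t in doc.split() for t in q) / len(q)

def search(query: str, corpus: list[str], k: int = 50) -> list[str]:
    # Stage 1: score the full corpus cheaply, keep the top-k shortlist.
    shortlist = sorted(corpus, key=lambda d: cheap_score(query, d),
                       reverse=True)[:k]
    # Stage 2: rescore only the shortlist with the expensive model.
    return sorted(shortlist, key=lambda d: expensive_score(query, d),
                  reverse=True)

corpus = [
    "how to reset a password",
    "refund policy",
    "password reset link expired",
    "contact support",
]
print(search("password reset expired", corpus, k=2)[0])
# → password reset link expired
```

The design choice the sketch encodes is the one the latency numbers in the example reflect: the expensive scorer's cost scales with `k`, not with the corpus size, so the corpus can grow without the rerank stage getting slower.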