Tip
TL;DR: Chain-of-Density (CoD) drafts a sparse summary, then rewrites it four times. Each rewrite adds one or two entities from the source while keeping the summary's length constant. Iteration 3 usually hits the best density-to-readability ratio. Use it when you have a long source, a hard length budget, and readers who need information per word, not narrative.
Key takeaways:
- CoD is an iterative pattern, not a single prompt. Five passes: a sparse baseline, then four rewrites, each constrained to add entities without growing length. Skipping iterations defeats the point.
- The core constraint is length held constant. Without it, the model simply appends facts, which is just a longer summary. The fixed budget forces it to replace filler with entities.
- Iteration 3 is usually the sweet spot. Iterations 4 and 5 are denser but read as compressed lists. Score every pass; do not default to the last one.
- Pair CoD with an evaluation loop. An LLM-as-Judge pass or a SurePrompts Quality Rubric score per iteration lets you auto-select the best version.
- CoD fails on short sources and subjective content. There has to be enough entity density in the source to keep adding through five passes, and the task has to be "convey facts" not "convey feeling."
Why Chain-of-Density exists
Default summarization prompts have a specific failure mode. Ask any frontier model to "summarize this in 100 words" and you usually get a summary that leads with the top one or two facts, then dilutes into generic connective tissue — "the article discusses," "several experts noted," "overall the trend suggests." The word count is honored. The information density is not.
This matters when the summary is a deliverable, not a preview. Executive briefs, PR clip digests, research abstracts, and meeting recaps are read instead of the source. A thin summary wastes its budget on filler. What you want is the maximum number of source entities — people, companies, numbers, events, decisions — packed into the same word count.
Chain-of-Density, introduced by Adams et al. in the 2023 paper From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting, is a pattern designed for exactly this. The authors framed summary quality as a trade-off between density (entities per word) and readability, and found that iterative rewriting under a fixed length constraint can push density substantially higher than a single-pass summary. Human raters in their study tended to prefer denser versions up to a point — after which compression made the text feel list-like rather than narrative.
You use CoD when density matters more than flow. You skip it when the reader wants a gentle overview.
The pattern
CoD is a five-step prompt chain. The skeleton:
Step 1 — Initial summary
You will generate increasingly entity-dense summaries of the source below.
Source: <long source document>
Write a 60-word summary. Use generic filler where needed — the first
pass is intentionally sparse. Identify 1-3 informative entities
(named people, companies, numbers, events) from the source that
are NOT yet included in the summary. List them.
Output:
SUMMARY: <60 words>
MISSING_ENTITIES: <1-3 entities>
Steps 2-5 — Rewrite (run once per pass)
Rewrite the previous summary at exactly the same length (60 words).
The new summary MUST include every entity from the previous summary
AND add 1-2 entities from the MISSING_ENTITIES list (or newly identified
from the source). Make space for new entities by compressing generic
language, not by removing prior entities.
Then list 1-3 new missing entities for the next pass.
Previous summary: <iteration N-1 output>
Previous missing entities: <from iteration N-1>
Output:
SUMMARY: <60 words, strictly>
MISSING_ENTITIES: <1-3 entities>
Three properties of the skeleton are load-bearing:
- Length is held constant. Not "about 60 words." Exactly 60. Without this, the model grows the summary rather than compressing filler.
- Prior entities must be preserved. Each pass is additive, not substitutive. The model cannot drop an entity just because a new one is more interesting — that turns CoD into a random-walk rewrite.
- The missing-entity list is mechanical. You ask the model to name specific entities it is about to add. This stops the critique from collapsing into "make it better" vibes.
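The skeleton above can be driven by a short loop. This is a minimal sketch, not a definitive implementation: `call_model` stands in for whatever LLM client you use, and the prompt text is abbreviated relative to the full templates shown earlier.

```python
import re

def parse_output(text: str) -> tuple[str, list[str]]:
    """Split a pass's response into (summary, missing-entity list)."""
    summary = re.search(r"SUMMARY:\s*(.*?)\s*MISSING_ENTITIES:", text, re.S).group(1)
    missing = re.search(r"MISSING_ENTITIES:\s*(.*)", text, re.S).group(1)
    return summary, [e.strip() for e in missing.split(",") if e.strip()]

def chain_of_density(source: str, call_model, length: int = 60, passes: int = 5) -> list[str]:
    """Run the CoD chain. Returns EVERY iteration's summary so all of them
    can be scored later; never return only the final pass."""
    prompt = (
        f"Write a {length}-word summary of the source. Use generic filler where "
        "needed; the first pass is intentionally sparse. Then list 1-3 informative "
        "entities from the source NOT yet in the summary.\n"
        "Output:\nSUMMARY: <summary>\nMISSING_ENTITIES: <entities>\n"
        f"Source: {source}"
    )
    summary, missing = parse_output(call_model(prompt))
    iterations = [summary]
    for _ in range(passes - 1):
        rewrite = (
            f"Rewrite the previous summary at exactly {length} words. Keep every "
            "prior entity AND add 1-2 entities from the missing list. Make space "
            "by compressing generic language, not by removing prior entities. "
            "Then list 1-3 new missing entities.\n"
            f"Previous summary: {summary}\n"
            f"Previous missing entities: {', '.join(missing)}\n"
            "Output:\nSUMMARY: <summary>\nMISSING_ENTITIES: <entities>"
        )
        summary, missing = parse_output(call_model(rewrite))
        iterations.append(summary)
    return iterations
```

Passing `call_model` in as a parameter keeps the chain logic provider-agnostic and makes it trivial to test with a stub.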
The worked example
We will run CoD on a hypothetical source: a 3-page product-launch blog post by a fictional SaaS company we will call Meridian Labs. Treat this as a labeled hypothetical — a real source would be longer and more cluttered, but the pattern is identical.
Source excerpt (hypothetical):
Meridian Labs today announced Atlas 2.0, a cloud-native observability platform that consolidates logs, metrics, and traces into a unified query layer. The launch, made at the company's MeridianCon 2026 conference in Austin on April 15, 2026, included a keynote by CEO Priya Raman and CTO Marcus Hollis. Atlas 2.0 is priced at $18 per host per month, a 28% reduction from the prior version's effective rate. Early adopters include Finlark, a Series B fintech, and Kestrel Aviation, which reduced mean time to detection from 47 minutes to under 6 minutes during beta. The platform runs on AWS, GCP, and Azure and ships with an SDK for Python, Go, TypeScript, and Rust. Meridian raised a $90M Series C led by Accel Partners in March 2026, bringing total funding to $142M. Former Datadog VP of Engineering Lena Ortiz joined the board as part of the round. The company plans a free tier limited to 50 hosts, available May 1, 2026. Competitors named in Raman's keynote included Datadog, New Relic, and Grafana.
Now the CoD passes, each held to the 60-word budget.
Iteration 1 (sparse — the baseline):
Meridian Labs announced Atlas 2.0, a cloud-native observability platform that unifies logs, metrics, and traces. The launch happened this month at the company's annual conference. Pricing is lower than the previous version. Several early customers reported improved incident response. The product works across major cloud providers and supports popular programming languages. The company also discussed recent funding and competitive positioning.
Entity count: roughly 3 (Meridian Labs, Atlas 2.0, observability). The rest is filler.
MISSING_ENTITIES: MeridianCon 2026, $18/host/month, 28% price reduction.
Iteration 3 (the practical stopping point):
Meridian Labs launched Atlas 2.0 at MeridianCon 2026 in Austin on April 15, unifying logs, metrics, and traces under one query layer. Priced at $18 per host — a 28% cut — with a free tier of 50 hosts starting May 1. Finlark and Kestrel Aviation reported faster incident detection in beta. The platform runs on AWS, GCP, and Azure.
Entity count: roughly 13. Reads naturally; every sentence carries multiple facts.
MISSING_ENTITIES: $90M Series C, Accel Partners, Lena Ortiz.
Iteration 5 (maximum density):
Meridian Labs launched Atlas 2.0 at MeridianCon 2026 Austin April 15, unifying logs, metrics, traces. CEO Priya Raman and CTO Marcus Hollis keynoted. Priced $18/host (−28%); free tier 50 hosts May 1. Finlark, Kestrel Aviation cut MTTD 47 to 6 minutes. Runs AWS, GCP, Azure; SDKs Python, Go, TypeScript, Rust. $90M Series C led by Accel; ex-Datadog Lena Ortiz joins board.
Entity count: roughly 25. Extremely dense. Reads like a press-ticker crawl, not prose.
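The "entity count" figures above can be approximated mechanically. The heuristic below is a crude proxy, not real NER: it counts capitalized and numeric tokens per word, which inflates slightly on sentence-initial capitals but still tracks the density trend across iterations.

```python
import re

def rough_entity_density(summary: str) -> float:
    """Crude density proxy: fraction of tokens that look entity-like
    (capitalized words, or tokens containing a digit, e.g. $18, 28%, 2026).
    A production pipeline would use a proper NER model instead."""
    words = summary.split()
    entity_like = [
        w for w in words
        if re.match(r"^[A-Z][a-zA-Z]", w) or re.search(r"\d", w)
    ]
    return len(entity_like) / max(len(words), 1)
```

Comparing passes with this gives a cheap sanity check that each rewrite actually increased density rather than just reshuffling words.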
Scoring the iterations
Run each pass through the SurePrompts Quality Rubric. For CoD outputs, role clarity stays constant across passes; the dimensions that move are format structure (length compliance) and constraint tightness (entities per word). Typical scores on a well-run CoD chain sit in the 28-31 range at iterations 3 and 4, with iteration 5 sometimes dropping a point on readability.
In this example:
- Iteration 1 scores around 22/35 — the length is right but the density is low and no entity-specific structure exists yet.
- Iteration 3 scores around 30/35 — dense, still readable, most constraints met.
- Iteration 5 scores around 28/35 — densest, but readability slips; a human reader needs to work harder to parse it.
The rubric lets you pick between iterations 3 and 5 on purpose rather than by gut.
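Selecting by score rather than by gut is a one-function affair. A sketch, where `score_fn` is whatever scorer you have wired up, such as a rubric calculation or a judge-model call:

```python
def pick_best(iterations: list[str], score_fn) -> tuple[int, str]:
    """Score every CoD pass and return (best_index, best_summary).
    score_fn maps a summary string to a numeric score; ties go to the
    earlier (more readable) iteration because max() keeps the first maximum."""
    scores = [score_fn(s) for s in iterations]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, iterations[best]
```

The tie-breaking behavior is deliberate: when two passes score equally, the earlier one is less compressed and easier to read.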
When to stop iterating
The density-readability trade-off is the whole game. Each extra pass trades prose quality for entity count. Where to stop depends on the reader:
- A human skimming a brief. Iteration 3. Sentences still flow; entities are unmistakable.
- A pipeline feeding downstream extraction (e.g., an LLM-as-judge extracting structured fields). Iteration 4 or 5. Readability matters less; density matters more.
- Copy for a newsletter or digest. Iteration 2 or 3. Slight narrative flow is worth the entity loss.
In practice: always score every iteration (manually or via judge model). Never ship "whatever iteration 5 produced" by default. Related patterns like self-refine also rely on picking a stopping point rather than trusting the final pass — CoD is not unique here.
Common failure modes
Four patterns break CoD quietly. Watch for them when a chain feels off.
- Adding the same entity twice. The model re-introduces "Atlas 2.0" at iteration 4 under slightly different phrasing. Fix: require the missing-entity list to exclude any entity already in the prior summary, and dedupe case-insensitively before prompting.
- Running out of entities. On a short source, the missing-entity list dries up around iteration 3. The model starts listing low-signal entities ("April" becomes a named entity) or re-lists old ones. Fix: cap the chain at whatever iteration still produces informative missing entities.
- Length drift. Each pass creeps from 60 to 63 to 68 words. Happens most on smaller models. Fix: put a literal word count check in the prompt and reject passes that miss by more than two words.
- Incoherent structure from over-packing. At iteration 5 the model starts concatenating fragments without verbs to meet the length. Fix: treat iteration 3 as the usable output, not a stepping stone to iteration 5.
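Two of these failure modes, length drift and duplicate entities, are catchable with mechanical checks between passes. A minimal sketch of both; the tolerance and the substring-based dedupe are assumptions to tune, not fixed rules:

```python
def length_ok(summary: str, target: int = 60, tolerance: int = 2) -> bool:
    """Reject passes that drift from the word budget by more than `tolerance`.
    Counts words, not characters, to match the prompt's constraint."""
    return abs(len(summary.split()) - target) <= tolerance

def dedupe_missing(missing: list[str], prior_summary: str) -> list[str]:
    """Drop 'missing' entities already present in the prior summary.
    Case-insensitive substring match: crude, but it catches the common
    case of the model re-listing an entity under different casing."""
    low = prior_summary.lower()
    return [e for e in missing if e.lower() not in low]
```

Run `length_ok` on each pass and re-prompt on failure; run `dedupe_missing` on the entity list before building the next rewrite prompt.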
Variants
Three common adaptations:
- News digests. Prioritize people, organizations, dates, and numbers as entities. Cap at 3 passes — news readers skim.
- Technical docs (release notes, RFCs). Entities are features, API surfaces, version numbers, breaking changes. Run five passes; technical readers tolerate higher density. Instruct the model to preserve the distinction between what changed and what stayed the same.
- Meeting notes. Entities are decisions, owners, deadlines, blockers. Two or three passes is usually enough; the goal is "did we capture every decision" not maximum density. Add a final pass that pulls out action items separately.
For all variants the skeleton stays the same: fixed length, additive entities, explicit missing-entity list. The variant is about what counts as an entity.
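One way to keep the skeleton fixed while swapping entity definitions is a small per-variant config. Everything here is illustrative, not a library API; the entity types and pass caps mirror the variants described above.

```python
# Hypothetical per-variant settings; names and values are illustrative.
VARIANTS = {
    "news_digest": {
        "entity_types": ["people", "organizations", "dates", "numbers"],
        "max_passes": 3,
    },
    "technical_docs": {
        "entity_types": ["features", "API surfaces", "version numbers", "breaking changes"],
        "max_passes": 5,
    },
    "meeting_notes": {
        "entity_types": ["decisions", "owners", "deadlines", "blockers"],
        "max_passes": 3,
    },
}

def entity_instruction(variant: str) -> str:
    """Render a variant's entity definition into a prompt fragment."""
    cfg = VARIANTS[variant]
    return (
        "Treat as entities only: " + ", ".join(cfg["entity_types"])
        + f". Run at most {cfg['max_passes']} passes."
    )
```

The rendered fragment slots into the skeleton wherever "informative entities" is defined, so the chain logic itself never changes per variant.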
Our position
- Iteration 3 is the default output, not iteration 5. Unless you have evidence your reader prefers maximum density, the third pass is the one you ship.
- Always score every iteration. Picking the final pass by default treats CoD as a ritual, not a tool. Score with a rubric or judge; pick by score.
- Length must be enforced mechanically. Natural-language instructions about length fail on smaller models. Post-process length; reject non-compliant passes.
- CoD is for "convey facts" tasks. Do not use it for narrative, persuasive, or creative summaries. Information density is the wrong objective there.
- CoD stacks with other patterns. Run it as the first stage in a chain, then feed the chosen iteration into an RCAF-structured formatting prompt or a judge pass. CoD produces the dense core; other stages package it.
Related reading
- The SurePrompts Quality Rubric — scoring framework for picking the best iteration.
- Self-Refine Prompting Guide — the sibling iterative pattern, critique instead of densify.
- LLM-as-Judge Prompting Guide — automate iteration selection with a judge model.
- Prompt Chaining Guide — CoD is a chain; this covers the general mechanics.
- RCAF Prompt Structure — structure the single-iteration wrappers around CoD.
- Advanced Prompt Engineering Techniques — CoD in the wider toolkit.
- Prompt Patterns for Content Strategy — where CoD fits in a content workflow.