Chain of Code

Chain of Code is a hybrid reasoning pattern in which the model produces a trace that interleaves executable code with natural-language "pseudocode" comments. The code sections are run by an interpreter; the natural-language sections are "executed" by the model itself acting as a simulator, substituting plausible outputs for steps that cannot be expressed as code. Introduced by Li et al. (2023), it extends Program-of-Thoughts to tasks that mix precise computation (arithmetic, lookups, string manipulation) with qualitative reasoning (classification, commonsense judgment). The combination tends to outperform either pure code generation or pure natural-language chain-of-thought on mixed-mode tasks.
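The execute-or-simulate loop described above can be sketched roughly as follows. This is a minimal illustration, not the implementation from Li et al.: `run_chain_of_code`, `stub_simulator`, and the pseudo-function `is_sour` are hypothetical names, and the stub stands in for what would be a language-model call in a real system.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def run_chain_of_code(lines: List[str], simulate: Callable[[str, State], State]) -> State:
    """Try each line in the Python interpreter; if it raises, let the simulator supply the result."""
    state: State = {}
    for line in lines:
        try:
            # Real code: executed by the interpreter, which updates `state` in place.
            exec(line, state)
        except Exception:
            # Not executable (e.g. an undefined pseudo-function): the simulator
            # returns the variable updates it believes the line would produce.
            state.update(simulate(line, state))
    state.pop("__builtins__", None)  # exec() adds this entry; it is not program state
    return state


def stub_simulator(line: str, state: State) -> State:
    """Hypothetical stand-in for a language-model call that predicts a line's effect."""
    if line.startswith("sour ="):
        return {"sour": True}  # assumed judgment: a lemon is sour
    return {}


program = [
    "fruits = ['apple', 'lemon', 'mango']",
    "n = len(fruits)",
    "sour = is_sour(fruits[1])",  # is_sour is undefined, so this line is simulated
    "answer = n if sour else 0",
]

result = run_chain_of_code(program, stub_simulator)
print(result["answer"])
```

Running one line at a time keeps the interpreter and the simulator working on the same shared state, so a simulated value such as `sour` can feed later lines that execute as ordinary code.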

Example

Asked to count "how many sentences in this paragraph express a positive sentiment", the model writes code that splits the paragraph into sentences and iterates over them. Inside the loop, the code calls a pseudo-function `is_positive(sentence)` that the interpreter cannot run; the model simulates its return value for each sentence using its own sentiment judgment. The interpreter then tallies the `True` values. Pure code would fail at the sentiment-judgment step, while pure natural-language reasoning tends to lose track of the count on long paragraphs.
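A rough sketch of the program the model might write for this task is shown below. The names `split_sentences` and `count_positive` and the sample paragraph are illustrative assumptions. In Chain of Code, `is_positive` would have no runnable body and the model would simulate its result; here a crude keyword check stands in for that judgment so the example runs on its own.

```python
import re
from typing import Callable, List


def split_sentences(paragraph: str) -> List[str]:
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def is_positive(sentence: str) -> bool:
    """Placeholder for the model's simulated sentiment judgment (assumption:
    a keyword match is only a stand-in, not how the model would decide)."""
    positive_words = {"love", "great", "delicious"}
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return bool(words & positive_words)


def count_positive(paragraph: str, judge: Callable[[str], bool] = is_positive) -> int:
    """Interpreter-side work: iterate over sentences and tally the True judgments."""
    return sum(judge(sentence) for sentence in split_sentences(paragraph))


paragraph = (
    "I love this cafe. The wifi is slow. "
    "The pastries are delicious! It closes at five."
)
positive_count = count_positive(paragraph)
print(positive_count)  # 2 with the keyword stand-in above
```

Passing the judge in as a parameter keeps the exact part (splitting and counting) separate from the qualitative part, which is the split of responsibilities the example describes.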
