Cost Per Task

Cost per task is the total cost of completing one unit of useful work with a language model, including input tokens, output tokens, tool-call overhead, and retries. It is the common yardstick that makes flagship and budget models comparable on real workloads.

Per-token pricing alone is misleading. A budget model that needs three retries to produce acceptable output can be more expensive per task than a premium model that succeeds on the first try. Cost per task, multiplied by task volume, is what actually shows up on a production invoice.

How it works

  1. Sum input and output token cost for the full task pipeline.

  2. Add tool-call and function-execution overhead.

  3. Multiply by the expected number of attempts implied by the failure rate. Assuming independent attempts retried until one succeeds, a failure rate p gives 1 / (1 - p) expected attempts.

  4. Compare the result across model candidates, not just per-token price.
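
The four steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the `ModelProfile` fields, the function names, and the retry model (independent attempts repeated until one succeeds) are all assumptions introduced here, and real pipelines may cap retries or bill tool calls differently.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    """Hypothetical per-model inputs; every field is an illustrative assumption."""
    name: str
    input_price_per_mtok: float   # USD per million input tokens
    output_price_per_mtok: float  # USD per million output tokens
    input_tokens: int             # input tokens per attempt, full pipeline
    output_tokens: int            # output tokens per attempt, full pipeline
    tool_calls: int               # tool calls per attempt
    cost_per_tool_call: float     # USD overhead per tool call
    failure_rate: float           # probability that one attempt fails, 0 <= p < 1


def cost_per_attempt(m: ModelProfile) -> float:
    """Steps 1 and 2: token cost plus tool-call overhead for a single attempt."""
    token_cost = (
        m.input_tokens * m.input_price_per_mtok
        + m.output_tokens * m.output_price_per_mtok
    ) / 1_000_000
    return token_cost + m.tool_calls * m.cost_per_tool_call


def expected_attempts(failure_rate: float) -> float:
    """Step 3: expected attempts until success, assuming independent attempts."""
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return 1.0 / (1.0 - failure_rate)


def cost_per_task(m: ModelProfile) -> float:
    """Expected cost to finish one task, including retries."""
    return cost_per_attempt(m) * expected_attempts(m.failure_rate)


def rank_by_cost_per_task(models: list[ModelProfile]) -> list[ModelProfile]:
    """Step 4: order candidates by cost per task, cheapest first."""
    return sorted(models, key=cost_per_task)
```

Calling `rank_by_cost_per_task` on a list of candidate profiles returns them cheapest first by expected cost per task, which can differ from the ordering by per-token price.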

Example

A budget model at one-tenth the per-token price can still cost 30% more per task than a premium model if it fails twice as often, requires twice as many tool calls, and produces outputs that the next stage of the pipeline has to re-validate.
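
To see how the per-token advantage can be erased, here is a small worked sketch. Every number in it is hypothetical and chosen only to mirror the shape of the scenario (one-tenth the token cost, twice the failure rate, twice the tool calls, plus a re-validation charge); it is not the calculation behind the 30% figure, and the size of the gap depends entirely on the inputs. The `expected_cost` helper and its parameters are likewise assumptions made for illustration.

```python
def expected_cost(token_cost, tool_calls, tool_call_cost,
                  failure_rate, revalidation_cost=0.0):
    """Expected USD cost per completed task.

    Assumes independent attempts retried until success, and that any
    re-validation cost is paid on every attempt (both are assumptions).
    """
    per_attempt = token_cost + tool_calls * tool_call_cost + revalidation_cost
    return per_attempt / (1.0 - failure_rate)


# Hypothetical premium model: higher token cost, fewer tool calls and failures.
premium = expected_cost(token_cost=0.010, tool_calls=2,
                        tool_call_cost=0.004, failure_rate=0.2)

# Hypothetical budget model: one-tenth the token cost, twice the tool calls,
# twice the failure rate, and a downstream re-validation charge.
budget = expected_cost(token_cost=0.001, tool_calls=4,
                       tool_call_cost=0.004, failure_rate=0.4,
                       revalidation_cost=0.005)

print(f"premium: ${premium:.4f} per task")
print(f"budget:  ${budget:.4f} per task")
print(f"budget / premium: {budget / premium:.2f}x")
```

With these made-up inputs the budget model comes out more expensive per task despite its lower token price, because tool-call overhead, re-validation, and retries dominate the per-attempt cost.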