Model Distillation
Model distillation is a technique for creating a smaller, more efficient "student" model that approximates the behavior of a larger "teacher" model. The student is trained not on the original dataset but on the teacher's outputs (including its probability distributions over tokens), allowing it to capture much of the teacher's capability at a fraction of the computational cost. Distillation enables deploying powerful AI capabilities in resource-constrained environments.
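The idea of training on the teacher's probability distributions can be sketched with the classic distillation loss: a KL-divergence term against the teacher's temperature-softened distribution, blended with ordinary cross-entropy on the ground-truth label. This is a minimal pure-Python illustration, not a production training loop; the temperature and alpha values are illustrative defaults.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, hard_label,
                      temperature=2.0, alpha=0.5):
    """Blend of soft-target KL loss and hard-label cross-entropy.

    alpha weights the soft (teacher) term; (1 - alpha) the hard term.
    The T^2 factor keeps gradient magnitudes comparable across temperatures.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # KL(teacher || student) over the temperature-softened distributions
    soft_loss = sum(p * math.log(p / q)
                    for p, q in zip(teacher_probs, student_soft) if p > 0)
    # Standard cross-entropy against the one-hot ground-truth label
    student_hard = softmax(student_logits)
    hard_loss = -math.log(student_hard[hard_label])
    return alpha * (temperature ** 2) * soft_loss + (1 - alpha) * hard_loss

# A student whose logits track the teacher's incurs a small loss;
# a student that disagrees with the teacher incurs a larger one.
close = distillation_loss([2.0, 0.5, -1.0], [2.1, 0.4, -0.9], hard_label=0)
far = distillation_loss([-1.0, 0.5, 2.0], [2.1, 0.4, -0.9], hard_label=0)
```

A higher temperature flattens the teacher's distribution, exposing the relative probabilities it assigns to wrong answers ("dark knowledge"), which is where much of the transferred signal comes from.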
Example
A company uses GPT-4 to generate 100,000 high-quality customer support responses. They then fine-tune a much smaller 7B-parameter model on these responses. The distilled model handles 90% of support queries with comparable quality at 1/50th the inference cost, while the remaining 10% of complex queries are routed to the full GPT-4.
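The routing step in the example above can be sketched as a simple confidence gate: answer with the cheap distilled model when it is confident, and escalate to the teacher otherwise. The threshold, function names, and the toy answer functions below are all hypothetical placeholders; a real system would calibrate the threshold on a validation set.

```python
CONFIDENCE_THRESHOLD = 0.8  # hypothetical cutoff; tune on held-out queries

def route_query(query, student_answer_fn, teacher_answer_fn):
    """Serve the student's answer when confident; escalate otherwise.

    student_answer_fn returns (answer, confidence); teacher_answer_fn
    returns an answer. Both are placeholders for real model calls.
    """
    answer, confidence = student_answer_fn(query)
    if confidence >= CONFIDENCE_THRESHOLD:
        return answer, "student"
    return teacher_answer_fn(query), "teacher"

# Toy stand-ins for the two models:
def student(query):
    # Pretend the student is confident on short, routine queries only.
    return ("canned reply", 0.95 if len(query) < 40 else 0.3)

def teacher(query):
    return "carefully reasoned reply"

easy = route_query("reset my password", student, teacher)
hard = route_query("my invoice totals disagree across three billing periods",
                   student, teacher)
```

The economics in the example follow directly: if 90% of traffic clears the gate, average inference cost approaches the student's, while quality on hard queries is preserved by escalation.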