Gemini 2.5 Flash
Google's fast, cheap multimodal model with a 1M-token context window.
Gemini 2.5 Flash specifications
| Specification | Value |
|---|---|
| Provider | Google |
| Context window | 1,000,000 tokens (1M) |
| Input price | $0.10 / 1M tokens |
| Output price | $0.40 / 1M tokens |
| Inputs | Text, Images, Audio, Video |
| Reasoning mode | Yes |
| Vision (image input) | Yes |
| Tool / function calling | Yes |
| Native real-time web | No |
| Open-weight | No |
Pricing and context window web-verified June 2026; confirm on Google's pricing page before relying on these figures.
What Gemini 2.5 Flash is best for
- Cheap long-context processing
- High-volume multimodal tasks
- Latency-sensitive apps
How to prompt Gemini 2.5 Flash
- Hard to beat for cheap long-context work: a 1M-token window at a low per-token price.
- Use it for bulk multimodal extraction and triage.
- Escalate to Gemini Pro for the hardest reasoning.
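One way to read the escalation tip is as a "try the cheap model first" policy. The sketch below is only an illustration of that idea: `call_flash`, `call_pro`, and `is_good_enough` are hypothetical callables you would supply yourself (for example, thin wrappers around whatever client you use), and nothing here reflects an official Google API.

```python
from typing import Callable, Tuple


def answer_with_escalation(
    prompt: str,
    call_flash: Callable[[str], str],       # hypothetical wrapper around the Flash model
    call_pro: Callable[[str], str],         # hypothetical wrapper around the Pro model
    is_good_enough: Callable[[str], bool],  # hypothetical quality check on a draft
) -> Tuple[str, str]:
    """Return (model_label, answer), using Pro only when the Flash draft fails the check."""
    draft = call_flash(prompt)
    if is_good_enough(draft):
        return "flash", draft
    # The cheap draft was rejected, so pay for the stronger model once.
    return "pro", call_pro(prompt)


if __name__ == "__main__":
    # Stub models for demonstration only; real calls would go to an API client.
    label, answer = answer_with_escalation(
        "Summarize this contract.",
        call_flash=lambda p: "Short draft",
        call_pro=lambda p: "A more careful, longer answer",
        is_good_enough=lambda draft: len(draft) > 20,
    )
    print(label, "->", answer)
```

The quality check is the part that matters most in practice. A length test like the one in the demo is a placeholder, not a recommendation.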
Work with Gemini 2.5 Flash
- Build an optimized prompt with the Google prompt generator.
- Estimate cost with the token counter & cost calculator, or compare every model on the AI model comparison page.
- Related reading: Best Gemini Prompts 2026.
- Official: gemini.google.com
Frequently asked questions
What is Gemini 2.5 Flash's context window?
Gemini 2.5 Flash has a 1,000,000-token (1M) context window: the maximum amount of input, text and other media combined, that it can consider in a single request.
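As a rough illustration of what the window size means in practice, the sketch below checks whether a piece of text is likely to fit. It assumes about 4 characters per token, which is only a rule of thumb for English text and not the model's real tokenizer. The function names and the safety margin are hypothetical choices for this example, and an actual token count from the provider's tooling should replace the estimate before you rely on it.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # figure from the table on this page
CHARS_PER_TOKEN = 4                # assumption: rough rule of thumb for English text
SAFETY_MARGIN_TOKENS = 100_000     # assumption: headroom for instructions and estimate error


def rough_token_estimate(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def likely_fits(text: str) -> bool:
    """Return True if the estimated tokens fit within the window minus the margin."""
    budget = CONTEXT_WINDOW_TOKENS - SAFETY_MARGIN_TOKENS
    return rough_token_estimate(text) <= budget


if __name__ == "__main__":
    sample = "word " * 50_000  # 250,000 characters
    print(rough_token_estimate(sample), likely_fits(sample))
```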
How much does Gemini 2.5 Flash cost?
Via the Google API, Gemini 2.5 Flash costs $0.10 per 1M input tokens and $0.40 per 1M output tokens (web-verified June 2026). Output tokens cost four times as much as input tokens, so long replies add up quickly. For long-context jobs with short answers, the input side usually makes up most of the bill.
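The per-token prices above turn into a per-request cost with simple arithmetic. The sketch below uses the two prices quoted on this page as constants. The function name and the example token counts are made up for illustration, and the prices should be checked against Google's pricing page before use.

```python
# Prices quoted on this page, in USD per 1M tokens; confirm before relying on them.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical long-context job: 800,000 tokens in, 2,000 tokens out.
    print(f"${estimate_cost(800_000, 2_000):.4f}")  # prints $0.0808
```

In this example the input accounts for $0.08 of the $0.0808 total, which shows how a large prompt with a short answer is dominated by input cost even though output tokens are priced higher.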
Is Gemini 2.5 Flash multimodal?
Yes. Gemini 2.5 Flash accepts text, images, audio, and video as input, not just text.
Is Gemini 2.5 Flash open-weight?
No. Gemini 2.5 Flash is available only through the Google API, not as downloadable weights.