
Llama 4 Maverick (hosted)

Meta's open-weight, natively multimodal model with a 1M-token context.

Llama 4 Maverick (hosted) specifications

Provider: Meta
Context window: 1,000,000 tokens (1M)
Input price: $0.20 / 1M tokens
Output price: $0.60 / 1M tokens
Inputs: Text, Images
Reasoning mode: Yes
Vision (image input): Yes
Tool / function calling: Yes
Native real-time web: No
Open-weight: Yes
Knowledge cutoff: August 2024

Pricing and context were web-verified in June 2026; confirm on Meta's pricing page. The native context is 1M tokens, but many hosts serve less; price and context vary by host (Groq, Together, Fireworks).

What Llama 4 Maverick (hosted) is best for

  • Open-weight / self-hosted deployments
  • Long-context multimodal tasks
  • Cost-controlled scale (price varies by host)

How to prompt Llama 4 Maverick (hosted)

  • Open weights — run it yourself or via Groq/Together/Fireworks; price and context vary by host.
  • A strong language model with competent vision; not ideal for fine-grained spatial reasoning.
  • Native context is 1M tokens, but many hosts serve less — check your provider.
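Because the usable window depends on the host, it can help to check a request against the host's limit before sending it. The sketch below is a minimal illustration, not any provider's API: `fits_context` and the 128,000-token host limit are hypothetical, and it assumes the host counts prompt tokens and reserved output tokens against the same window.

```python
NATIVE_CONTEXT = 1_000_000  # native context from the spec table, in tokens


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 host_limit: int = NATIVE_CONTEXT) -> bool:
    """Return True if prompt plus reserved output fits the host's window.

    Assumption: the host counts input and output tokens against one shared
    limit. Check your provider's documentation for the actual rule.
    """
    return prompt_tokens + max_output_tokens <= host_limit


# Hypothetical host that serves a smaller window than the native 1M.
HYPOTHETICAL_HOST_LIMIT = 128_000

print(fits_context(300_000, 4_000))                           # native window
print(fits_context(300_000, 4_000, HYPOTHETICAL_HOST_LIMIT))  # smaller host window
```

A prompt that fits the native window can still be rejected or truncated by a host with a smaller limit, which is why the limit is a parameter rather than a constant.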

Frequently asked questions

What is Llama 4 Maverick (hosted)'s context window?

Llama 4 Maverick (hosted) has a native 1,000,000-token (1M) context window, the maximum amount of text it can consider at once. Many hosts serve less, so check your provider.

How much does Llama 4 Maverick (hosted) cost?

Via the Meta API, Llama 4 Maverick (hosted) costs $0.20 per 1M input tokens and $0.60 per 1M output tokens (web-verified June 2026). Output tokens cost three times as much as input tokens, so long replies raise the bill quickly, although a very long prompt can still make input the larger share.
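As a rough illustration of the arithmetic, the listed prices can be turned into a per-request estimate. This is a sketch using the figures quoted on this page; `estimate_cost` is a hypothetical helper, and actual prices vary by host.

```python
INPUT_PRICE_PER_M = 0.20   # USD per 1M input tokens (figure quoted on this page)
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens (figure quoted on this page)


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD at the listed prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# A long-context request: 100,000 prompt tokens, 2,000 reply tokens.
print(f"${estimate_cost(100_000, 2_000):.4f}")  # prints $0.0212
```

In this example the prompt accounts for $0.0200 of the $0.0212 total, so for long-context work the input side can outweigh the reply despite the lower per-token price.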

Is Llama 4 Maverick (hosted) multimodal?

Yes. Llama 4 Maverick (hosted) accepts both text and images as input.

Is Llama 4 Maverick (hosted) open-weight?

Yes — Llama 4 Maverick (hosted) is open-weight, so you can self-host it or run it through a third-party host.