Llama 4 Scout

Available

meta/llama-4-scout · by Meta · mixture-of-experts

Pricing: 1 offering

Together AI

Input tokens: $0.10 / 1M tokens (current, 2025-04-05 → present)

Output tokens: $0.30 / 1M tokens (current, 2025-04-05 → present)

This listing shows the active price and any recorded history. Full pricing history is available through the paid API; see the API docs.
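
As a rough illustration of how the per-token rates translate into a per-request cost, the sketch below multiplies token counts by the two listed prices. The function and constant names are hypothetical, and the code assumes the listed rates apply uniformly with no minimums, discounts, or other fees.

```python
from decimal import Decimal

# Prices copied from this listing (USD per 1M tokens, Together AI offering).
INPUT_PRICE_PER_M = Decimal("0.10")
OUTPUT_PRICE_PER_M = Decimal("0.30")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed rates.

    Hypothetical helper: assumes flat per-token billing and nothing else.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M / TOKENS_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / TOKENS_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # A prompt that fills the 10M-token context, plus a 2,000-token answer.
    print(estimate_cost_usd(10_000_000, 2_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without binary rounding noise when many requests are summed.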

Capability profile

Reasoning: moderate
Coding: moderate
Tool use: moderate
Instruction following: strong
Context window: strong
Multilingual: moderate
Speed: strong

Operator guidance

Route here specifically for workloads that need a very long context window and cannot afford frontier closed-model pricing. The 10M-token context is the primary differentiator. If your task doesn't need it, Llama 4 Maverick or GPT-4o mini are better-rounded alternatives at similar cost.
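
The routing advice above could be expressed as a simple size-based rule. This is a minimal sketch, not a recommended policy: the threshold value, the fallback identifier, and the function name are all assumptions made for illustration. Only the `meta/llama-4-scout` identifier and the 10M-token context figure come from this listing.

```python
SCOUT_MODEL_ID = "meta/llama-4-scout"   # identifier from this listing
SCOUT_CONTEXT_TOKENS = 10_000_000       # 10M-token context from this listing

# Hypothetical values: replace with your actual alternative model and the
# prompt size at which that alternative stops being viable.
FALLBACK_MODEL_ID = "fallback-model"
LONG_CONTEXT_THRESHOLD = 128_000


def choose_model(prompt_tokens: int,
                 threshold: int = LONG_CONTEXT_THRESHOLD) -> str:
    """Pick Scout only when the prompt is too long for the fallback."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens > SCOUT_CONTEXT_TOKENS:
        raise ValueError("prompt exceeds the 10M-token context window")
    if prompt_tokens > threshold:
        return SCOUT_MODEL_ID
    return FALLBACK_MODEL_ID


if __name__ == "__main__":
    print(choose_model(2_000_000))  # long prompt: routed to Scout
    print(choose_model(4_000))      # short prompt: routed to the fallback
```

A real router would likely also weigh the capability ratings and cost, but prompt length alone captures the listing's main point: Scout earns its place when context size is the constraint.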

Use cases

Ultra-long-context tasks: entire codebases, legal corpora, full books
Multimodal workflows (image + text) at minimal cost
Open-weights hosting where context length is the primary constraint

Limitations

Newer model, with fewer production deployments than the Llama 3.x series
Capability ratings are qualitative judgments, not results from a single cited benchmark run