Stable Audio 2.0
Available stability-ai/stable-audio-2 · by Stability AI · Diffusion transformer (DiT)
Pricing (1 offering)
Stability AI
Per generation (Professional plan)
- $0.20 / generation (current; 2024-04-01 → present)
Capability profile
- Audio quality: strong
- Style range: strong
- Vocal support: weak
- Duration ceiling: strong
- Stem export: weak
Benchmarks
| Benchmark | Score | Config | Source |
|---|---|---|---|
| CLAP score and FAD (Stability AI internal) | Not published | Stability AI reports improved CLAP and FAD scores over Stable Audio 1.0. No independent third-party benchmark comparing it against Suno v4 or Udio on a standardised dataset has been published. | Stability AI (self-reported) |
Operator guidance
The best option for long-form, high-fidelity instrumental audio. At $0.20 per generation (Professional plan), it costs nearly five times as much as MusicGen on Replicate (~$0.042) but produces significantly longer, higher-quality output. For vocals and lyrics, Suno v4 is the better choice. For open-weights self-hosting, MusicGen Large eliminates per-generation fees at scale.
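The price gap above can be checked with simple arithmetic. The sketch below uses only the two per-generation figures quoted on this page. The function names and the 100-generation volume are illustrative assumptions (100 matches the Professional plan's monthly credits listed under Limitations), and the MusicGen figure is approximate, so treat the results as rough estimates.

```python
# Per-generation prices quoted on this page (USD).
STABLE_AUDIO_PER_GEN = 0.20   # Professional plan
MUSICGEN_PER_GEN = 0.042      # approximate Replicate price


def monthly_cost(generations: int, price_per_generation: float) -> float:
    """Total USD cost of `generations` runs at a flat per-run price."""
    return round(generations * price_per_generation, 2)


def price_ratio(price_a: float, price_b: float) -> float:
    """How many times more expensive price_a is than price_b."""
    return price_a / price_b


# Hypothetical workload: 100 generations per month.
stable_audio_total = monthly_cost(100, STABLE_AUDIO_PER_GEN)
musicgen_total = monthly_cost(100, MUSICGEN_PER_GEN)
ratio = round(price_ratio(STABLE_AUDIO_PER_GEN, MUSICGEN_PER_GEN), 1)

print(f"Stable Audio 2.0: ${stable_audio_total:.2f}")
print(f"MusicGen (Replicate): ${musicgen_total:.2f}")
print(f"Price ratio: {ratio}x")
```

At this volume the comparison comes to about $20.00 versus $4.20 per month, a ratio of roughly 4.8. The calculation ignores output length and quality, which the guidance above treats as the main reasons to pay the higher price.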
Use cases
- Long-form background music (up to 3 minutes) for video and podcasts
- Sound design and ambient audio for games and installations
- High-fidelity instrumental generation at 44.1kHz
- Applications where output duration precision is required (timing conditioning)
Limitations
- Limited vocal generation — primarily instrumental and sound design
- No public API as of mid-2026; web app only
- Professional plan ($20/month, 100 credits) is necessary for regular use beyond the free tier
- No stem export