Azure Neural TTS
Available microsoft-azure/neural-tts · by Microsoft · neural-tts
Pricing: 1 offering
Microsoft Azure
Standard voices
- $4.00 / 1M characters (current, 2019-09-01 → present)
Neural voices
- $16.00 / 1M characters (current, 2019-09-01 → present)
Custom Neural Voice (trained)
- $24.00 / 1M characters (current, 2021-01-01 → present)
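The per-character rates above translate into a simple cost estimate. The following is a minimal sketch, assuming billing is strictly linear in the number of characters; it ignores any free tier, commitment discounts, or rules about which characters are billable. The tier keys and the function name `estimate_cost` are illustrative, not part of any Azure API.

```python
from decimal import Decimal

# USD per 1M characters, copied from the price list above.
# The dictionary keys are hypothetical labels chosen for this sketch.
PRICE_PER_MILLION = {
    "standard": Decimal("4.00"),
    "neural": Decimal("16.00"),
    "custom_neural": Decimal("24.00"),
}


def estimate_cost(characters: int, tier: str = "neural") -> Decimal:
    """Estimate synthesis cost in USD, assuming purely linear billing."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    rate = PRICE_PER_MILLION[tier]
    # Scale the per-million rate to the requested volume, rounded to cents.
    return (rate * characters / Decimal(1_000_000)).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for tier in PRICE_PER_MILLION:
        print(tier, estimate_cost(2_500_000, tier))
```

Under these assumptions, 2.5M characters on the Neural tier comes to $40.00, against $10.00 for Standard and $60.00 for Custom Neural Voice.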
Capability profile
- Voice naturalness: strong
- Language support: strong
- Voice variety: strong
- Streaming latency: moderate
- Cloning support: moderate
Benchmarks
| Benchmark | Score | Config | Source |
|---|---|---|---|
| MOS (naturalness, Azure en-US Aria; vendor-published) | — | Microsoft publishes its own internal MUSHRA and MOS results, which are favourable; no independent third-party MOS comparison has been published. | — |
Operator guidance
The natural choice for organisations already running Azure-native infrastructure. At $16.00 per 1M characters, the Neural tier is priced identically to Google Neural2 and Amazon Polly Neural, but Azure offers the widest language support (140+ languages) and speaking-style controls. For voice cloning, ElevenLabs and PlayHT are simpler to adopt because neither requires an enterprise contract. For real-time conversational AI, Cartesia Sonic has lower latency.
Use cases
- Enterprise TTS within Azure-native infrastructure and identity management
- Multilingual content production across 140+ languages
- Brand voice consistency with Custom Neural Voice training
- SSML-rich workflows with speaking styles and role play
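As an illustration of the SSML workflow with speaking styles, the sketch below builds an SSML document that wraps text in an `mstts:express-as` element. The helper name `build_styled_ssml` and its defaults are assumptions made for this example; whether a given style or role is accepted depends on the voice, so the `cheerful` style on `en-US-AriaNeural` should be checked against the current voice documentation.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_styled_ssml(
    text: str,
    voice: str = "en-US-AriaNeural",  # assumed voice name; verify availability
    style: str = "cheerful",          # assumed style; support varies per voice
    role: Optional[str] = None,       # optional role-play attribute
    lang: str = "en-US",
) -> str:
    """Return an SSML string with the text wrapped in a speaking style."""
    attrs = f"style={quoteattr(style)}"
    if role is not None:
        attrs += f" role={quoteattr(role)}"
    # escape() protects &, < and > in the spoken text;
    # quoteattr() quotes and escapes attribute values.
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:express-as {attrs}>{escape(text)}</mstts:express-as>"
        "</voice></speak>"
    )


if __name__ == "__main__":
    print(build_styled_ssml("Your order has shipped & is on its way!"))
```

The resulting string can be passed to an SSML-capable synthesis call, for example `speak_ssml_async` in the Azure Speech SDK for Python. Building the markup separately keeps style selection testable without network access.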
Limitations
- Custom Neural Voice requires a Microsoft contract and submission of voice training data
- Neural TTS is not available in all Azure regions
- Speaking styles are limited to a subset of voices; check the documentation for each voice