Microsoft CEO Satya Nadella confirmed that the company’s latest in-house AI accelerator, Maia 200, is now live in Azure, expanding Microsoft’s portfolio of CPUs, GPUs, and custom silicon available to cloud customers. Maia 200 is positioned as an inference-optimized accelerator designed to improve efficiency and cost effectiveness for large-scale AI workloads, with Microsoft citing up to 30% better performance per dollar compared with current systems.
Maia 200 targets modern low-precision inference, pairing high FP4 and FP8 throughput with a large pool of HBM3e memory and very high memory bandwidth. Bringing it online in Azure gives customers another option for deploying advanced AI workloads at scale, particularly inference-heavy ones where memory bandwidth, efficiency, and predictable scaling are critical.
Maia 200 key specifications (as disclosed by Microsoft):
- AI accelerator optimized for inference workloads
- >10 PFLOPS FP4 peak throughput
- ~5 PFLOPS FP8 peak throughput
- 216 GB HBM3e memory
- ~7 TB/s memory bandwidth
- Designed to deliver ~30% better performance per dollar versus current systems
- Integrated into Azure alongside CPUs, GPUs, and other custom accelerators
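
To put the disclosed figures in context, here is a rough back-of-the-envelope sketch in Python. It assumes a simple roofline model, treats the ">10" and "~" values as round numbers, and uses illustrative constant and function names that do not come from Microsoft. Peak throughput and bandwidth are vendor-quoted ceilings, so the results are upper bounds rather than measured performance.

```python
# Round-number versions of the figures listed above (assumed, not exact).
FP4_PEAK_FLOPS = 10e15               # ">10 PFLOPS" taken as 10 PFLOPS
FP8_PEAK_FLOPS = 5e15                # "~5 PFLOPS"
HBM_CAPACITY_BYTES = 216e9           # 216 GB HBM3e
HBM_BANDWIDTH_BYTES_PER_S = 7e12     # "~7 TB/s"
PERF_PER_DOLLAR_GAIN = 0.30          # "~30% better performance per dollar"


def ridge_point(peak_flops: float, bandwidth: float) -> float:
    """FLOPs per byte needed before a kernel stops being
    memory-bound in a simple roofline model."""
    return peak_flops / bandwidth


def full_memory_sweep_seconds(capacity_bytes: float, bandwidth: float) -> float:
    """Minimum time to read the whole HBM once at peak bandwidth."""
    return capacity_bytes / bandwidth


def relative_cost_for_same_work(gain: float) -> float:
    """Cost ratio for equal work, given a fractional perf-per-dollar gain."""
    return 1.0 / (1.0 + gain)


if __name__ == "__main__":
    fp4 = ridge_point(FP4_PEAK_FLOPS, HBM_BANDWIDTH_BYTES_PER_S)
    fp8 = ridge_point(FP8_PEAK_FLOPS, HBM_BANDWIDTH_BYTES_PER_S)
    sweep_ms = 1e3 * full_memory_sweep_seconds(
        HBM_CAPACITY_BYTES, HBM_BANDWIDTH_BYTES_PER_S
    )
    cost = relative_cost_for_same_work(PERF_PER_DOLLAR_GAIN)
    print(f"FP4 ridge point: {fp4:.0f} FLOPs/byte")
    print(f"FP8 ridge point: {fp8:.0f} FLOPs/byte")
    print(f"Full HBM sweep:  {sweep_ms:.1f} ms")
    print(f"Relative cost:   {cost:.2f}x")
```

Under these assumptions, a kernel needs on the order of 1,400 FLOPs per byte at FP4 (about 700 at FP8) to be limited by compute rather than memory, one full pass over the 216 GB of HBM takes roughly 31 ms, and a 30% performance-per-dollar gain corresponds to about 23% lower cost for the same amount of work.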