Converge Digest

Meta Deploys Tens of Millions of AWS Graviton5 Cores

The Rise of CPU Power in Agentic AI

Meta will expand its AI infrastructure footprint with a large-scale deployment of Amazon Web Services (AWS) Graviton5 processors, targeting the growing class of agentic AI workloads that demand high-performance CPU resources. The agreement positions Meta Platforms as one of the largest customers for Graviton-based infrastructure, with an initial rollout spanning tens of millions of cores and the flexibility to scale further.

The deployment underscores a shift in AI infrastructure design. While GPUs remain central to model training, Meta is scaling CPU capacity to support real-time inference, orchestration, and multi-step reasoning tasks associated with agentic AI systems. These workloads, which range from code generation to search and task coordination, require massive parallel processing and low-latency communication across distributed compute environments. AWS Graviton5, built on a 3nm process with 192 cores and a much larger cache (roughly 5x the prior generation, according to AWS), targets these requirements by improving inter-core communication and overall throughput.

The infrastructure stack integrates tightly with AWS services, including the Nitro System for hardware-level virtualization, Elastic Fabric Adapter (EFA) for low-latency networking, and support for bare-metal access. Meta also continues to leverage AWS’s broader AI platform, including Amazon Bedrock, as part of its evolving AI architecture. The result is a hybrid compute model that combines GPUs for training with purpose-built CPUs for large-scale inference and orchestration across billions of user interactions.

“This isn’t just about chips; it’s about giving customers the infrastructure foundation, as well as data and inference services, to build AI that understands, anticipates, and scales efficiently to billions of people worldwide,” said Nafea Bshara, vice president and distinguished engineer, Amazon.

🌐 Analysis: The rise of AWS Graviton reflects a decade-long strategy by Amazon Web Services to vertically integrate its infrastructure stack, beginning with silicon design and extending into full data center architecture. The trajectory shows how custom Arm-based CPUs evolved from a cost experiment into a cornerstone of hyperscale AI infrastructure.

AWS Graviton Development Timeline

AWS Graviton Processor Comparison: Graviton4 vs. Graviton5

| Feature | Graviton4 | Graviton5 |
| --- | --- | --- |
| Launch timeframe | 2023 | 2025–2026 (current generation) |
| CPU cores | 96 cores | 192 cores |
| Architecture | Arm Neoverse V2 | Custom Arm-based (AWS-designed) |
| Process node | 4nm (TSMC) | 3nm (TSMC) |
| Performance uplift | ~30% vs. Graviton3 (AWS-reported) | ~25% vs. Graviton4 (AWS-reported) |
| Cache | Large shared L3 cache (expanded vs. Graviton3) | Significantly larger aggregate cache (~5x vs. prior generation, AWS claim) |
| Memory | DDR5 with increased bandwidth | Enhanced DDR5, higher bandwidth and efficiency |
| AI / ML support | Optimized for ML inference, vector workloads | Optimized for agentic AI orchestration, real-time reasoning workloads |
| Primary workloads | Databases, analytics, HPC, scale-out services | Agentic AI, large-scale inference, distributed orchestration, real-time services |
| Design & manufacturing | AWS (Annapurna Labs) / TSMC | AWS (Annapurna Labs) / TSMC |
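To put the comparison figures in perspective, the short Python sketch below works through two back-of-the-envelope calculations: how many 192-core processors a given core count implies, and what the two AWS-reported generational uplifts amount to when compounded. It is only an illustration under stated assumptions. The article says "tens of millions" of cores without giving an exact figure, so the 10 million used here is a hypothetical lower bound, and the function names are invented for this example. The two uplift percentages may have been measured on different workloads, so the compounded number is a rough estimate rather than an AWS claim.

```python
import math

# Figures taken from the comparison table (AWS-reported, approximate).
CORES_PER_CHIP = 192        # Graviton5 core count
UPLIFT_G4_VS_G3 = 0.30      # Graviton4 vs. Graviton3
UPLIFT_G5_VS_G4 = 0.25      # Graviton5 vs. Graviton4


def chips_needed(total_cores: int, cores_per_chip: int = CORES_PER_CHIP) -> int:
    """Return the minimum number of processors that provide total_cores."""
    return math.ceil(total_cores / cores_per_chip)


def compounded_uplift(*uplifts: float) -> float:
    """Combine successive fractional uplifts into one overall fractional uplift."""
    factor = 1.0
    for uplift in uplifts:
        factor *= 1.0 + uplift
    return factor - 1.0


if __name__ == "__main__":
    # Hypothetical deployment size: the article does not state an exact core count.
    assumed_cores = 10_000_000
    print(f"{assumed_cores:,} cores -> at least {chips_needed(assumed_cores):,} chips")

    total = compounded_uplift(UPLIFT_G4_VS_G3, UPLIFT_G5_VS_G4)
    print(f"Rough Graviton5 vs. Graviton3 uplift: {total:.1%}")
```

Under these assumptions, every 10 million cores corresponds to a little over 52,000 fully populated 192-core processors, and the two reported uplifts compound to roughly 62% over Graviton3.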