NVIDIA introduced its BlueField-4 STX reference architecture at GTC 2026, targeting a fundamental bottleneck in AI infrastructure: storage performance for long-context, agentic AI workloads. The new design focuses on accelerating data access and contextual memory, enabling AI systems to sustain higher throughput and responsiveness as they scale across multi-step reasoning tasks. The announcement aligns with NVIDIA’s broader Vera Rubin platform strategy, extending optimization beyond compute and networking into the storage layer.
The STX architecture centers on a rack-scale implementation that integrates the new CMX context memory platform, designed to expand GPU memory with a high-performance storage tier optimized for inference and agentic workflows. By keeping data closer to GPUs and minimizing latency across the data path, NVIDIA reports up to 5x higher token throughput, 4x greater energy efficiency, and 2x faster data ingestion compared to conventional storage architectures. The system combines a storage-optimized BlueField-4 processor—integrating the Vera CPU and ConnectX-9 SuperNIC—with Spectrum-X Ethernet, DOCA software, and NVIDIA AI Enterprise to create a tightly coupled, full-stack architecture.
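The idea of backing GPU memory with a context storage tier can be illustrated with a toy sketch. This is not NVIDIA's implementation or any STX, CMX, or DOCA API. The class `TieredContextCache`, its methods, and the `recompute` callback are hypothetical names chosen for illustration, and plain Python dictionaries stand in for both tiers. The sketch only shows the general pattern: when the fast tier fills up, context is spilled to a larger tier rather than discarded, so it can later be reloaded instead of rebuilt.

```python
from collections import OrderedDict


class TieredContextCache:
    """Toy two-tier cache: a small fast tier backed by a larger slow tier.

    Hypothetical illustration only; both tiers are in-process dictionaries.
    """

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # stands in for GPU memory, kept in LRU order
        self.slow = {}             # stands in for the context storage tier
        self.recomputes = 0        # how often context had to be rebuilt

    def put(self, session_id, context):
        # Insert or refresh the entry, marking it most recently used.
        self.fast[session_id] = context
        self.fast.move_to_end(session_id)
        # Spill least recently used entries to the slow tier when over capacity.
        while len(self.fast) > self.fast_capacity:
            evicted_id, evicted = self.fast.popitem(last=False)
            self.slow[evicted_id] = evicted

    def get(self, session_id, recompute):
        # Fast-tier hit: refresh recency and return.
        if session_id in self.fast:
            self.fast.move_to_end(session_id)
            return self.fast[session_id]
        # Slow-tier hit: reload the context instead of rebuilding it.
        if session_id in self.slow:
            context = self.slow.pop(session_id)
        else:
            # Miss in both tiers: fall back to the expensive rebuild.
            self.recomputes += 1
            context = recompute(session_id)
        self.put(session_id, context)
        return context


cache = TieredContextCache(fast_capacity=2)
for sid in ("a", "b", "c"):
    cache.put(sid, f"context-{sid}")

# Session "a" was spilled to the slow tier, so it is reloaded, not rebuilt.
restored = cache.get("a", recompute=lambda s: f"rebuilt-{s}")
```

In this toy model, the benefit of the second tier shows up as a lower `recomputes` count. The performance figures NVIDIA cites depend on hardware and data-path details that the sketch does not model.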
Ecosystem support for STX spans hyperscalers, AI labs, storage vendors, and system manufacturers. Early adopters include CoreWeave, Crusoe, IREN, Lambda, Mistral AI, Nebius, Oracle Cloud Infrastructure, and Vultr, all targeting context memory storage for large-scale AI deployments. Storage partners such as Cloudian, DDN, Dell Technologies, Hitachi Vantara, HPE, IBM, MinIO, NetApp, Nutanix, VAST Data, and WEKA are co-developing STX-based platforms, while AIC, Supermicro, and Quanta Cloud Technology are building reference systems. NVIDIA expects partner platforms based on STX to become available in the second half of 2026.
- BlueField-4 STX targets storage bottlenecks in agentic AI and long-context inference
- CMX context memory platform extends GPU memory with a high-performance storage layer
- Up to 5x higher token throughput versus conventional storage architectures
- Up to 4x greater energy efficiency and 2x faster data ingestion
- Combines Vera CPU, ConnectX-9 SuperNIC, and Spectrum-X Ethernet in a unified stack
- Designed for real-time access to contextual data across inference, training, and analytics
- Broad ecosystem spanning cloud providers, AI labs, storage vendors, and system manufacturers
- Commercial availability expected through partners in the second half of 2026
“Agentic AI is redefining what software can do — and the computing infrastructure behind it must be reinvented to keep pace,” said Jensen Huang, founder and CEO of NVIDIA. “AI systems that reason across massive context and continuously learn require a new class of storage. NVIDIA STX reinvents the storage stack, providing a modular foundation for AI-native infrastructure that keeps AI factories operating at peak performance.”
🌐 Analysis: NVIDIA is extending its full-stack control strategy beyond GPUs and networking into storage, positioning context memory as a first-class resource in AI system design. This move parallels broader industry efforts, by hyperscalers and startups alike, to rearchitect data paths for inference-heavy workloads, where memory bandwidth, locality, and latency increasingly dictate overall system performance.