Converge Digest
Friday, June 19, 2026

CoreWeave Details Rack Engineering for NVIDIA Vera Rubin

June 19, 2026
in AI Infrastructure, Data Centers

CoreWeave disclosed the engineering work behind its early deployment and validation of NVIDIA’s next-generation Vera Rubin NVL72 platform, and says it is the first cloud provider to bring up the rack-scale AI system and complete diagnostic validation. The company outlined a series of hardware, networking, cooling, orchestration, and observability innovations designed to support large-scale agentic AI workloads built around trillion-parameter models, extended context windows, and continuously operating AI systems.

The NVIDIA Vera Rubin NVL72 platform integrates 72 Rubin GPUs, 36 Vera CPUs, ConnectX-9 SuperNICs, BlueField-4 DPUs, NVLink 6 switching, and support for both Quantum-X800 InfiniBand and Spectrum-X Ethernet fabrics. NVIDIA positions the architecture as delivering equivalent AI training performance using one-quarter the GPUs required by Blackwell systems and AI inference at one-tenth the cost per million tokens. CoreWeave said the platform required extensive re-engineering of power, liquid cooling, networking, storage, orchestration, and fleet management infrastructure before deployment.

A central theme of the deployment is treating an entire AI rack as a programmable cloud resource. CoreWeave introduced a patent-pending liquid cooling management system called Valvey, a rack management platform called Racky, and enhanced lifecycle orchestration through its Mission Control software stack. The company also highlighted its deployment of NVIDIA’s liquid-cooled Spectrum-X SN6600 Ethernet switches, support for both InfiniBand and RoCE fabrics, topology-aware Kubernetes scheduling, local storage acceleration, and large-scale observability tools designed to maximize GPU utilization and cluster reliability.
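The idea behind topology-aware scheduling can be illustrated with a minimal sketch. It assumes each rack is one 72-GPU NVLink domain, as described above, and uses a simple best-fit rule. The `Rack` type, the `place_job` function, and the placement policy are hypothetical illustrations, not CoreWeave's scheduler or any Kubernetes API.

```python
from dataclasses import dataclass

GPUS_PER_NVLINK_DOMAIN = 72  # one NVL72 rack, per the article


@dataclass
class Rack:
    name: str       # hypothetical rack identifier
    free_gpus: int  # GPUs currently unallocated in this NVLink domain


def place_job(racks: list[Rack], gpus_needed: int) -> dict[str, int]:
    """Return {rack name: GPUs taken}, preferring a single NVLink domain."""
    if gpus_needed <= 0:
        raise ValueError("gpus_needed must be positive")
    if sum(r.free_gpus for r in racks) < gpus_needed:
        raise RuntimeError("not enough free GPUs in the cluster")

    # Best fit: the rack with the least free capacity that still holds the whole job.
    fitting = [r for r in racks if r.free_gpus >= gpus_needed]
    if fitting:
        best = min(fitting, key=lambda r: r.free_gpus)
        return {best.name: gpus_needed}

    # Otherwise span as few domains as possible by drawing on the racks
    # with the most free GPUs first.
    placement: dict[str, int] = {}
    remaining = gpus_needed
    for rack in sorted(racks, key=lambda r: r.free_gpus, reverse=True):
        if remaining == 0:
            break
        take = min(rack.free_gpus, remaining)
        if take:
            placement[rack.name] = take
            remaining -= take
    return placement


if __name__ == "__main__":
    cluster = [Rack("rack-a", 72), Rack("rack-b", 16), Rack("rack-c", 40)]
    print(place_job(cluster, 12))   # fits inside one domain
    print(place_job(cluster, 100))  # must span more than one domain
```

Choosing the tightest-fitting rack for small jobs leaves fully free racks available for jobs that need a whole NVLink domain, which is one way a scheduler can limit the fragmentation the article mentions.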

NVIDIA Vera Rubin NVL72 on CoreWeave Cloud
Key architectural elements and infrastructure innovations disclosed by CoreWeave (June 2026)
Deployment Milestone: CoreWeave says it became the first cloud provider to validate and successfully run diagnostics on NVIDIA Vera Rubin NVL72.
GPU Configuration: 72 NVIDIA Rubin GPUs integrated into a single rack-scale NVL72 architecture.
CPU Complex: 36 NVIDIA Vera CPUs providing host processing and system orchestration.
Scale-Up Fabric: NVIDIA NVLink 6 switching architecture enables high-bandwidth communication across all GPUs within the rack.
Scale-Out Fabrics: Supports both NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum-X Ethernet with RoCE, allowing customers to choose their preferred cluster interconnect.
Network Interfaces: NVIDIA ConnectX-9 SuperNICs provide scale-out connectivity across both InfiniBand and Ethernet fabrics.
DPU Layer: BlueField-4 DPUs provide infrastructure acceleration and data movement services.
Ethernet Switching: Liquid-cooled NVIDIA Spectrum-X SN6600 Ethernet switches deliver 102.4 Tbps of switching capacity across 128 ports operating at up to 800 Gbps each.
Network Architecture: CoreWeave designed a multi-rail, multi-plane non-blocking fabric architecture that can scale incrementally by adding spine switches.
Bandwidth per GPU: Each ConnectX-9 interface supports up to 800 Gbps per port, delivering as much as 1.6 Tbps of backend network bandwidth per GPU.
Maximum Scale: CoreWeave says the architecture can support deployments exceeding 120,000 GPUs while maintaining a non-blocking topology.
Valvey Cooling System: Patent-pending programmable liquid-cooling valve assembly that monitors and controls flow rate, temperature, pressure, leak detection, maintenance isolation, and emergency shutdown functions.
Racky Management Layer: Unified rack manager that aggregates power, cooling, environmental, leak detection, flow, and temperature telemetry into a single control surface.
Mission Control: CoreWeave’s operational platform integrates Racky, Valvey, lifecycle controllers, observability, telemetry, fleet management, unhealthy-node replacement, and GPU performance monitoring.
AI Orchestration: CoreWeave Kubernetes Service (CKS) uses topology-aware scheduling to keep workloads within high-bandwidth NVLink domains whenever possible.
Dynamic Resource Allocation: SUNK dynamically reallocates GPU resources between training and inference workloads and reduces cluster fragmentation as workloads evolve.
Storage Acceleration: LOTA (Local Object Transport Accelerator) stages training datasets locally to reduce object-storage bottlenecks and improve GPU utilization.
Storage Platform: Features liquid-cooled NVMe storage using Micron 7600 SSDs designed to maintain throughput under sustained thermal loads.
Server Platform: Built on Dell Technologies PowerEdge XE9812 servers integrated into CoreWeave’s accelerated AI infrastructure stack.
NVIDIA Performance Claims: NVIDIA says Vera Rubin can deliver equivalent AI training with one-quarter the GPUs required by Blackwell systems and AI inference at one-tenth the cost per million tokens.
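
The networking figures in the table can be cross-checked with simple arithmetic. The sketch below recomputes the switch capacity and per-GPU bandwidth from the port counts and speeds quoted above, then applies the textbook capacity formula for a non-blocking folded-Clos (fat-tree) fabric. Two things are assumptions rather than facts from the article: that the 1.6 Tbps per-GPU figure comes from two 800 Gbps ports, and that a generic fat-tree is a fair stand-in for one plane of the fabric. The article does not say how CoreWeave's multi-rail, multi-plane design maps onto switch tiers.

```python
PORTS_PER_SWITCH = 128  # SN6600 port count, from the article
PORT_SPEED_GBPS = 800   # per-port speed, from the article
NIC_PORTS_PER_GPU = 2   # assumption: implied by 1.6 Tbps / 800 Gbps


def aggregate_tbps(ports: int, gbps_per_port: int) -> float:
    """Total bandwidth in Tbps for a number of ports at a given speed."""
    return ports * gbps_per_port / 1000


def max_endpoints(radix: int, tiers: int) -> int:
    """Endpoint ports in a non-blocking fat-tree built from radix-port switches.

    Standard result: 2 * (radix / 2) ** tiers, i.e. radix for one switch,
    radix**2 / 2 for two tiers, and radix**3 / 4 for three tiers.
    """
    if tiers < 1 or radix < 2 or radix % 2:
        raise ValueError("need tiers >= 1 and an even radix >= 2")
    return 2 * (radix // 2) ** tiers


if __name__ == "__main__":
    print(aggregate_tbps(PORTS_PER_SWITCH, PORT_SPEED_GBPS))   # 102.4
    print(aggregate_tbps(NIC_PORTS_PER_GPU, PORT_SPEED_GBPS))  # 1.6
    print(max_endpoints(PORTS_PER_SWITCH, 2))                  # 8192
    print(max_endpoints(PORTS_PER_SWITCH, 3))                  # 524288
```

Under these assumptions, a single two-tier plane of 128-port switches tops out at 8,192 endpoint ports, so a non-blocking fabric for more than 120,000 GPUs would need a third tier, additional planes, port breakout, or some combination of these.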

“CoreWeave has delivered highly performant clusters with full cluster observability and a support team that engages deeply on hard problems, giving us the confidence to partner with them on Vera Rubin,” said Craig Falls, Head of Quantitative Research at Jane Street.

🌐 Analysis

CoreWeave’s disclosure provides one of the first detailed looks at the operational challenges associated with deploying NVIDIA’s Vera Rubin generation. The industry discussion around Rubin has largely focused on GPU performance, but CoreWeave highlights a broader reality: power distribution, liquid cooling, rack management, storage throughput, and fabric architecture increasingly determine overall AI system efficiency. As AI infrastructure scales from thousands to tens of thousands of accelerators, operational tooling and fleet management become as important as raw compute performance.

The announcement also reflects a growing trend toward rack-scale architectures. NVIDIA’s Rubin roadmap pushes more intelligence into tightly integrated systems where GPUs, CPUs, networking, and cooling operate as a single platform. Similar efforts are underway across the industry from NVIDIA ecosystem partners including Dell Technologies, Supermicro, HPE, Lenovo, and major cloud providers. CoreWeave’s emphasis on software-defined cooling, rack-level control, and multi-plane networking illustrates how competitive differentiation among AI cloud providers is increasingly shifting beyond GPU procurement toward infrastructure engineering and operational efficiency.

CoreWeave Profile

NASDAQ: CRWV

CoreWeave provides cloud infrastructure built for accelerated computing and AI workloads, including GPU-based services for AI training, inference, rendering, and high-performance computing.

Headquarters: Livingston, New Jersey
CEO: Michael Intrator
Founded: 2017
Primary Focus: AI Cloud Infrastructure, GPU Computing, AI Training & Inference
Coverage: AI Infrastructure • GPU Clouds • AI Training • AI Inference • NVIDIA GPUs • Data Centers • Cloud Computing • High-Performance Computing
Tags: CoreWeave, Dell

Jim Carroll

Editor and Publisher, Converge! Network Digest, Optical Networks Daily - Covering the full stack of network convergence from Silicon Valley


© 2026 Converge Digest - A private dossier for networking and telecoms.
