CoreWeave Details Rack Engineering for NVIDIA Vera Rubin

Jim Carroll

6 hours ago

CoreWeave disclosed the engineering work behind its early deployment and validation of NVIDIA’s next-generation Vera Rubin NVL72 platform, and said it is the first cloud provider to bring up the rack-scale AI system and complete diagnostic validation. The company outlined a series of hardware, networking, cooling, orchestration, and observability innovations designed to support large-scale agentic AI workloads built around trillion-parameter models, extended context windows, and continuously operating AI systems.

The NVIDIA Vera Rubin NVL72 platform integrates 72 Rubin GPUs, 36 Vera CPUs, ConnectX-9 SuperNICs, BlueField-4 DPUs, NVLink 6 switching, and support for both Quantum-X800 InfiniBand and Spectrum-X Ethernet fabrics. NVIDIA positions the architecture as delivering equivalent AI training performance using one-quarter the GPUs required by Blackwell systems and AI inference at one-tenth the cost per million tokens. CoreWeave said the platform required extensive re-engineering of power, liquid cooling, networking, storage, orchestration, and fleet management infrastructure before deployment.

A central theme of the deployment is treating an entire AI rack as a programmable cloud resource. CoreWeave introduced a patent-pending liquid cooling management system called Valvey, a rack management platform called Racky, and enhanced lifecycle orchestration through its Mission Control software stack. The company also highlighted its deployment of NVIDIA’s liquid-cooled Spectrum-X SN6600 Ethernet switches, support for both InfiniBand and RoCE fabrics, topology-aware Kubernetes scheduling, local storage acceleration, and large-scale observability tools designed to maximize GPU utilization and cluster reliability.
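The topology-aware scheduling mentioned above can be pictured with a small placement routine. This is a minimal sketch, not CoreWeave's scheduler or any CKS API: the function name `place_job`, the rack labels, and the best-fit policy are all hypothetical, chosen only to show the idea of keeping a job inside one 72-GPU NVLink domain when it fits and spanning as few domains as possible when it does not.

```python
from typing import Dict, Optional

NVL72_DOMAIN_SIZE = 72  # GPUs per NVLink domain (one NVL72 rack), per the article


def place_job(free_gpus: Dict[str, int], gpus_needed: int) -> Optional[Dict[str, int]]:
    """Return a {domain: gpu_count} placement, or None if the job cannot fit.

    Hypothetical policy (an assumption, not a documented algorithm):
    1. If any single NVLink domain can hold the job, pick the one that
       leaves the fewest GPUs stranded (best fit).
    2. Otherwise take GPUs from the domains with the most free capacity
       first, which minimizes the number of domains the job spans.
    """
    if gpus_needed <= 0:
        raise ValueError("gpus_needed must be positive")

    # Case 1: the job fits inside one NVLink domain.
    fits = [(free, name) for name, free in free_gpus.items() if free >= gpus_needed]
    if fits:
        _, name = min(fits)  # smallest sufficient domain
        return {name: gpus_needed}

    # Case 2: the job must span domains; bail out if the cluster is too small.
    if sum(free_gpus.values()) < gpus_needed:
        return None

    placement: Dict[str, int] = {}
    remaining = gpus_needed
    for name, free in sorted(free_gpus.items(), key=lambda kv: (-kv[1], kv[0])):
        if remaining == 0:
            break
        if free == 0:
            continue
        take = min(free, remaining)
        placement[name] = take
        remaining -= take
    return placement


# Example with made-up rack names and free-GPU counts.
cluster = {"rack-a": NVL72_DOMAIN_SIZE, "rack-b": 16, "rack-c": 40}
small_job = place_job(cluster, 16)   # stays inside one domain
large_job = place_job(cluster, 100)  # must span two domains
```

In this example the 16-GPU job lands entirely on `rack-b`, leaving the full rack free for larger jobs, while the 100-GPU job spans `rack-a` and `rack-c`. A production scheduler would also weigh fabric locality between racks, health, and preemption, none of which is modeled here.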

NVIDIA Vera Rubin NVL72 on CoreWeave Cloud

Key architectural elements and infrastructure innovations disclosed by CoreWeave (June 2026)

Deployment Milestone: CoreWeave says it became the first cloud provider to validate and successfully run diagnostics on NVIDIA Vera Rubin NVL72.

GPU Configuration: 72 NVIDIA Rubin GPUs integrated into a single rack-scale NVL72 architecture.

CPU Complex: 36 NVIDIA Vera CPUs providing host processing and system orchestration.

Scale-Up Fabric: NVIDIA NVLink 6 switching architecture enables high-bandwidth communication across all GPUs within the rack.

Scale-Out Fabrics: Supports both NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum-X Ethernet with RoCE, allowing customers to choose their preferred cluster interconnect.

Network Interfaces: NVIDIA ConnectX-9 SuperNICs provide scale-out connectivity across both InfiniBand and Ethernet fabrics.

DPU Layer: BlueField-4 DPUs provide infrastructure acceleration and data movement services.

Ethernet Switching: Liquid-cooled NVIDIA Spectrum-X SN6600 Ethernet switches deliver 102.4 Tbps of switching capacity across 128 ports operating at up to 800 Gbps each.

Network Architecture: CoreWeave designed a multi-rail, multi-plane non-blocking fabric architecture that can scale incrementally by adding spine switches.

Bandwidth per GPU: Each ConnectX-9 interface supports up to 800 Gbps per port, delivering as much as 1.6 Tbps of backend network bandwidth per GPU.

Maximum Scale: CoreWeave says the architecture can support deployments exceeding 120,000 GPUs while maintaining a non-blocking topology.

Valvey Cooling System: Patent-pending programmable liquid-cooling valve assembly that monitors and controls flow rate, temperature, pressure, leak detection, maintenance isolation, and emergency shutdown functions.

Racky Management Layer: Unified rack manager that aggregates power, cooling, environmental, leak detection, flow, and temperature telemetry into a single control surface.

Mission Control: CoreWeave’s operational platform integrates Racky, Valvey, lifecycle controllers, observability, telemetry, fleet management, unhealthy-node replacement, and GPU performance monitoring.

AI Orchestration: CoreWeave Kubernetes Service (CKS) uses topology-aware scheduling to keep workloads within high-bandwidth NVLink domains whenever possible.

Dynamic Resource Allocation: SUNK dynamically reallocates GPU resources between training and inference workloads and reduces cluster fragmentation as workloads evolve.

Storage Acceleration: LOTA (Local Object Transport Accelerator) stages training datasets locally to reduce object-storage bottlenecks and improve GPU utilization.

Storage Platform: Features liquid-cooled NVMe storage using Micron 7600 SSDs designed to maintain throughput under sustained thermal loads.

Server Platform: Built on Dell Technologies PowerEdge XE9812 servers integrated into CoreWeave’s accelerated AI infrastructure stack.

NVIDIA Performance Claims: NVIDIA says Vera Rubin can deliver equivalent AI training with one-quarter the GPUs required by Blackwell systems and AI inference at one-tenth the cost per million tokens.
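The bandwidth figures in the summary above can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the article (128 ports, 800 Gbps per port, 72 GPUs per rack, 1.6 Tbps per GPU). Two items are assumptions rather than disclosed facts: that the 1.6 Tbps per-GPU figure corresponds to two 800 Gbps ports, and the rack-level aggregate, which is derived here and is not a number CoreWeave published.

```python
PORT_SPEED_GBPS = 800      # per-port speed quoted for the SN6600 and ConnectX-9
SWITCH_PORTS = 128         # SN6600 port count quoted in the article
GPUS_PER_RACK = 72         # Rubin GPUs per NVL72 rack
PORTS_PER_GPU = 2          # assumption: 1.6 Tbps per GPU / 800 Gbps per port


def gbps_to_tbps(gbps: int) -> float:
    """Convert gigabits per second to terabits per second (decimal units)."""
    return gbps / 1000


def switch_capacity_tbps() -> float:
    """Aggregate switch capacity: ports x per-port speed."""
    return gbps_to_tbps(SWITCH_PORTS * PORT_SPEED_GBPS)


def per_gpu_backend_tbps() -> float:
    """Backend bandwidth per GPU under the two-port assumption."""
    return gbps_to_tbps(PORTS_PER_GPU * PORT_SPEED_GBPS)


def rack_backend_tbps() -> float:
    """Derived (not published) scale-out bandwidth leaving one NVL72 rack."""
    return gbps_to_tbps(GPUS_PER_RACK * PORTS_PER_GPU * PORT_SPEED_GBPS)


print(switch_capacity_tbps())   # 102.4, matching the quoted switch capacity
print(per_gpu_backend_tbps())   # 1.6, matching the quoted per-GPU figure
print(rack_backend_tbps())      # 115.2, derived aggregate for 72 GPUs
```

Under these assumptions, the derived 115.2 Tbps of backend bandwidth per rack exceeds the 102.4 Tbps capacity of a single SN6600, so one rack's scale-out traffic could not terminate on a single switch at full rate. That is consistent with the multi-rail, multi-plane fabric described above, though the article does not spell out the actual port mapping.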

“CoreWeave has delivered highly performant clusters with full cluster observability and a support team that engages deeply on hard problems, giving us the confidence to partner with them on Vera Rubin,” said Craig Falls, Head of Quantitative Research at Jane Street.

🌐 Analysis

CoreWeave’s disclosure provides one of the first detailed looks at the operational challenges associated with deploying NVIDIA’s Vera Rubin generation. The industry discussion around Rubin has largely focused on GPU performance, but CoreWeave highlights a broader reality: power distribution, liquid cooling, rack management, storage throughput, and fabric architecture increasingly determine overall AI system efficiency. As AI infrastructure scales from thousands to tens of thousands of accelerators, operational tooling and fleet management become as important as raw compute performance.

The announcement also reflects a growing trend toward rack-scale architectures. NVIDIA’s Rubin roadmap pushes more intelligence into tightly integrated systems where GPUs, CPUs, networking, and cooling operate as a single platform. Similar efforts are underway among NVIDIA ecosystem partners, including Dell Technologies, Supermicro, HPE, and Lenovo, and at major cloud providers. CoreWeave’s emphasis on software-defined cooling, rack-level control, and multi-plane networking illustrates how competitive differentiation among AI cloud providers is shifting beyond GPU procurement toward infrastructure engineering and operational efficiency.

CoreWeave Profile

NASDAQ: CRWV

CoreWeave provides cloud infrastructure built for accelerated computing and AI workloads, including GPU-based services for AI training, inference, rendering, and high-performance computing.

Headquarters: Livingston, New Jersey

CEO: Michael Intrator

Founded: 2017

Primary Focus: AI Cloud Infrastructure, GPU Computing, AI Training & Inference

Coverage: AI Infrastructure • GPU Clouds • AI Training • AI Inference • NVIDIA GPUs • Data Centers • Cloud Computing • High-Performance Computing