
On June 1, 2026, in Livingston, N.J., CoreWeave announced it had completed the bring‑up and system‑level validation of NVIDIA Vera Rubin NVL72 on CoreWeave Cloud, becoming the first AI cloud provider to do so. The company says the work extends platform support for NVIDIA’s newest hardware and validates the rack‑scale architecture required to run Vera Rubin in production. This step matters to AI teams that need higher inference throughput and efficiency to move models from lab tests into continuous, production use.
Each validated Vera Rubin rack pairs 72 NVIDIA Rubin GPUs with 36 NVIDIA Vera CPUs, all linked by a 260 TB/s, sixth‑generation NVIDIA NVLink fabric. Citing vendor benchmarks, CoreWeave says the configuration can deliver up to 10× better inference per watt, require as few as one‑fourth the GPUs, and cut the cost per million tokens to roughly one‑tenth, all compared with NVIDIA Blackwell on comparable inference workloads.
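For a rough sense of scale, the rack figures can be reduced to per‑GPU numbers. The sketch below is plain arithmetic on the quoted figures. It assumes the 260 TB/s NVLink number is a rack‑level aggregate that divides evenly across the 72 GPUs, which the announcement does not state, and the function names are illustrative only.

```python
# Figures quoted in the announcement.
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36
NVLINK_AGGREGATE_TBYTES = 260.0  # TB/s, rack-level NVLink figure


def gpus_per_cpu(gpus: int = GPUS_PER_RACK, cpus: int = CPUS_PER_RACK) -> float:
    """Ratio of Rubin GPUs to Vera CPUs in one rack."""
    return gpus / cpus


def nvlink_share_per_gpu(
    aggregate: float = NVLINK_AGGREGATE_TBYTES, gpus: int = GPUS_PER_RACK
) -> float:
    """Per-GPU share of NVLink bandwidth in TB/s.

    Assumption (not from the announcement): the aggregate figure
    splits evenly across all GPUs in the rack.
    """
    return aggregate / gpus


if __name__ == "__main__":
    print(f"GPUs per CPU: {gpus_per_cpu():.0f}")
    print(f"NVLink share per GPU: {nvlink_share_per_gpu():.2f} TB/s")
```

Under that even‑split assumption, each GPU's share works out to roughly 3.6 TB/s, with two GPUs for every CPU.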
CoreWeave frames the bring‑up around the rising infrastructure demands of agentic AI: models approaching a trillion parameters, context windows stretching to millions of tokens, and persistent reasoning sessions. Together, the company argues, these make inference throughput and efficiency the primary scaling constraints for builders and operators. Validating a full rack architecture is intended to address those constraints at production scale rather than in isolated, lab‑scale benchmarks.
To operate Vera Rubin at rack scale, CoreWeave added purpose‑built controls in its Mission Control suite. Valvey, a programmable per‑rack valve assembly, turns liquid cooling into a software‑defined control plane that monitors flow, temperature, pressure and leaks in real time, and that enables automated isolation, emergency shutdown and maintenance without disrupting adjacent racks.
The company also updated the physical and network stack. Racky, a unified rack control appliance, aggregates power, cooling and environmental sensors so each rack can be managed as a cloud resource. A multi‑rail, multi‑plane networking design supports NVIDIA Quantum‑X800 InfiniBand and NVIDIA Spectrum‑X Ethernet with RoCE; CoreWeave describes a non‑blocking fabric delivering up to 1.6 Tb/s of backend bandwidth per GPU and an architecture that can scale to hundreds of thousands of GPUs across two network tiers.
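The backend figure is quoted in terabits per second (Tb/s), while the NVLink figure is in terabytes per second (TB/s), so the two are easy to misread side by side. The sketch below converts the units and totals the backend bandwidth for one rack. It assumes every one of the 72 GPUs in a rack reaches the quoted "up to" 1.6 Tb/s, which is an upper bound rather than a measured result.

```python
GPUS_PER_RACK = 72          # from the rack description in the announcement
BACKEND_TBIT_PER_GPU = 1.6  # Tb/s (terabits), quoted as an "up to" figure


def rack_backend_tbit(
    gpus: int = GPUS_PER_RACK, per_gpu: float = BACKEND_TBIT_PER_GPU
) -> float:
    """Upper-bound backend bandwidth for one rack, in Tb/s."""
    return gpus * per_gpu


def tbit_to_tbyte(tbit: float) -> float:
    """Convert terabits per second to terabytes per second (8 bits per byte)."""
    return tbit / 8


if __name__ == "__main__":
    total = rack_backend_tbit()
    print(f"Per GPU:  {tbit_to_tbyte(BACKEND_TBIT_PER_GPU):.1f} TB/s")
    print(f"Per rack: {total:.1f} Tb/s = {tbit_to_tbyte(total):.1f} TB/s")
```

On these assumptions, 1.6 Tb/s per GPU is 0.2 TB/s, and a full rack tops out at about 115.2 Tb/s, or 14.4 TB/s, of backend bandwidth.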
CoreWeave also highlights secure, multi‑tenant operations, with NVIDIA BlueField‑4 DPUs used to accelerate infrastructure services, lower latency and strengthen tenant isolation at scale. Craig Falls, head of Quantitative Research at Jane Street, said CoreWeave’s cluster observability and support speed up researchers’ iteration cycles. Chen Goldberg, EVP of Product & Engineering at CoreWeave, said the agentic era requires deeper engineering to move from lab performance to production reliability.