CoreWeave adds Sandboxes to run RL, agent tool use and model evaluation: what developers gain

5/16/2026, 3:49:20 AM

CoreWeave announced CoreWeave Sandboxes on May 14, 2026, an execution layer that provides secure, isolated runtimes for reinforcement learning (RL), AI agent tool use, and model evaluation. It is offered in two forms: inside a customer’s own CoreWeave infrastructure, where it runs directly in the customer’s CoreWeave Kubernetes Service (CKS) cluster, and as a serverless runtime delivered through Weights & Biases (W&B). Sandboxes can be accessed from the Cloud Console or a Python SDK, and it includes built‑in session management, storage integration, and monitoring to handle concurrent jobs and back‑and‑forth task patterns.

CoreWeave positions Sandboxes to address a common operational shortfall: organizations often lack a unified execution layer for RL, agent workflows, and model evaluation and instead rely on custom systems, loosely integrated tools, or third‑party sandboxes that sit outside core infrastructure. Those disconnected approaches become harder to govern, scale, and observe as concurrency and workflow complexity grow, increasing operational burden for platform teams.

The serverless path through W&B is aimed at researchers and requires authentication with an existing W&B API key plus a pip install of the Python client. Every sandbox runs in its own fully isolated virtual environment by default, so failures, memory spikes, or runaway processes in one sandbox cannot affect another. Activity from sandboxes is captured in the same W&B run view as training metrics, centralizing logs and metrics for debugging and observability.
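The announcement does not show the SDK itself, so the isolation property described above can only be illustrated with a generic stand-in. The sketch below is not the CoreWeave or W&B client. It uses only the Python standard library and treats each operating-system process as a "sandbox", which is much weaker isolation than a dedicated virtual environment. The names `run_isolated` and `run_batch` are hypothetical and chosen for illustration. It shows the idea that a crash or a runaway task in one unit of work is contained and reported without disturbing the others.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_isolated(code: str, timeout_s: float = 10.0) -> dict:
    """Run a code snippet in its own Python process and report the outcome.

    Hypothetical helper: a separate process stands in for a sandbox here.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,  # collect stdout/stderr instead of inheriting them
            text=True,
            timeout=timeout_s,    # the child is killed if it runs past this limit
        )
    except subprocess.TimeoutExpired:
        # A runaway task is stopped and reported; the caller keeps running.
        return {"ok": False, "stdout": "", "returncode": None}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout.strip(),
        "returncode": proc.returncode,
    }


def run_batch(snippets, timeout_s: float = 10.0, max_workers: int = 4) -> list:
    """Run several snippets concurrently, one process each, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda s: run_isolated(s, timeout_s), snippets))


if __name__ == "__main__":
    results = run_batch([
        "print(1 + 1)",          # succeeds
        "raise SystemExit(3)",   # fails with exit code 3
        "print('still fine')",   # unaffected by the failure before it
    ])
    for result in results:
        print(result)
```

In this sketch a failing snippet only changes its own result entry. A real sandbox service would add the pieces the article attributes to Sandboxes, such as per-sandbox container images, resource boundaries, and centralized logging, none of which are modeled here.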

For platform teams running on CKS, Sandboxes lets organizations run RL, agent tool use, and evaluation workloads alongside other AI jobs without adding a separate execution stack. CoreWeave emphasizes that the service behaves like a governed, observable part of existing infrastructure, intended to reduce the operational overhead and complexity of building and maintaining custom execution systems.

Customer and executive comments underline the scale and intent behind the product. Brian Belgodere, senior technical staff member, AI/ML Systems, IBM Research, said, “Our reinforcement learning workflows spin up thousands of sandboxes in parallel per training step, each with its own container image and resource boundaries.” Chen Goldberg, EVP, Product and Engineering at CoreWeave, framed Sandboxes as closing the execution gap in reinforcement learning and agent workflows.

Sources

CoreWeave Newsroom · 5/14/2026
