NVIDIA launches Cosmos 3, 32‑billion‑parameter Alpamayo 2 Super and open humanoid reference at GTC Taipei

6/1/2026, 1:57:14 PM

NVIDIA used GTC Taipei to unveil a coordinated push into “physical AI,” announcing three linked offerings designed for robots, autonomous vehicles and video systems: Cosmos 3, a next‑generation multimodal world model; Alpamayo 2 Super, a scaled driving model for Level 4 robotaxis; and the Isaac GR00T Reference Humanoid Robot, an open hardware and software baseline. The move signals an effort to provide end‑to‑end tooling that lets developers train large teacher models and distill them into production‑grade stacks.

Cosmos 3 is presented as an open omnimodel that ingests text, images, video, ambient audio and action data in a single system. NVIDIA highlighted three concrete applications: a vision‑language video analyzer (Linker Vision is using it to detect traffic anomalies), a world model that can synthesize photorealistic video of rare events, and a world‑action model that outputs numerical motion data such as joint angles or gripper positions (Agile Robots is demonstrating pick‑and‑place learning).

Architecturally, Cosmos 3 applies a mixture‑of‑transformers approach: a reasoning transformer first analyzes a scene, then a generator transformer produces videos, descriptions or motion trajectories. NVIDIA says training used billions of multimodal examples. The company will offer three variants: Cosmos 3 Super for top quality, a Nano build for fast inference and an Edge model planned for real‑time embedded use. The models are licensed under OpenMDW‑1.1, with releases on Hugging Face and GitHub.

Alpamayo 2 Super expands NVIDIA’s open driving family to 32 billion parameters, up from roughly 10 billion in the earlier Alpamayo 1 Nano and 1.5 Nano releases. The model ingests camera images and outputs concrete trajectories along with higher‑level meta‑actions such as “lane change,” “stop” or “yield,” and it generates a textual “chain of causation” for each decision. Perception now covers the entire vehicle rather than only front cameras, and NVIDIA describes the large model as a teacher intended to be distilled into smaller, vehicle‑grade stacks.

To support distillation and closed‑loop training, NVIDIA released AlpaGym, an open‑source reinforcement‑learning framework, and OmniDreams, a generative model for rare traffic scenarios. Code and weights for the driving models are expected on GitHub and Hugging Face this summer. NVIDIA emphasized that these releases are meant to train and teach smaller stacks that run on vehicle hardware such as Drive AGX Thor. The company did not offer direct comparisons with other autonomy providers.
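
NVIDIA has not described how the teacher model is distilled, so the following is only a generic illustration of the teacher-student idea, not the company's method. It is a minimal sketch of classic soft-target distillation, assuming (hypothetically) that both models emit one logit per meta-action. The function names, the temperature value and the example logits are all invented for illustration.

```python
import math

# Hypothetical label set, taken from the meta-actions named in the article.
META_ACTIONS = ["lane change", "stop", "yield"]


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the usual scaling that keeps gradient magnitudes
    comparable across temperatures in soft-target distillation.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Made-up logits: the large teacher strongly prefers "stop",
# while the small student is still undecided between "stop" and "yield".
teacher = [0.5, 3.0, 1.0]
student = [0.2, 1.5, 1.4]

loss = distillation_loss(teacher, student)
print(f"distillation loss: {loss:.4f}")
```

A training loop would minimize this loss so that the student's distribution over meta-actions moves toward the teacher's; the loss is zero only when the two softened distributions match.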

Complementing the software, the Isaac GR00T Reference Humanoid Robot is an open reference platform built on a Unitree chassis and presented as a developer‑oriented baseline. The platform is intended to give builders a shared starting point for humanoid research and prototyping while tying into NVIDIA’s broader ecosystem of models, simulation and cloud training.

Sources

The Decoder AI · 6/1/2026
