Nvidia robotics AI models and open physical AI

What Nvidia’s new physical AI models are and why they matter

Nvidia’s new physical AI foundation models, Alpamayo 2 Super and Cosmos 3, are open multimodal systems that combine perception, language, world simulation and action planning. The goal is to let robots and autonomous vehicles understand, reason about and act in complex physical environments more safely and efficiently. Alpamayo 2 Super is a 32‑billion‑parameter Vision Language Action (VLA) model designed for Level 4 robotaxis, focused on long‑tail scenario reasoning, 3D spatial understanding and trajectory prediction across the full driving stack. Cosmos 3 is an “open world” model that fuses vision reasoning, world generation and action prediction, and can understand and generate text, images, video, ambient audio and actions with strong physical consistency. Together, the two models point to a shift from closed, task‑specific stacks toward shared, open‑source robotics infrastructure that smaller developers can reuse instead of rebuilding from scratch.

Alpamayo 2 Super: an open teacher model for Level 4 robotaxis

Alpamayo 2 Super sits at the core of Nvidia’s autonomous driving effort. The 32‑billion‑parameter VLA model builds on earlier 10‑billion‑parameter systems and is tailored for Level 4 robotaxis, with reasoning, planning and execution integrated in one foundation model. It expands situational awareness from front‑facing cameras to full 360‑degree perception for safer lane changes, merges and intersections. Alongside trajectory prediction and chain‑of‑causation tracking, it introduces “meta action” outputs such as yielding, stopping and lane changes. The model also generates automated 2D grounding labels, cutting data‑labeling cycles from months to days. According to Nvidia, Alpamayo 2 Super will be released this summer, with inference code on GitHub and downloadable model weights on Hugging Face. That positions it as an open teacher model that can later be distilled into smaller networks deployable on DRIVE AGX Thor hardware.
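
The teacher-to-student distillation described above can be illustrated with a minimal Python sketch. The article does not say how Nvidia performs distillation, so this assumes the common soft-target approach, in which a student is penalized by the KL divergence between temperature-softened teacher and student distributions. The three meta actions come from the text; the logits, the temperature and all function names are hypothetical and are not part of any Nvidia release.

```python
import math

# Meta actions named in the article; the ordering is an arbitrary choice.
META_ACTIONS = ["yield", "stop", "lane_change"]


def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is the generic soft-target loss, assumed here for illustration only.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical scores over META_ACTIONS for one driving scene.
teacher = [0.5, 3.0, -1.0]        # large teacher strongly prefers "stop"
close_student = [0.4, 2.8, -0.9]  # small student that nearly agrees
far_student = [3.0, 0.5, 0.2]     # small student that prefers "yield"

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
```

In this sketch the student that agrees with the teacher receives the smaller loss, so minimizing the loss pulls a compact network toward the large model's action preferences.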

Cosmos 3: a world foundation model for physical AI

Where Alpamayo 2 Super targets driving, Cosmos 3 generalizes the idea of physical AI foundation models to a wide range of embodied systems. Built on a mixture‑of‑transformers architecture, Cosmos 3 combines vision reasoning, world generation and action prediction within a single “omni‑model” that understands and generates text, images, video, ambient audio and action sequences with high physical accuracy. Nvidia says Cosmos 3 can reduce physical AI training and evaluation cycles from months to days by letting developers run large numbers of simulated interactions before ever touching a real robot or vehicle. It is explicitly positioned for robots, autonomous vehicles and vision AI systems that must perceive, reason, plan and act in the physical world. To grow this ecosystem, Nvidia has launched the Nvidia Cosmos Coalition, bringing together world‑model developers including Agile Robots, Black Forest Labs, Dyna Robotics, Generalist, LTX, Runway and Skild AI.

Lowering barriers for autonomous robotics developers

For smaller robotics and autonomous vehicle teams, the key shift is that these physical AI models and tools are open rather than locked inside a single OEM stack. Nvidia is releasing Alpamayo 2 Super’s inference code, model weights and chain‑of‑causation auto‑labeling pipeline as open resources, and AlpaGym, its high‑throughput closed‑loop reinforcement learning framework, is also open source. These pieces sit on top of a shared simulation layer that includes OmniDreams for realistic long‑tail scenario generation and Omniverse NuRec for turning real‑world driving footage into simulation‑ready scenes. Instead of building their own world models, labeling pipelines and simulators, companies can plug into Nvidia’s open‑source toolkit, focus on application‑specific tuning and still obtain safety‑critical features like explainable reasoning. This lowers the cost of entry to high‑end autonomy development, especially for startups that lack the data and infrastructure budgets of major automakers.

From single vehicles to a broader physical AI ecosystem

Nvidia is also tying these physical AI foundation models into a wider ecosystem that spans simulation, training and deployment. Alpamayo 2 Super is designed as a teacher model whose knowledge can be distilled into compact networks running on the DRIVE Hyperion platform, aligning AI research outputs with real robotaxi fleets. AlpaGym closes the loop between models and environments, so long‑tail failures discovered in AlpaSim simulations feed directly back into training. OmniDreams and NuRec handle synthetic data generation and neural reconstruction, turning raw sensor logs into scalable training worlds. On top of that, Nvidia is expanding its DRIVE Hyperion robotaxi ecosystem through partnerships with automakers, Tier 1 suppliers and mobility providers. In practice, models like Alpamayo 2 Super and Cosmos 3 can move from research labs into large‑scale physical deployments faster while remaining accessible to the broader developer community.
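
The feedback loop described above, in which failures found in simulation flow back into training, can be pictured with a toy Python sketch. It does not use AlpaGym or AlpaSim and does not implement reinforcement learning; the scenario names, the scalar "skill" and "difficulty" values and every function here are hypothetical stand-ins chosen only to show the shape of an evaluate, collect failures, retrain cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    difficulty: float  # hypothetical scalar; higher means harder


def rollout_succeeds(skill, scenario):
    """Stand-in for a closed-loop simulator rollout (not a real simulator call)."""
    return skill.get(scenario.name, 0.0) >= scenario.difficulty


def train_on(skill, failures, step=0.25):
    """Stand-in for a training update: improve only on the failed scenarios."""
    for scenario in failures:
        skill[scenario.name] = skill.get(scenario.name, 0.0) + step


def closed_loop(skill, scenarios, max_rounds=10):
    """Evaluate, collect failures, retrain on them, and repeat until none remain."""
    history = []
    for _ in range(max_rounds):
        failures = [s for s in scenarios if not rollout_succeeds(skill, s)]
        history.append(len(failures))
        if not failures:
            break
        train_on(skill, failures)
    return history


scenarios = [
    Scenario("highway_merge", 0.25),
    Scenario("unprotected_left_turn", 0.5),
    Scenario("debris_on_road", 0.75),
]
skill = {"highway_merge": 0.25}  # the policy starts out handling only the easy case
history = closed_loop(skill, scenarios)  # failures per round: [2, 2, 1, 0]
```

The failure count per round falls to zero because each round trains only on the scenarios that just failed, which is the property a closed-loop setup is meant to provide.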