Nvidia has released a new set of physical artificial intelligence (AI) research tools, agent workflows and open source models for building more advanced systems that operate in the real world. The tools are intended to accelerate development of autonomous vehicles, robots and vision AI systems by giving researchers more integrated ways to simulate, train and evaluate software before it is deployed.
Unveiled this week at the Computer Vision and Pattern Recognition conference in Denver, the updates build on Nvidia’s recently launched Cosmos 3 world foundation model and are designed to help researchers automate key stages of physical AI development, including simulation, synthetic data generation, policy training and evaluation. Physical AI refers to systems that interact with and operate in the physical world, including self-driving vehicles, industrial robots and embodied AI agents. Nvidia said the capabilities address a major industry challenge: creating scalable workflows to train and test AI virtually before real-world deployment. The company described the current process as fragmented across separate tools, which slows experimentation because researchers have to assemble workflows manually.
New agent skills are being integrated across Nvidia Omniverse, Isaac Sim, Isaac Lab and Cosmos, enabling developers to automate scene reconstruction, simulation setup, environment generation and reinforcement learning workflows. For autonomous vehicle development, Nvidia introduced tools aimed at the industry’s “long-tail problem”: difficult-to-capture driving scenarios that are critical for training and validation. Its AI agents can automate the reconstruction of real-world driving environments from fleet data and generate synthetic edge-case scenarios for testing. Nvidia also introduced Alpamayo 2 Super, a 32-billion-parameter vision-language-action model for autonomous driving. The system is designed with advanced reasoning capabilities, enabling it to act autonomously across the full driving stack.
In vision AI, Nvidia expanded its video analysis capabilities with updates to Metropolis, adding tools for video search, summarization and synthetic data generation. These capabilities are designed to help developers build AI agents that understand complex scenes, identify events and generate alerts from video streams. Robotics also received new agent skills intended to automate simulation and training workflows, reducing the manual work required to create virtual environments and train robots within them. Nvidia’s new physical AI suite is available through GitHub, while synthetic data generation tools including Neural Reconstruction, Video Augmentation and Defect Image Generation are available on Nvidia Brev with free trial credits for researchers.
