NVIDIA has released a large, open-source dataset to support the development of physical AI systems, including robotics and autonomous vehicles (AVs). Announced at the NVIDIA GTC conference in San Jose, the Physical AI Dataset is available on Hugging Face and provides 15 terabytes of data, featuring more than 320,000 robotics training trajectories and up to 1,000 Universal Scene Description (OpenUSD) assets. NVIDIA plans to expand the dataset for end-to-end AV development, adding 20-second traffic-scenario clips from more than 1,000 U.S. cities and several European countries.
The dataset is intended to accelerate pretraining and post-training for AI models used in applications like warehouse robotics, humanoid surgical assistants, and AVs navigating complex traffic conditions. NVIDIA said the dataset will also feed into its existing platforms, including Cosmos, DRIVE AV, Isaac, and Metropolis. Research institutions such as the Berkeley DeepDrive Center, Carnegie Mellon Safe AI Lab, and UC San Diego’s Contextual Robotics Institute are early adopters.
NVIDIA highlighted that physical AI model development typically requires extensive, diverse data to train robust systems, and that collecting and curating such data can be cost-prohibitive, especially for smaller organizations. The dataset's scale is intended to support safety research, and tools like NVIDIA NeMo Curator can speed the processing of large video datasets. Developers can also use NVIDIA's Isaac GR00T workflow to generate synthetic robot manipulation data.
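Because the dataset is hosted on Hugging Face, it can be fetched with the standard huggingface_hub Python client. The following is a minimal sketch, assuming the huggingface_hub package is installed; the repo id and file patterns are hypothetical placeholders, and the actual repository names should be checked on NVIDIA's Hugging Face organization page:

```python
# Minimal sketch of pulling a slice of the Physical AI Dataset from Hugging Face.
# NOTE: "nvidia/PhysicalAI-Robotics" is a hypothetical repo id used for
# illustration; look up the real dataset repos under the nvidia organization.
from huggingface_hub import list_repo_files, snapshot_download

REPO_ID = "nvidia/PhysicalAI-Robotics"  # hypothetical placeholder

# List the repository contents first -- at 15 TB total, you will almost
# certainly want a targeted subset rather than the whole corpus.
files = list_repo_files(REPO_ID, repo_type="dataset")
print(f"{len(files)} files in {REPO_ID}")

# Download only files matching the given patterns into a local directory.
local_path = snapshot_download(
    repo_id=REPO_ID,
    repo_type="dataset",
    allow_patterns=["*.usd", "*.json"],  # adjust to the repo's actual layout
    local_dir="physical_ai_subset",
)
print("Downloaded subset to", local_path)
```

Using allow_patterns keeps the transfer to a manageable slice of the 15 TB corpus rather than mirroring everything.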
• NVIDIA launched a 15TB open-source dataset for physical AI development.
• Dataset includes 320,000+ robotics training trajectories and up to 1,000 OpenUSD assets.
• Planned AV expansion will add traffic-scenario clips from 1,000+ U.S. cities and several European countries.
• Supports NVIDIA Cosmos, DRIVE AV, Isaac, and Metropolis platforms.
• Early adopters: UC Berkeley, Carnegie Mellon, UC San Diego labs.
• Enables faster AI model training for safety-critical applications.
• NVIDIA NeMo Curator can process 20 million hours of video in two weeks on Blackwell GPUs.
• Dataset is available now on Hugging Face.
“We can do a lot of things with this dataset, such as training predictive AI models that help autonomous vehicles better track the movements of vulnerable road users like pedestrians to improve safety,” said Henrik Christensen, director of robotics and AV labs at UC San Diego.