
Nemotron-VLA MetaWorld Expert Demonstrations

Expert demonstration dataset for training Vision-Language-Action (VLA) models on 50 MetaWorld robot manipulation tasks, containing 187,252 transitions with RGB images, proprioceptive state, actions, and natural language instructions.
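From the counts above, episodes average roughly 75 transitions each. A minimal sketch of what one transition record might look like; all field names and array shapes here are assumptions for illustration, not the dataset's actual schema:

```python
import numpy as np

# Dataset-level counts from the card.
TRANSITIONS = 187_252
EPISODES = 2_500
avg_steps = TRANSITIONS / EPISODES  # roughly 75 steps per episode

# Hypothetical transition record; names, image resolution, and state
# dimensionality are illustrative (MetaWorld's Sawyer uses a 39-D
# observation vector and a 4-D action: end-effector xyz delta + gripper).
transition = {
    "rgb": np.zeros((224, 224, 3), dtype=np.uint8),  # camera frame
    "proprio": np.zeros(39, dtype=np.float32),       # proprioceptive state
    "action": np.zeros(4, dtype=np.float32),         # end_effector_delta + gripper
    "instruction": "open the drawer",                # natural language task
}
```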

Downloads: 406
Episodes: 2,500
Likes: 1

Why This Matters for Physical AI

This dataset provides high-quality expert demonstrations for training vision-language-action models that can understand and execute diverse manipulation tasks from natural language instructions, bridging perception and control for embodied AI systems.

Technical Profile

Modalities: rgb, proprioception, language
Robot Embodiments: MetaWorld Sawyer
Action Space: end_effector_delta
Environment: simulation
Task Types: manipulation, grasping, pick_and_place, pushing, pulling, insertion, door_opening, drawer_opening
Episodes: 2,500
Data Format: parquet
Annotation Types: language_instructions
License: MIT
Part of the Nemotron-VLA family
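Since the data ships as parquet with one row per transition, a typical workflow is to load a shard and regroup flat transitions into per-episode trajectories. A sketch under assumed column names (`episode_id`, `instruction`, `action` are illustrative; inspect `df.columns` on the real files to confirm the schema):

```python
import pandas as pd

def load_shard(path: str) -> pd.DataFrame:
    """Read one parquet shard of the dataset (path is hypothetical)."""
    return pd.read_parquet(path)

# Tiny stand-in frame mimicking a plausible layout: one row per
# transition, with an episode id so trajectories can be regrouped.
demo = pd.DataFrame({
    "episode_id": [0, 0, 1],
    "instruction": ["push the block", "push the block", "open the door"],
    "action": [[0.1, 0.0, 0.0, 1.0]] * 3,
})

# Group flat transitions back into per-episode trajectories.
episodes = {eid: g.reset_index(drop=True)
            for eid, g in demo.groupby("episode_id")}
```

Grouping by an episode identifier is the usual way to recover variable-length trajectories from a flat transition table for behavior-cloning pipelines.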

Community Signals

Top 50% by downloads
HuggingFace Discussions: 1

