keivalyamit/Nemotron-VLA MetaWorld Expert Demonstrations
Expert demonstration dataset for training Vision-Language-Action (VLA) models on 50 MetaWorld robot manipulation tasks, containing 187,252 transitions with RGB images, proprioceptive state, actions, and natural language instructions.
Downloads: 406
Episodes: 2,500
Likes: 1
Why This Matters for Physical AI
This dataset provides high-quality expert demonstrations for training vision-language-action models that can understand and execute diverse manipulation tasks from natural language instructions, bridging perception and control for embodied AI systems.
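To make the training setup concrete, the sketch below shows how one transition record might be paired into the (observation, instruction) → action supervision a VLA policy learns from. The field names and vector lengths are assumptions for illustration only; the card does not specify the parquet schema.

```python
# Hypothetical transition record; field names and shapes are assumed,
# not taken from the dataset card.
transition = {
    "image": "rgb_frame.png",          # RGB observation (path here; array in practice)
    "proprio": [0.0] * 39,             # proprioceptive state vector (length assumed)
    "action": [0.0, 0.0, 0.0, 1.0],    # end-effector delta + gripper command
    "instruction": "open the drawer",  # natural language task instruction
    "task": "drawer-open",             # MetaWorld task name (naming assumed)
}

def episode_to_pairs(episode):
    """Pair each observation and its language instruction with the expert
    action, i.e. the (obs, lang) -> action supervision for a VLA policy."""
    return [((t["image"], t["instruction"]), t["action"]) for t in episode]

pairs = episode_to_pairs([transition])
print(len(pairs))  # → 1
```

Each of the 2,500 episodes would yield one such list of pairs, giving the 187,252 transitions in total.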
Technical Profile
- Modalities: rgb, proprioception, language
- Robot Embodiments: MetaWorld Sawyer
- Action Space: end_effector_delta
- Environment: simulation
- Task Types: manipulation, grasping, pick_and_place, pushing, pulling, insertion, door_opening, drawer_opening
- Episodes: 2,500
- Data Format: parquet
- Annotation Types: language_instructions
- License: mit
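The end_effector_delta action space means each action commands a small displacement of the end effector plus a gripper signal. MetaWorld's Sawyer uses a 4-D action (dx, dy, dz, gripper), each component in [-1, 1]; the position scale below and the gripper sign convention are assumptions for illustration.

```python
def apply_delta_action(ee_pos, action, scale=0.05):
    """Apply a normalized end_effector_delta action to a 3-D position.
    Actions are 4-D (dx, dy, dz, gripper) in [-1, 1]; the 0.05 m/step
    position scale is an assumed value for this sketch."""
    clipped = [max(-1.0, min(1.0, a)) for a in action]  # enforce [-1, 1] bounds
    new_pos = [p + scale * d for p, d in zip(ee_pos, clipped[:3])]
    gripper = clipped[3]  # convention assumed: > 0 close, < 0 open
    return new_pos, gripper

# Move +x at full speed, -z at half speed, close the gripper.
pos, grip = apply_delta_action([0.0, 0.6, 0.2], [1.0, 0.0, -0.5, 1.0])
print(pos, grip)
```

Clipping before scaling keeps out-of-range expert actions from producing oversized displacements when replayed in simulation.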
Community Signals
Top 50% by downloads
HuggingFace Discussions: 1