cagataydev · 2026 · cc-by-4.0
VLM Robotics Voice Commands (Text)
50,000 curated natural language commands for Vision-Language-Model robot control, covering 10 categories of embodied interaction including pick-and-place, manipulation, navigation, and human-robot interaction.
Downloads: 26
Episodes: 50000
Why This Matters for Physical AI
This dataset provides text-only training data for teaching Vision-Language Models to interpret varied natural language commands for robot control, supporting research in embodied AI and human-robot interaction.
Technical Profile
- Modalities: language
- Action Space: language
- Task Types: pick_and_place, manipulation, navigation, multistep, observation, household
- Episodes: 50000
- Annotation Types: language_instructions, task_category_labels, difficulty_labels
- License: cc-by-4.0
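As a rough illustration of how the three annotation types might be consumed, here is a minimal Python sketch. It assumes each record is a dictionary with the keys `instruction`, `category`, and `difficulty`; those key names, the difficulty values, and the sample records are hypothetical and are not taken from the dataset's actual schema, which should be checked before use.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """One text command with its labels (field names are assumptions)."""
    instruction: str  # the natural language command
    category: str     # task category label, e.g. "navigation"
    difficulty: str   # difficulty label; the real label set is unknown


def parse_record(raw: dict) -> Command:
    """Build a Command from a raw record, trimming stray whitespace."""
    return Command(
        instruction=raw["instruction"].strip(),
        category=raw["category"],
        difficulty=raw["difficulty"],
    )


def count_by_category(commands: list[Command]) -> Counter:
    """Count how many commands fall under each task category."""
    return Counter(c.category for c in commands)


def filter_by_category(commands: list[Command], category: str) -> list[Command]:
    """Keep only the commands labelled with the given category."""
    return [c for c in commands if c.category == category]


# Made-up records for illustration only; not real dataset rows.
SAMPLE_RECORDS = [
    {"instruction": " Pick up the red cup ", "category": "pick_and_place", "difficulty": "easy"},
    {"instruction": "Go to the kitchen", "category": "navigation", "difficulty": "easy"},
    {"instruction": "Move the book onto the shelf", "category": "pick_and_place", "difficulty": "hard"},
]

if __name__ == "__main__":
    commands = [parse_record(r) for r in SAMPLE_RECORDS]
    print(count_by_category(commands))
    print(filter_by_category(commands, "navigation"))
```

Grouping by category first is a quick way to check whether a subset is balanced before using it for training or evaluation.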
Access
Need custom language data?
Claru builds purpose-built datasets for any environment or application, with dense human annotations and quality assurance.