keplerccc | 2025 | apache-2.0
Robo2VLM-Reasoning
A visual question-answering dataset derived from large-scale, in-the-wild robot manipulation data. Each multiple-choice question is augmented with a reasoning trace, generated by Gemini-2.5-Pro, that supports the correct answer.
Downloads: 150
Episodes: 5150
Why This Matters for Physical AI
This dataset pairs real robot imagery with structured multiple-choice questions and reasoning traces, so vision-language models can be trained both to answer questions about robot manipulation tasks and to explain how they reached the answer.
Technical Profile
- Modalities: rgb, language
- Task Types: manipulation, visual-question-answering
- Episodes: 5150
- Annotation Types: language_instructions, reasoning
- License: apache-2.0
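To make the profile above concrete, here is a minimal sketch of how one multiple-choice record with a reasoning trace might be turned into a prompt/target pair for fine-tuning a vision-language model. The `VQAExample` class, its field names, and the `format_example` helper are all hypothetical and chosen only for illustration; the dataset's actual schema is not described here and may differ.

```python
from dataclasses import dataclass
from typing import Dict, List

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


@dataclass
class VQAExample:
    """Hypothetical record layout; the real field names may differ."""
    image_path: str      # path to the RGB frame the question refers to
    question: str        # natural-language question about the scene
    choices: List[str]   # multiple-choice options
    answer_index: int    # index of the correct option in `choices`
    reasoning: str       # reasoning trace supporting the correct answer


def format_example(ex: VQAExample) -> Dict[str, str]:
    """Build a prompt/target pair: the target gives reasoning, then the answer letter."""
    if not ex.choices or len(ex.choices) > len(LETTERS):
        raise ValueError("choices must contain between 1 and 26 options")
    if not 0 <= ex.answer_index < len(ex.choices):
        raise ValueError("answer_index is out of range for choices")

    # Label each option with a letter: "A. ...", "B. ...", and so on.
    options = "\n".join(
        f"{LETTERS[i]}. {choice}" for i, choice in enumerate(ex.choices)
    )
    prompt = (
        f"{ex.question}\n{options}\n"
        "Answer with the letter of the correct option."
    )
    target = f"{ex.reasoning}\nAnswer: {LETTERS[ex.answer_index]}"
    return {"image": ex.image_path, "prompt": prompt, "target": target}


# Made-up record, used only to show the shape of the output.
example = VQAExample(
    image_path="frames/episode_0001.png",
    question="Is the robot gripper open or closed?",
    choices=["open", "closed"],
    answer_index=1,
    reasoning="The gripper fingers are in contact with the object.",
)
pair = format_example(example)
```

Placing the reasoning trace before the final answer letter is one common design choice for supervising explanation together with the answer; other orderings are equally possible.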
Community Signals
Top 50% by downloads