remyxai · 2025 · apache-2.0
Robo2VLM-Reasoning
A vision-language dataset derived from large-scale robot manipulation data, augmented with reasoning traces generated by gemini-2.5-pro to support visual question answering tasks.
Downloads: 64
Episodes: 5150
Likes: 4
Why This Matters for Physical AI
This dataset bridges vision-language understanding and robot manipulation by creating reasoning-augmented VQA tasks from in-the-wild robot data, enabling models to develop interpretable reasoning about robotic actions and scene understanding.
Technical Profile
- Modalities: rgb, language
- Task Types: visual-question-answering
- Episodes: 5150
- Data Format: HuggingFace
- Annotation Types: language_instructions, reasoning
- License: apache-2.0
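Since the data format is listed as HuggingFace, the dataset can presumably be pulled with the `datasets` library. The sketch below is a minimal example under stated assumptions: the repository id `remyxai/Robo2VLM-Reasoning` is inferred from the publisher and dataset name on this card and is not confirmed here, and the helper names `load_robo2vlm`, `describe_record`, and `preview_first_record` are hypothetical. The card does not document the record schema, so the code inspects field names and types rather than assuming any.

```python
from typing import Any, Dict

# Assumed Hub repository id, inferred from the publisher and dataset name on this card.
DATASET_ID = "remyxai/Robo2VLM-Reasoning"


def load_robo2vlm():
    """Download every available split from the Hugging Face Hub (needs network access)."""
    # Imported lazily so describe_record() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID)


def describe_record(record: Dict[str, Any]) -> Dict[str, str]:
    """Map each field name in a record to the name of its Python type."""
    return {key: type(value).__name__ for key, value in record.items()}


def preview_first_record() -> Dict[str, str]:
    """Load the dataset and describe the first record of the first split."""
    splits = load_robo2vlm()
    # Split names are not stated on the card, so take whichever comes first.
    first_split = next(iter(splits.values()))
    return describe_record(first_split[0])
```

Calling `preview_first_record()` prints nothing by itself; it returns a dictionary of field names and type names, which is a quick way to discover the actual schema before writing any VQA-specific processing.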
Access