remyxai · 2025 · apache-2.0

Robo2VLM-Reasoning

A vision-language dataset derived from large-scale robot manipulation data, augmented with reasoning traces generated by gemini-2.5-pro to support visual question answering tasks.

Downloads: 64
Episodes: 5150
Likes: 4

Why This Matters for Physical AI

This dataset connects vision-language understanding and robot manipulation by turning in-the-wild robot data into reasoning-augmented VQA tasks, so that models can learn interpretable reasoning about robotic actions and scenes.

Technical Profile

Modalities: rgb, language
Task Types: visual-question-answering
Episodes: 5150
Data Format: HuggingFace
Annotation Types: language_instructions, reasoning
License: apache-2.0
Part of the Robo2VLM family

Access
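
Since the data format is listed as HuggingFace, the dataset can most likely be loaded with the Hugging Face `datasets` library. The snippet below is a minimal sketch, not an official recipe. The repository ID `remyxai/Robo2VLM-Reasoning` is inferred from the author and dataset name on this page, and the `train` split name is an assumption, so check both on the Hub before relying on them. No column names are assumed: the `describe` helper (a hypothetical name used for illustration) reports whatever fields a record contains.

```python
# Assumed repository ID, inferred from the author and dataset name; verify on the Hub.
REPO_ID = "remyxai/Robo2VLM-Reasoning"


def load_robo2vlm_reasoning(split="train", streaming=True):
    """Load one split of the dataset.

    The split name "train" is an assumption. Streaming avoids downloading
    the full dataset before the first record can be inspected.
    """
    # Imported lazily so the module can be read without the dependency;
    # install it with `pip install datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split, streaming=streaming)


def describe(example):
    """Map each field name of a record to the name of its Python type."""
    return {key: type(value).__name__ for key, value in example.items()}


# Example usage (requires network access):
#   ds = load_robo2vlm_reasoning()
#   first = next(iter(ds))
#   print(describe(first))
```

Inspecting the field types of the first record is a quick way to confirm which columns hold the image, the question, and the reasoning trace before writing any training code.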
