Bupt-Joy · 2025 · MIT
VitaSet: Vision-Tactile VQA Dataset
A vision-tactile visual question answering (VQA) dataset for physical property reasoning that combines RGB vision with tactile sensing. It contains 5,145 human-verified QA pairs across three tasks: hardness classification, material property description, and surface roughness classification.
Downloads: 295
Episodes: 5,145
Likes: 2
Why This Matters for Physical AI
VitaSet lets vision-language models learn physical property reasoning from combined visual and tactile sensing, advancing the multimodal understanding of material properties that robotic manipulation depends on.
Technical Profile
- Modalities: rgb, tactile
- Robot Embodiments: Franka Emika Panda
- Environment: lab
- Task Types: visual-question-answering, material-property-reasoning, hardness-classification, roughness-classification
- Episodes: 5,145
- Data Format: JSON
- Annotation Types: language_instructions, visual-question-answering-pairs
- License: MIT
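Since the annotations are distributed as JSON, a first sanity check is usually to load the QA pairs and count them per task. The sketch below illustrates that with a small inline sample. The field names (`image`, `tactile`, `task`, `question`, `answer`) and the sample records are hypothetical, chosen only for illustration; VitaSet's actual schema may differ and should be checked against the dataset files.

```python
import json
from collections import Counter

# Hypothetical records: the field names and values below are assumptions
# made for illustration, not VitaSet's documented schema.
SAMPLE_JSON = """
[
  {"image": "rgb/0001.png", "tactile": "tactile/0001.png",
   "task": "hardness-classification",
   "question": "How hard is this object?", "answer": "soft"},
  {"image": "rgb/0002.png", "tactile": "tactile/0002.png",
   "task": "roughness-classification",
   "question": "How rough is this surface?", "answer": "smooth"},
  {"image": "rgb/0003.png", "tactile": "tactile/0003.png",
   "task": "hardness-classification",
   "question": "How hard is this object?", "answer": "hard"}
]
"""


def count_by_task(records):
    """Count QA pairs per task label (assumes each record has a 'task' key)."""
    return Counter(record["task"] for record in records)


records = json.loads(SAMPLE_JSON)
task_counts = count_by_task(records)

# Print one line per task, sorted by task name.
for task, n in sorted(task_counts.items()):
    print(f"{task}: {n}")
```

For the real dataset, `SAMPLE_JSON` would be replaced by the contents of the annotation file, and the total across tasks should come to 5,145 if every QA pair carries a task label.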
Community Signals
Top 50% by downloads
HuggingFace Discussions: 1
Access