Bupt-Joy | 2025 | MIT

VitaSet: Vision-Tactile VQA Dataset

A vision-tactile visual question answering (VQA) dataset for physical property reasoning that combines RGB vision with tactile sensing. It contains 5,145 human-verified QA pairs across three tasks: hardness classification, material property description, and surface roughness classification.

Downloads: 295
Episodes: 5145
Likes: 2

Why This Matters for Physical AI

VitaSet enables vision-language models to learn physical property reasoning by combining visual and tactile sensing modalities, advancing multimodal understanding of material properties essential for robotic manipulation and physical reasoning.

Technical Profile

Modalities
rgb, tactile
Robot Embodiments
Franka Emika Panda
Environment
lab
Task Types
visual-question-answering, material-property-reasoning, hardness-classification, roughness-classification
Episodes
5145
Data Format
JSON
Annotation Types
language_instructions, visual-question-answering-pairs
License
MIT
Part of the VitaSet family
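
Because the profile lists JSON as the data format and three classification and description tasks, a first step when working with the QA pairs is usually to load the records and check how they split across tasks. The sketch below illustrates that step on inline sample data. The field names (image, tactile, task, question, answer), the file names, and the sample records are all assumptions made for illustration; they are not the dataset's documented schema, so adjust them after inspecting the actual files.

```python
import json
from collections import Counter

# Hypothetical sample records. The field names and values are assumptions
# for illustration only, not the dataset's documented schema.
SAMPLE_JSON = """
[
  {"image": "obj_001_rgb.png", "tactile": "obj_001_touch.png",
   "task": "hardness-classification",
   "question": "Is this object hard or soft?", "answer": "hard"},
  {"image": "obj_002_rgb.png", "tactile": "obj_002_touch.png",
   "task": "material-property-reasoning",
   "question": "Describe the material of this object.", "answer": "smooth rigid plastic"},
  {"image": "obj_003_rgb.png", "tactile": "obj_003_touch.png",
   "task": "roughness-classification",
   "question": "Is the surface rough or smooth?", "answer": "rough"}
]
"""

# Fields this sketch assumes every QA record carries.
REQUIRED_FIELDS = {"image", "tactile", "task", "question", "answer"}


def load_qa_pairs(text):
    """Parse a JSON array of QA records and check the assumed fields exist."""
    records = json.loads(text)
    for i, rec in enumerate(records):
        missing = REQUIRED_FIELDS - rec.keys()
        if missing:
            raise ValueError(f"record {i} is missing fields: {sorted(missing)}")
    return records


def count_by_task(records):
    """Count QA pairs per task label."""
    return Counter(rec["task"] for rec in records)


qa_pairs = load_qa_pairs(SAMPLE_JSON)
task_counts = count_by_task(qa_pairs)
for task, n in sorted(task_counts.items()):
    print(f"{task}: {n}")
```

To run the same check on real data, replace SAMPLE_JSON with the contents of a downloaded file and update REQUIRED_FIELDS to match the keys found there.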

Community Signals

Top 50% by downloads
HuggingFace Discussions: 1
