swistreich2025mit
X-Capture
A multisensory dataset containing RGB-D, acoustic, tactile, and 3D data collected from 600 real-world objects across nine in-the-wild environments, with six capture points per object.
Downloads: 60
Episodes: 3600
Why This Matters for Physical AI
This dataset supports multimodal representation learning and cross-sensory understanding of physical objects. Pairing visual, acoustic, tactile, and 3D observations of the same object lets robots learn grounded perception of their environment from more than one sensing modality.
Technical Profile
- Modalities
- rgb, depth, tactile, audio, point_cloud
- Environment
- lab, home, outdoor
- Task Types
- object_perception, cross_modal_retrieval
- Episodes
- 3600
- License
- mit
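The profile above can be written as a small typed record, which also makes the episode count easy to sanity-check. This is only a sketch: the `DatasetProfile` class, its field names, and the `expected_episodes` helper are hypothetical and not part of any published X-Capture loader. It also assumes that one episode corresponds to one capture point, which the card does not state outright but which matches its figures (600 objects, six capture points each, 3600 episodes).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical container for the card's technical profile."""
    name: str
    modalities: Tuple[str, ...]
    environments: Tuple[str, ...]
    task_types: Tuple[str, ...]
    episodes: int
    license: str


# Values copied from the card; the structure itself is an assumption.
X_CAPTURE = DatasetProfile(
    name="X-Capture",
    modalities=("rgb", "depth", "tactile", "audio", "point_cloud"),
    environments=("lab", "home", "outdoor"),
    task_types=("object_perception", "cross_modal_retrieval"),
    episodes=3600,
    license="mit",
)


def expected_episodes(num_objects: int, captures_per_object: int) -> int:
    """Episode count, assuming one episode per capture point."""
    return num_objects * captures_per_object


if __name__ == "__main__":
    # 600 objects with six capture points each, as described on the card.
    print(expected_episodes(600, 6) == X_CAPTURE.episodes)
```

A record like this is convenient for filtering a catalog by modality or task type before downloading anything.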