oier-mees · 2025 · MIT

FuSe

FuSe contains 26,866 trajectories collected on a WidowX robot with visual, tactile, sound, and action data across multiple environments, annotated with natural language instructions.

Downloads: 2K
Episodes: 26,866
Likes: 4

Why This Matters for Physical AI

FuSe demonstrates the value of heterogeneous sensor modalities (vision, touch, and sound), grounded in natural language, for training generalizable robot policies, and so advances multimodal learning for physical AI.

Technical Profile

Modalities: rgb, tactile, audio, proprioception
Robot Embodiments: WidowX
Environment: lab
Task Types: manipulation
Episodes: 26,866
Annotation Types: language_instructions
License: MIT
Part of the FuSe family
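
For readers who want to filter or compare datasets programmatically, the profile above can be captured as a small typed record. This is a minimal sketch, not an official schema: the `DatasetProfile` class, its field names, and the `has_modalities` helper are hypothetical names chosen for illustration, and only the values are taken from the profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical record mirroring the Technical Profile fields."""
    name: str
    modalities: tuple[str, ...]
    embodiments: tuple[str, ...]
    environment: str
    task_types: tuple[str, ...]
    episodes: int
    annotation_types: tuple[str, ...]
    license: str

    def has_modalities(self, *required: str) -> bool:
        """Return True if every required modality is listed for this dataset."""
        return set(required).issubset(self.modalities)


# Values copied from the profile; the structure itself is an assumption.
FUSE = DatasetProfile(
    name="FuSe",
    modalities=("rgb", "tactile", "audio", "proprioception"),
    embodiments=("WidowX",),
    environment="lab",
    task_types=("manipulation",),
    episodes=26866,
    annotation_types=("language_instructions",),
    license="mit",
)
```

A call such as `FUSE.has_modalities("rgb", "tactile", "audio")` then checks whether the dataset covers the sensor mix a project needs before anything is downloaded.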

Community Signals

Top 25% by downloads
HuggingFace Discussions: 1

Related Datasets