OpenMOSS-Team, 2025, cc-by-nc-4.0

OmniAction

A large-scale multimodal dataset for proactive robot manipulation comprising 141,162 episodes with contextual instructions derived from spoken dialogue, environmental sounds, and visual cues rather than explicit commands.

Downloads: 3K
Episodes: 141,162
Likes: 69

Why This Matters for Physical AI

OmniAction supports training robots to recognize human intent proactively from multimodal contextual cues, moving beyond explicit instruction following toward more natural human-robot collaboration in real-world scenarios.

Technical Profile

Modalities: rgb, audio, language
Environment: simulation
Task Types: manipulation
Episodes: 141,162
Data Format: RLDS
Annotation Types: language_instructions, action_labels
License: cc-by-nc-4.0
Part of the OmniAction family
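
Because the data format is listed as RLDS, each episode can be thought of as a sequence of steps carrying the standard RLDS step fields (observation, action, reward, discount, is_first, is_last, is_terminal). The following is a minimal sketch in plain Python of what one such episode might look like and how its structure could be checked. The observation keys (rgb, audio, language_instruction), the 7-dimensional action, and the helper names are assumptions made for illustration; they are not taken from the dataset's actual schema.

```python
from typing import Any, Dict, List

# Standard RLDS step-level fields.
RLDS_STEP_KEYS = {
    "observation", "action", "reward", "discount",
    "is_first", "is_last", "is_terminal",
}


def make_toy_episode(num_steps: int = 3) -> Dict[str, Any]:
    """Build a toy RLDS-style episode (hypothetical schema, placeholder data)."""
    steps = []
    for t in range(num_steps):
        last = t == num_steps - 1
        steps.append({
            "observation": {
                "rgb": [[0, 0, 0]],                 # placeholder pixel values
                "audio": [0.0, 0.0],                # placeholder waveform samples
                "language_instruction": "example",  # hypothetical key and text
            },
            "action": [0.0] * 7,                    # hypothetical 7-dim action
            "reward": 1.0 if last else 0.0,
            "discount": 1.0,
            "is_first": t == 0,
            "is_last": last,
            "is_terminal": last,
        })
    return {"steps": steps}


def validate_episode(episode: Dict[str, Any]) -> bool:
    """Check that every step has the RLDS fields and the boundary flags are sane."""
    steps = episode.get("steps", [])
    if not steps:
        return False
    if any(not RLDS_STEP_KEYS <= set(step) for step in steps):
        return False
    first_flags = [step["is_first"] for step in steps]
    last_flags = [step["is_last"] for step in steps]
    # Exactly one first step (at index 0) and one last step (at the end).
    return (first_flags[0] and not any(first_flags[1:])
            and last_flags[-1] and not any(last_flags[:-1]))


def modalities_present(episode: Dict[str, Any]) -> List[str]:
    """Return the sorted observation keys of the first step."""
    return sorted(episode["steps"][0]["observation"].keys())


if __name__ == "__main__":
    ep = make_toy_episode(4)
    print(validate_episode(ep), modalities_present(ep))
```

A check of this kind can help catch truncated or malformed episodes before training, but the real field names should be confirmed against the dataset's own documentation.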

