hosam12kalad · 2025 · cc-by-nc-4.0
OmniAction
A large-scale multimodal dataset for proactive robot manipulation with 141,162 episodes covering contextual instruction following through spoken dialogue, environmental sounds, and visual cues.
Downloads: 11K
Episodes: 141,162
Why This Matters for Physical AI
This dataset enables training of multimodal robot systems that can infer user intentions from contextual cues without explicit instructions, advancing real-world human-robot collaboration capabilities.
Technical Profile
- Modalities: rgb, audio, language
- Environment: simulation, lab
- Task Types: manipulation
- Episodes: 141,162
- Data Format: RLDS
- Annotation Types: language_instructions, action_labels
- License: cc-by-nc-4.0
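As a rough illustration of what the RLDS data format implies for consumers of the dataset, the sketch below models an episode as an ordered list of steps carrying the standard RLDS step flags (is_first, is_last, is_terminal) alongside an observation and an action. It uses only the Python standard library and placeholder values. The observation keys ("rgb", "audio", "language_instruction"), the 7-element action vector, and the helper names are assumptions made for illustration; the actual feature names and shapes in OmniAction are not stated on this page and should be checked against the dataset itself.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Tuple


@dataclass
class Step:
    """One timestep, mirroring the standard RLDS step fields."""
    observation: Dict[str, Any]
    action: Any
    reward: float = 0.0
    discount: float = 1.0
    is_first: bool = False
    is_last: bool = False
    is_terminal: bool = False


@dataclass
class Episode:
    """An episode is an ordered sequence of steps."""
    steps: List[Step] = field(default_factory=list)


def make_dummy_episode(num_steps: int = 3) -> Episode:
    """Build a placeholder episode. All observation keys are hypothetical."""
    steps = []
    for t in range(num_steps):
        steps.append(Step(
            observation={
                "rgb": [[0, 0, 0]],            # stand-in for an image frame
                "audio": [0.0, 0.0],           # stand-in for an audio chunk
                "language_instruction": "",    # stand-in for dialogue text
            },
            action=[0.0] * 7,                  # assumed action size, not from the card
            is_first=(t == 0),
            is_last=(t == num_steps - 1),
            is_terminal=(t == num_steps - 1),
        ))
    return Episode(steps=steps)


def validate_episode(episode: Episode) -> bool:
    """Check that exactly the first step is flagged is_first and
    exactly the last step is flagged is_last."""
    if not episode.steps:
        return False
    last_index = len(episode.steps) - 1
    for i, step in enumerate(episode.steps):
        if step.is_first != (i == 0):
            return False
        if step.is_last != (i == last_index):
            return False
    return True


def iter_transitions(episode: Episode) -> Iterator[Tuple[Dict[str, Any], Any]]:
    """Yield (observation, action) pairs in step order."""
    for step in episode.steps:
        yield step.observation, step.action


if __name__ == "__main__":
    ep = make_dummy_episode(4)
    print("valid:", validate_episode(ep))
    for obs, act in iter_transitions(ep):
        print(sorted(obs.keys()), len(act))
```

In practice an RLDS dataset is usually read through a dataset loader instead of hand-built classes; the point here is only the episode-of-steps layout and the boundary flags that a training pipeline would rely on.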
Community Signals
Top 10% by downloads
Academic Citations: 20