OmniAction

Name: OmniAction
Creator: OpenMOSS-Team
License: cc-by-nc-4.0
Keywords: rgb, audio, language, manipulation, pick_and_place, simulation, lab

A large-scale multimodal dataset for proactive robot manipulation comprising 141,162 episodes across 112 skills and 748 objects, enriched with audio, visual, and contextual instruction data for cross-modal intention recognition.

Downloads: 30K
Likes: 283

Technical Profile

Modalities: rgb, audio, language
Environment: simulation, lab
Task Types: manipulation, pick_and_place
Data Format: RLDS
License: cc-by-nc-4.0

Part of the OmniAction family

Community Signals

Top 5% by downloads

HuggingFace Discussions: 5

Related Datasets

OmniAction

A large-scale multimodal dataset for proactive robot manipulation comprising 141,162 episodes with cross-modal contextual instructions derived from spoken dialogue, environmental sounds, and visual cues rather than explicit commands.

rgb, audio, language

95K downloads | Mar 2026 | cc-by-nc-4.0

OmniAction

A large-scale multimodal dataset for proactive robot manipulation with 141,162 episodes covering contextual instruction following through spoken dialogue, environmental sounds, and visual cues. The dataset includes 5,096 distinct speaker timbres, 2,482 non-verbal sound events, and 640 environmental backgrounds across six categories of contextual instructions.

rgb, audio, language

85K downloads | Apr 2026 | cc-by-nc-4.0

OmniAction

A large-scale multimodal dataset for proactive robot manipulation with 141,162 episodes covering contextual instruction following through spoken dialogue, environmental sounds, and visual cues.

rgb, audio, language

85K downloads | Apr 2026 | cc-by-nc-4.0

OmniAction

A large-scale multimodal dataset for proactive robot manipulation comprising 141,162 episodes with contextual instructions derived from spoken dialogue, environmental sounds, and visual cues rather than explicit commands.

rgb, audio, language

4K downloads | Mar 2026 | cc-by-nc-4.0

UR Ultrasound Robotics Dataset

A robotics dataset collected using LeRobot with a UR robot performing ultrasound-related tasks, featuring multi-view RGB imagery including base, wrist, and ultrasound camera feeds.

rgb, proprioception

Downloads: n/a | Jul 2026 | apache-2.0

CSI-Agent Towel Folding Dataset

A dataset of towel folding demonstrations collected using LeRobot with a bimanual SO-101 robot. Contains 50 episodes of towel folding tasks with multi-camera RGB observations and joint position actions.

rgb, proprioception

Downloads: n/a | Jul 2026 | apache-2.0