
Xperience-10M

A large-scale egocentric multimodal dataset of human experience containing 10 million interactions and 10,000 hours of synchronized first-person recordings with six video streams, audio, stereo depth, camera pose, hand mocap, full-body mocap, IMU, and hierarchical language annotations for embodied AI, robotics, and world modeling research.

Downloads: 2.3M
Episodes: 10,000,000
Hours: 10,000
Likes: 161

Why This Matters for Physical AI

Xperience-10M provides the largest structured multimodal egocentric dataset with synchronized 3D/4D annotations essential for training embodied AI systems that understand motion, geometry, and interaction from human experience at scale.

Technical Profile

Modalities
rgb, audio, depth, proprioception, language, imu, point_cloud
Environment
lab
Task Types
egocentric action recognition, task prediction, action captioning, human-object interaction, depth estimation, hand pose estimation, body motion estimation, imitation learning
Episodes
10,000,000
Total Hours
10,000
Data Format
HDF5
Annotation Types
language_instructions, action_labels, segmentation, camera_pose, mocap, hierarchical_captions
License
other
Part of the Xperience-10M family
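Since episodes are distributed as HDF5, a typical workflow is to open one file per episode and read its synchronized streams as arrays. The sketch below is a minimal example using `h5py`; the group names (`rgb`, `depth`, `imu`, `camera_pose`) and the `caption` attribute are hypothetical placeholders, not the dataset's documented schema, so adapt them to the actual file layout.

```python
# Hypothetical sketch of reading an Xperience-10M-style episode from HDF5.
# All dataset/attribute names here are assumptions for illustration only.
import os
import tempfile

import h5py
import numpy as np


def write_dummy_episode(path):
    """Create a tiny synthetic episode mimicking the assumed layout."""
    with h5py.File(path, "w") as f:
        # Video frames as (T, H, W, C); a real episode would hold six streams.
        f.create_dataset("rgb", data=np.zeros((4, 8, 8, 3), dtype=np.uint8))
        # Per-frame depth maps as (T, H, W).
        f.create_dataset("depth", data=np.zeros((4, 8, 8), dtype=np.float32))
        # IMU samples (accel + gyro) at a higher rate than video.
        f.create_dataset("imu", data=np.zeros((40, 6), dtype=np.float32))
        # One 4x4 camera-to-world pose per frame.
        f.create_dataset("camera_pose", data=np.zeros((4, 4, 4), dtype=np.float32))
        # Language annotation stored as a file attribute.
        f.attrs["caption"] = "hypothetical hierarchical caption"


def load_episode(path):
    """Load every dataset in the file into memory as numpy arrays."""
    out = {}
    with h5py.File(path, "r") as f:
        for key in f.keys():
            out[key] = f[key][()]  # read full dataset into memory
        out["caption"] = f.attrs["caption"]
    return out


path = os.path.join(tempfile.mkdtemp(), "episode_000.h5")
write_dummy_episode(path)
ep = load_episode(path)
```

For 10M episodes, reading whole files into memory will not scale; in practice you would slice datasets lazily (`f["rgb"][i]`) inside a dataloader rather than calling `[()]`.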
