Xperience-10M
A large-scale egocentric multimodal dataset of human experience containing 10 million interactions and 10,000 hours of synchronized first-person recordings with six video streams, audio, stereo depth, camera pose, hand mocap, full-body mocap, IMU, and hierarchical language annotations for embodied AI, robotics, and world modeling research.
Downloads: 2.3M
Episodes: 10,000,000
Hours: 10,000
Likes: 161
Why This Matters for Physical AI
Xperience-10M is, to our knowledge, the largest structured multimodal egocentric dataset with synchronized 3D/4D annotations, providing the scale of human-experience data needed to train embodied AI systems that understand motion, geometry, and interaction.
Technical Profile
- Modalities: rgb, audio, depth, proprioception, language, imu, point_cloud
- Environment: lab
- Task Types: egocentric action recognition, task prediction, action captioning, human-object interaction, depth estimation, hand pose estimation, body motion estimation, imitation learning
- Episodes: 10,000,000
- Total Hours: 10,000
- Data Format: HDF5
- Annotation Types: language_instructions, action_labels, segmentation, camera_pose, mocap, hierarchical_captions
- License: other
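Since episodes ship as HDF5, per-episode streams can be read lazily with `h5py`. The group and dataset names below (`rgb/cam0`, `depth/stereo`, `imu/accel`, `pose/camera`, the `caption` attribute) are illustrative assumptions, not the documented Xperience-10M schema; a minimal round-trip sketch on a toy episode:

```python
import h5py
import numpy as np

# Write a toy episode. All group/dataset names here are hypothetical
# placeholders for whatever schema the real Xperience-10M files use.
with h5py.File("episode_toy.h5", "w") as f:
    # T x H x W x C frames for one of the six video streams
    f.create_dataset("rgb/cam0", data=np.zeros((4, 8, 8, 3), dtype=np.uint8))
    # Stereo depth maps, T x H x W
    f.create_dataset("depth/stereo", data=np.zeros((4, 8, 8), dtype=np.float32))
    # IMU runs at a higher rate than video, so more samples per episode
    f.create_dataset("imu/accel", data=np.zeros((40, 3), dtype=np.float32))
    # Camera pose as translation (xyz) + quaternion (wxyz)
    f.create_dataset("pose/camera", data=np.zeros((4, 7), dtype=np.float32))
    # Top level of a hierarchical language annotation
    f.attrs["caption"] = "person picks up a mug"

# Read back only what is needed; h5py loads datasets lazily until sliced.
with h5py.File("episode_toy.h5", "r") as f:
    rgb = f["rgb/cam0"][:]
    caption = f.attrs["caption"]
```

Keeping each modality in its own group lets a dataloader pull, say, RGB and IMU without touching depth or point clouds, which matters at 10M-episode scale.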
Community Signals
Top 1% by downloads