X-Humanoid · 2025 · MIT
WoW-1 Benchmark Samples
Official evaluation dataset for the World-Omniscient World Model project. It contains 612 natural language prompts representing real-world robot interaction tasks and is designed to assess the physical consistency and causal reasoning of generative world models for robotics and embodied AI.
Downloads: 260
Episodes: 612
Likes: 1
Why This Matters for Physical AI
This benchmark provides a language-grounded evaluation of whether generative world models can reason about physical causality, spatial relationships, and state transitions. Embodied AI systems need these capabilities to understand and predict real-world robot interactions.
Technical Profile
- Modalities: language
- Action Space: language
- Task Types: manipulation, object_manipulation, action_generation
- Episodes: 612
- Data Format: JSON / Parquet
- Annotation Types: language_instructions
- License: MIT
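Since the profile lists JSON as a data format and 612 language-only episodes, a quick sanity check after download could look like the sketch below. The field names `id` and `instruction`, the sample records, and both helper functions are assumptions made for illustration; the dataset's actual schema is not documented here and should be checked before use.

```python
import json

# Hypothetical records for illustration only. The field names "id" and
# "instruction" are assumptions, not the dataset's documented schema.
SAMPLE_JSON = """
[
  {"id": 0, "instruction": "Pick up the red block and place it on the blue block."},
  {"id": 1, "instruction": "Push the cup to the edge of the table."},
  {"id": 2, "instruction": ""}
]
"""


def load_prompts(raw_json, text_field="instruction"):
    """Parse a JSON array of records and return the non-empty prompt strings."""
    records = json.loads(raw_json)
    prompts = []
    for record in records:
        # Missing or blank fields are skipped rather than treated as prompts.
        text = str(record.get(text_field, "")).strip()
        if text:
            prompts.append(text)
    return prompts


def check_episode_count(prompts, expected):
    """Return True when the number of usable prompts matches the expected count."""
    return len(prompts) == expected


prompts = load_prompts(SAMPLE_JSON)
print(len(prompts))
# The card states 612 episodes, so the three-record sample does not match.
print(check_episode_count(prompts, expected=612))
```

With the real file, pass its contents to `load_prompts` and set `text_field` to whatever the prompt column is actually called.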
Community Signals
Top 50% by downloads
HuggingFace Discussions: 1
Access
Need custom language data?
Claru builds purpose-built datasets for any environment, with dense human annotations and quality assurance.