X-Humanoid | 2025 | MIT

WoW-1 Benchmark Samples

Official evaluation dataset for the World-Omniscient World Model (WoW) project. It contains 612 natural-language prompts representing real-world robot interaction tasks and is designed to assess the physical consistency and causal reasoning of generative world models for robotics and embodied AI.

Downloads: 260
Episodes: 612
Likes: 1

Why This Matters for Physical AI

This benchmark provides language-grounded prompts for testing whether generative world models can reason about physical causality, spatial relationships, and state transitions. Embodied AI systems need these capabilities to understand and predict real-world robot interactions.

Technical Profile

Modalities: language
Action Space: language
Task Types: manipulation, object_manipulation, action_generation
Episodes: 612
Data Format: JSON / Parquet
Annotation Types: language_instructions
License: MIT
Part of the WoW (World-Omniscient World Model) family
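
Since the samples are distributed as JSON / Parquet records of language instructions, a quick sanity check after download is to read the prompts and confirm the episode count. The following is a minimal sketch under stated assumptions, not the project's own loader: it assumes a local JSON file holding a top-level array of records, and the field name `prompt` is hypothetical because this card does not state the actual schema.

```python
import json
from pathlib import Path

# Assumption: each record is a JSON object with a "prompt" string field.
# The real column name is not stated in this card and may differ.
PROMPT_FIELD = "prompt"
EXPECTED_EPISODES = 612  # episode count stated in this card


def load_prompts(path, field=PROMPT_FIELD):
    """Read a JSON array of records and return the stripped prompt strings."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array of records")
    prompts = []
    for i, record in enumerate(records):
        text = record.get(field) if isinstance(record, dict) else None
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"record {i} has no usable '{field}' field")
        prompts.append(text.strip())
    return prompts


def check_episode_count(prompts, expected=EXPECTED_EPISODES):
    """Return True when the number of prompts matches the expected count."""
    return len(prompts) == expected
```

Calling `check_episode_count(load_prompts("samples.json"))` (the file name is a placeholder) would flag a truncated or partial download. For the Parquet variant, the same check applies once the rows are read into a list of dictionaries with a Parquet-capable library.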

Community Signals

Top 50% by downloads
HuggingFace Discussions: 1
