IPEC-COMMUNITY · 2025 · apache-2.0

EO-Bench

A comprehensive benchmark for evaluating embodied reasoning capabilities of vision-language models in robotics scenarios, featuring 600 samples across 12 distinct embodied reasoning categories in multiple-choice format.

Downloads: 84
Likes: 2

Why This Matters for Physical AI

EO-Bench provides a standardized benchmark for evaluating how well vision-language models can perform embodied reasoning tasks, which is critical for developing foundation models that can understand and control robots across diverse scenarios.

Technical Profile

Modalities
rgb, language
Task Types
visual-question-answering, trajectory reasoning, visual grounding, action reasoning, process verification, relation reasoning, object state recognition, task planning
Annotation Types
language_instructions, visual_grounding_annotations, action_labels
License
apache-2.0
Part of the EO-1 family
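
Because the benchmark is multiple-choice and grouped into categories, results are naturally reported as per-category accuracy. The following is a minimal sketch of that scoring step, not the official EO-Bench evaluation code. The field names `id`, `category`, and `answer`, and the toy samples, are assumptions made for illustration; the actual dataset schema may differ.

```python
from collections import defaultdict


def per_category_accuracy(samples, predictions):
    """Compute multiple-choice accuracy for each category.

    samples: list of dicts with hypothetical keys 'id', 'category', 'answer'
             (the real EO-Bench field names may differ).
    predictions: dict mapping sample id -> predicted option letter.
    A missing prediction counts as incorrect.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample in samples:
        category = sample["category"]
        total[category] += 1
        predicted = predictions.get(sample["id"], "")
        # Compare option letters case-insensitively, ignoring whitespace.
        if predicted.strip().upper() == sample["answer"].strip().upper():
            correct[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Toy data for illustration only; not taken from the benchmark.
samples = [
    {"id": 0, "category": "task planning", "answer": "A"},
    {"id": 1, "category": "task planning", "answer": "B"},
    {"id": 2, "category": "visual grounding", "answer": "D"},
]
predictions = {0: "a", 1: "C"}  # no prediction for sample 2

scores = per_category_accuracy(samples, predictions)
print(scores)  # {'task planning': 0.5, 'visual grounding': 0.0}
```

Counting a missing prediction as wrong keeps the denominator fixed at the number of samples per category, so scores stay comparable across models that decline to answer some questions.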
