IPEC-COMMUNITY · 2025 · apache-2.0
EO-Bench
A comprehensive benchmark for evaluating embodied reasoning capabilities of vision-language models in robotics scenarios, featuring 600 samples across 12 distinct embodied reasoning categories in multiple-choice format.
Downloads: 84
Likes: 2
Why This Matters for Physical AI
EO-Bench provides a standardized benchmark for evaluating how well vision-language models can perform embodied reasoning tasks, which is critical for developing foundation models that can understand and control robots across diverse scenarios.
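Because the benchmark is multiple-choice and split into categories, a natural way to report results is accuracy per category. The following is a minimal sketch under stated assumptions, not the benchmark's official scoring script: the record keys `id`, `category`, and `answer`, the letter-based answer format, and the toy records are all hypothetical and may not match the actual EO-Bench schema.

```python
from collections import defaultdict


def per_category_accuracy(samples, predictions):
    """Return {category: accuracy} for multiple-choice predictions.

    samples: list of dicts with hypothetical keys "id", "category", "answer".
    predictions: dict mapping sample id -> predicted choice letter.
    A missing prediction counts as incorrect.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample in samples:
        category = sample["category"]
        total[category] += 1
        predicted = predictions.get(sample["id"], "")
        # Compare choice letters case-insensitively, ignoring stray whitespace.
        if predicted.strip().upper() == sample["answer"].strip().upper():
            correct[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Toy records for illustration only; these are not real benchmark samples.
samples = [
    {"id": 0, "category": "task planning", "answer": "A"},
    {"id": 1, "category": "task planning", "answer": "C"},
    {"id": 2, "category": "visual grounding", "answer": "B"},
]
predictions = {0: "a", 1: "B", 2: "B"}
scores = per_category_accuracy(samples, predictions)
```

Reporting one score per category, rather than a single overall number, makes it easier to see which kinds of embodied reasoning a model handles poorly.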
Technical Profile
- Modalities
- rgb, language
- Task Types
- visual-question-answering, trajectory reasoning, visual grounding, action reasoning, process verification, relation reasoning, object state recognition, task planning
- Annotation Types
- language_instructions, visual_grounding_annotations, action_labels
- License
- apache-2.0