VLA-Arena · 2025 · apache-2.0

VLA-Arena Dataset (L1 - Large Variant)

An open-source benchmark for systematic evaluation of Vision-Language-Action (VLA) models. This variant covers 55 tasks at difficulty level 1, with 2,750 human demonstrations spanning the safety, distractor, extrapolation, and long-horizon domains.

Downloads: 126
Episodes: 2,750
Likes: 1

Why This Matters for Physical AI

VLA-Arena provides a systematic benchmark for evaluating vision-language-action models across hierarchical difficulty levels and multiple evaluation dimensions (safety, generalization, long-horizon reasoning), enabling rigorous assessment of embodied AI agents for real-world deployment.

Technical Profile

Modalities: rgb, proprioception, language
Action Space: end_effector_delta
Environment: simulation
Task Types: manipulation, grasping, pick_and_place
Episodes: 2,750
Data Format: HDF5
Annotation Types: language_instructions, action_labels
License: apache-2.0
Part of the VLA-Arena family
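
The action space is listed as end_effector_delta. By common convention this means each action is a relative displacement of the end effector rather than an absolute target pose. The sketch below illustrates that idea only: it assumes position-only deltas (dx, dy, dz), and the function name and sample values are hypothetical. The actual action layout in the HDF5 files (dimensions, rotation and gripper terms, scaling) is not documented here and may differ.

```python
from typing import Iterable, List, Tuple

Vec3 = Tuple[float, float, float]


def integrate_deltas(start: Vec3, deltas: Iterable[Vec3]) -> List[Vec3]:
    """Accumulate per-step position deltas into absolute positions.

    Returns the start position followed by the position reached after
    each delta, so the result has len(deltas) + 1 entries.
    """
    x, y, z = start
    positions = [start]
    for dx, dy, dz in deltas:
        # Each action is applied relative to the current position.
        x, y, z = x + dx, y + dy, z + dz
        positions.append((x, y, z))
    return positions


if __name__ == "__main__":
    # Hypothetical delta actions, not taken from the dataset.
    sample_deltas = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, -1.0)]
    for step, position in enumerate(integrate_deltas((0.0, 0.0, 0.0), sample_deltas)):
        print(step, position)
```

Storing deltas rather than absolute poses keeps actions independent of where the episode started, which is one reason the representation is common for manipulation demonstrations.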
