Auryal | 2026 | apache-2.0

LIBERO Microwave GRM Rollouts

Dense-reward-annotated rollout dataset from GRPO training of OpenVLA-OFT on LIBERO-10 Task 9. It captures policy behavior across a 20-epoch training run, including reward hacking and policy collapse.

Downloads: 19K
Episodes: ~2,100

Why This Matters for Physical AI

This dataset supports the study of dense reward model failure modes and reward hacking in embodied RL. It documents a case of policy collapse that occurred during training even though reward model scores stayed positive.
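One way to look for this pattern in rollouts is to group episodes by training epoch and flag epochs where the mean reward model score is still positive while the task success rate is near zero. The sketch below is illustrative only: the field names `epoch`, `reward`, and `success`, the thresholds, and the toy records are all assumptions, not the dataset's actual schema or contents.

```python
from collections import defaultdict
from statistics import mean


def flag_reward_hacking(episodes, min_reward=0.0, max_success=0.1):
    """Return the sorted epochs whose mean reward exceeds min_reward
    while the success rate is at or below max_success.

    `episodes` is an iterable of dicts with hypothetical keys
    "epoch" (int), "reward" (float), and "success" (bool).
    """
    by_epoch = defaultdict(list)
    for ep in episodes:
        by_epoch[ep["epoch"]].append(ep)

    flagged = []
    for epoch in sorted(by_epoch):
        group = by_epoch[epoch]
        mean_reward = mean(e["reward"] for e in group)
        success_rate = mean(1.0 if e["success"] else 0.0 for e in group)
        # High reward with almost no task success suggests the policy
        # is exploiting the reward model rather than solving the task.
        if mean_reward > min_reward and success_rate <= max_success:
            flagged.append(epoch)
    return flagged


# Synthetic toy records for illustration; not taken from the dataset.
toy_episodes = [
    {"epoch": 0, "reward": 0.2, "success": True},
    {"epoch": 0, "reward": 0.3, "success": True},
    {"epoch": 1, "reward": 0.5, "success": True},
    {"epoch": 1, "reward": 0.6, "success": False},
    {"epoch": 2, "reward": 0.8, "success": False},
    {"epoch": 2, "reward": 0.9, "success": False},
]

flagged_epochs = flag_reward_hacking(toy_episodes)
```

If the real records lack an explicit success label, a proxy such as the environment's terminal signal would have to be substituted, and the thresholds would need tuning to the reward model's scale.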

Technical Profile

Modalities: rgb, proprioception
Robot Embodiments: Franka Panda
Action Space: joint_positions
Environment: simulation
Task Types: manipulation, pick_and_place
Episodes: ~2,100
Data Format: LeRobot
Annotation Types: reward_labels, language_instructions
License: apache-2.0

Part of the LIBERO family

Community Signals

Top 5% by downloads
