Auryal | 2026 | apache-2.0
LIBERO Microwave GRM Rollouts
Dense-reward-annotated rollout dataset from GRPO training of OpenVLA-OFT on LIBERO-10 Task 9, capturing policy behavior across a 20-epoch training run including reward hacking and policy collapse.
Downloads: 19K
Episodes: ~2,100
Why This Matters for Physical AI
This dataset supports the study of dense reward model failure modes and reward hacking in embodied RL. It provides a documented case study of a policy collapsing during training while reward model scores remained positive.
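One way to look for the failure mode described above is to compare reward model scores against task success per training epoch. The following is a minimal sketch, not the dataset's own tooling: the record fields `epoch`, `reward`, and `success`, the thresholds, and the toy records are all assumptions made for illustration, and the real column names in the LeRobot files may differ.

```python
from collections import defaultdict
from statistics import mean


def flag_reward_hacking(episodes, min_reward=0.0, max_success=0.1):
    """Return epochs whose mean reward is positive while success is near zero.

    `episodes` is an iterable of dicts with hypothetical keys
    "epoch" (int), "reward" (float), and "success" (bool).
    """
    by_epoch = defaultdict(list)
    for ep in episodes:
        by_epoch[ep["epoch"]].append(ep)

    flagged = []
    for epoch in sorted(by_epoch):
        rows = by_epoch[epoch]
        avg_reward = mean(r["reward"] for r in rows)
        success_rate = mean(1.0 if r["success"] else 0.0 for r in rows)
        # High reward with almost no successes suggests the policy is
        # exploiting the reward model rather than solving the task.
        if avg_reward > min_reward and success_rate <= max_success:
            flagged.append(epoch)
    return flagged


# Toy records for illustration only; these values are not from the dataset.
toy = [
    {"epoch": 0, "reward": 0.4, "success": True},
    {"epoch": 0, "reward": 0.2, "success": False},
    {"epoch": 1, "reward": 0.6, "success": False},
    {"epoch": 1, "reward": 0.7, "success": False},
]
print(flag_reward_hacking(toy))
```

In the toy data, epoch 1 is flagged because its mean reward is positive while no episode succeeds. The thresholds are arbitrary starting points and would need tuning against the actual reward scale.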
Technical Profile
- Modalities: rgb, proprioception
- Robot Embodiments: Franka Panda
- Action Space: joint_positions
- Environment: simulation
- Task Types: manipulation, pick_and_place
- Episodes: ~2,100
- Data Format: LeRobot
- Annotation Types: reward_labels, language_instructions
- License: apache-2.0
Community Signals
Top 5% by downloads