MicroAGI-Labs · apache-2.0
VLM Info Loss - Grounding Results
Grounding evaluation results for vision-language models on robotics manipulation datasets. The evaluation tests how VLM connectors transform visual representations and whether spatial information is preserved well enough to support bounding-box output.
Episodes50 episodes per dataset (8 datasets total)
Why This Matters for Physical AI
This dataset evaluates how vision-language models ground spatial understanding in robotics tasks, revealing information bottlenecks in VLM connectors that degrade end-to-end manipulation performance.
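Grounding quality of this kind is typically scored by comparing predicted bounding boxes against ground-truth annotations with an intersection-over-union (IoU) threshold. The dataset card does not specify the exact metric, so the sketch below is an illustrative, generic IoU-based accuracy computation, not the authors' evaluation code; box format `(x1, y1, x2, y2)` is an assumption.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_accuracy(predictions, ground_truths, threshold=0.5):
    """Fraction of predicted boxes whose IoU with the matching
    ground-truth box meets the threshold (a common grounding metric)."""
    hits = sum(iou(p, g) >= threshold
               for p, g in zip(predictions, ground_truths))
    return hits / len(ground_truths)
```

A perfect prediction scores IoU 1.0; partially overlapping boxes score proportionally less, so the threshold controls how strict "correct grounding" is.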
Technical Profile
- Modalities
- rgb, language
- Robot Embodiments
- Franka Panda, UR5, JACO
- Environment
- lab
- Task Types
- manipulation, object-detection, visual-question-answering
- Episodes
- 50 episodes per dataset (8 datasets total)
- Annotation Types
- bounding_boxes, language_instructions, reward_labels
- License
- apache-2.0
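Putting the profile together, each episode pairs an RGB observation with a language instruction, a bounding-box annotation, and a reward label. The record layout below is purely hypothetical (the card does not document field names or file formats); it only illustrates what one annotated episode from this profile might look like and how to sanity-check it.

```python
# Hypothetical episode record for illustration only; actual field names
# and file formats in the released data may differ.
episode = {
    "dataset": "franka_panda_manipulation",   # one of the 8 source datasets
    "rgb_path": "episodes/0001/frame_000.png",  # RGB observation
    "instruction": "pick up the red block",     # language instruction
    "bounding_box": [112, 64, 180, 130],        # (x1, y1, x2, y2), pixels
    "reward": 1.0,                              # reward label
}

REQUIRED_KEYS = {"dataset", "rgb_path", "instruction", "bounding_box", "reward"}


def validate_episode(record):
    """Check that an episode record has the annotation types listed in
    the Technical Profile and a well-ordered bounding box."""
    if not REQUIRED_KEYS.issubset(record):
        return False
    x1, y1, x2, y2 = record["bounding_box"]
    return x1 < x2 and y1 < y2
```

A validator like this is a cheap first pass before running any grounding evaluation over the 50 episodes in each of the 8 datasets.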