hhyhrhy · 2025 · MIT

OWMM-Agent-data

A multi-modal agentic dataset for open world mobile manipulation tasks, synthesized to enhance vision-language model performance for mobile manipulators with global scene understanding and robot state tracking.

Downloads: 203
Likes: 2

Why This Matters for Physical AI

This dataset enables training of foundation models for mobile manipulators that can generalize to open-ended instructions and environments, advancing real-world robotic capability through multi-modal scene understanding and unified action generation.

Technical Profile

Modalities: rgb, language
Robot Embodiments: mobile_manipulator
Action Space: function_calling
Environment: simulation, real world
Task Types: mobile manipulation, navigation, manipulation, grasping
Annotation Types: language_instructions, action_labels
License: MIT
Part of the OWMM-Agent family

Community Signals

Top 50% by downloads
HuggingFace Discussions: 1


Related Datasets