robotics-diffusion-transformer · 2024 · MIT
RDT-1B Fine-tuning Dataset
A multimodal dataset for fine-tuning on bimanual robotic manipulation tasks. It provides joint positions, multi-view RGB images, and language annotations expanded with GPT-4-Turbo.
Downloads: 3K
Likes: 23
Why This Matters for Physical AI
This dataset enables training of diffusion-based foundation models for bimanual manipulation by combining multi-view visual observations with diverse language annotations, advancing the development of generalizable robot learning systems.
Technical Profile
- Modalities: rgb, proprioception, language
- Robot Embodiments: bimanual manipulator
- Action Space: joint_positions
- Environment: lab
- Task Types: manipulation
- Data Format: HDF5
- Annotation Types: language_instructions
- License: MIT
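As a rough illustration of how the profile above (HDF5 files holding joint positions, multi-view RGB, and a language instruction) might be consumed, here is a minimal sketch using `h5py` and NumPy. The internal file layout is not documented on this page, so every key name (`observations/qpos`, `observations/images`, the `instruction` attribute), the camera names, and the array shapes are hypothetical placeholders. The sketch writes a tiny synthetic episode first so that the reader can be run without the real data.

```python
import os
import tempfile

import h5py
import numpy as np

# ASSUMPTION: these keys are illustrative placeholders, not the dataset's real schema.
QPOS_KEY = "observations/qpos"        # joint positions, shape (steps, joints)
IMAGE_GROUP = "observations/images"   # one dataset per camera view
INSTRUCTION_ATTR = "instruction"      # language annotation stored as a file attribute


def write_dummy_episode(path, steps=4, joints=14, cameras=("cam_a", "cam_b")):
    """Write a tiny synthetic episode; sizes and camera names are arbitrary."""
    with h5py.File(path, "w") as f:
        f.create_dataset(QPOS_KEY, data=np.zeros((steps, joints), dtype=np.float32))
        for cam in cameras:
            frames = np.zeros((steps, 8, 8, 3), dtype=np.uint8)
            f.create_dataset(f"{IMAGE_GROUP}/{cam}", data=frames)
        f.attrs[INSTRUCTION_ATTR] = "pick up the cup"


def read_episode(path):
    """Return (joint positions, {camera name: frames}, instruction text)."""
    with h5py.File(path, "r") as f:
        qpos = f[QPOS_KEY][:]
        images = {name: ds[:] for name, ds in f[IMAGE_GROUP].items()}
        instruction = f.attrs[INSTRUCTION_ATTR]
    # Some h5py versions return string attributes as bytes.
    if isinstance(instruction, bytes):
        instruction = instruction.decode("utf-8")
    return qpos, images, instruction


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "episode.hdf5")
    write_dummy_episode(demo_path)
    qpos, images, instruction = read_episode(demo_path)
    print(qpos.shape, sorted(images), instruction)
```

Before adapting this to the real files, it is worth listing their actual contents (for example with `h5py.File(path).visit(print)`) and replacing the placeholder keys accordingly.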
Community Signals
Top 25% by downloads
Academic Citations: 5
- Diffusion models for robotic manipulation: a survey · 2025 · Frontiers in Robotics and AI
- Beyond performance: Explaining generalisation failures of Robotic Foundation Models in industrial simulation · 2025 · Biomimetic Intelligence and Robotics
- Empowering natural human–robot collaboration through multimodal language models and spatial intelligence: Pathways and perspectives · 2025 · Robotics and Computer-Integrated Manufacturing