Real-World Data for Meta-World
Meta-World evaluates multi-task RL with perfect state information. Real-world data adds the visual complexity that actual robots face.
Meta-World at a Glance
Meta-World Task Categories
Meta-World's 50 tasks span basic manipulation primitives and multi-step interactions, each with parametric variation.
| Category | Example Tasks | Task Count | State Dimensions |
|---|---|---|---|
| Reaching & Pushing | Reach, push, push-wall | ~8 | End-effector + object + goal |
| Pick & Place | Pick-place, pick-out-of-hole | ~6 | End-effector + object + goal |
| Door & Drawer | Open door, close door, open drawer | ~6 | End-effector + handle + joint angle |
| Button & Switch | Button-press, button-press-wall | ~6 | End-effector + button state |
| Assembly & Insertion | Peg-insert, hammer, assembly | ~8 | End-effector + peg/tool + target |
| Tool Use | Hammer, sweep, coffee-push | ~8 | End-effector + tool + object |
| Other | Shelf-place, soccer, hand-insert | ~8 | Varies by task |
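Each task exposes its state as one fixed-length vector. The sketch below illustrates how such a vector is assembled, following the commonly documented v2 observation convention (hand position, gripper openness, two 7-DoF object poses, the previous frame's copy of those values, and a 3-D goal, for 39 dimensions total); the exact ordering here is an assumption, and `pack_observation` is a hypothetical helper, not part of the Meta-World API.

```python
import numpy as np

def pack_observation(hand_pos, gripper_open, obj1_pose, obj2_pose,
                     prev_frame, goal_pos):
    """Concatenate Meta-World-style state into one flat vector.

    Assumed v2-style layout: 18 current-frame values (hand xyz,
    gripper openness, two pos+quat object poses), the same 18
    values from the previous frame, and a 3-D goal = 39 dims.
    """
    current = np.concatenate([hand_pos, [gripper_open], obj1_pose, obj2_pose])
    return np.concatenate([current, prev_frame, goal_pos])

hand = np.zeros(3)
obj = np.zeros(7)  # xyz position + wxyz quaternion
obs = pack_observation(hand, 1.0, obj, obj,
                       prev_frame=np.zeros(18), goal_pos=np.zeros(3))
assert obs.shape == (39,)
```

Because every task reuses this same flat layout, a single policy network can be trained across all 50 tasks without per-task input heads.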
Meta-World vs. Related Multi-Task Benchmarks
| Feature | Meta-World | LIBERO | RLBench | robosuite |
|---|---|---|---|---|
| Primary evaluation | Multi-task / meta-RL | Continual learning | Multi-task imitation | Single-task benchmarking |
| Default observation | Low-dimensional state | RGB images | Multi-view RGB-D | Configurable |
| Task count | 50 | 130 | 100 | 8 |
| Meta-learning split | ML10/ML45 (train/test) | Sequential suites | None (all tasks) | None |
| Physics engine | MuJoCo | MuJoCo | CoppeliaSim | MuJoCo |
Benchmark Profile
Meta-World is a multi-task benchmark for meta-reinforcement learning and multi-task learning, providing 50 manipulation tasks with a simulated Sawyer robot arm. Created by researchers at Stanford, UC Berkeley, and collaborating institutions, it evaluates whether RL agents can learn shared representations across diverse manipulation skills.

The Sim-to-Real Gap
Meta-World uses MuJoCo with low-dimensional state observations; its default configuration has no visual component. Policies trained on perfect state information cannot transfer directly, because real robots must estimate state from noisy sensors. The simplified Sawyer model also ignores real arm dynamics, and parametric task variation creates artificial diversity that does not match the continuous variation of real manipulation.
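The gap between perfect and estimated state can be illustrated with a simple noise model. The sketch below (hypothetical helper, assuming only NumPy) perturbs a ground-truth object position the way a vision-based pose estimator might, including occasional detection dropout from occlusion; a policy trained on exact coordinates sees inputs like these as out-of-distribution.

```python
import numpy as np

def estimate_state(true_obj_pos, rng, noise_std=0.01, dropout_prob=0.05):
    """Simulate a vision-based estimate of an object position.

    Adds Gaussian noise (default ~1 cm std) to mimic pose-estimation
    error, and occasionally loses the detection entirely, as an
    occlusion would. Returns (estimate, valid_flag).
    """
    if rng.random() < dropout_prob:
        # Detection lost: return a zero estimate, flagged invalid.
        return np.zeros_like(true_obj_pos), False
    noise = rng.normal(0.0, noise_std, size=true_obj_pos.shape)
    return true_obj_pos + noise, True

rng = np.random.default_rng(0)
true_pos = np.array([0.1, 0.6, 0.02])   # object position in metres
est, ok = estimate_state(true_pos, rng)
error = np.linalg.norm(est - true_pos)  # typically on the order of 1 cm
```

Training with perturbations like these (domain randomization over the state estimate) is one common mitigation, but it still cannot supply the visual features a real perception stack must learn.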
Real-World Data Needed
Real-world demonstrations of Meta-World task categories on physical hardware with full visual observations. Multi-task data showing how humans and robots adapt skills across different objects and configurations. Transfer learning validation data comparing simulated policy performance to real-world execution.
Complementary Claru Datasets
Manipulation Trajectory Dataset
Real-world recordings of diverse manipulation tasks provide the visual observations and contact dynamics missing from Meta-World's state-based evaluation.
Egocentric Activity Dataset
Human demonstrations of daily manipulation tasks provide natural multi-task data showing how skills transfer and adapt across contexts.
Custom Multi-Task Collection
Purpose-collected demonstrations across Meta-World task categories on real hardware provide direct sim-to-real validation data.
Bridging the Gap: Technical Analysis
Meta-World serves an important role in multi-task RL research by providing a standardized set of manipulation tasks for evaluating task generalization. However, its reliance on low-dimensional state observations means that policies trained on Meta-World learn skill coordination but not visual perception.
The 50-task suite creates useful diversity for studying multi-task learning, but the tasks share the same robot, similar workspace geometry, and parametrically varied objects. Real multi-task manipulation involves dramatically different visual scenes, object categories, and workspace configurations. A policy that masters Meta-World's 50 tasks may still fail at the 51st if it requires different visual processing.
The meta-learning evaluation (ML45: 45 training tasks, 5 held-out test tasks) measures whether policies can adapt to new tasks from limited experience. Real-world few-shot adaptation requires visual generalization that state-based training does not provide. Bridging this gap requires real-world multi-task data with rich visual observations.
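The ML45 protocol reduces to a simple loop: for each held-out task, adapt the meta-trained policy on a small budget of task-specific data, then measure success over fresh rollouts. The harness below is a schematic sketch, not Meta-World's actual API; `adapt` and `rollout` are hypothetical stubs standing in for the learner's adaptation step and the simulator.

```python
import random

def adapt(meta_policy, demos):
    """Stub: return a task-specific policy after few-shot adaptation."""
    return {"base": meta_policy, "demos": list(demos)}

def rollout(policy, task, rng):
    """Stub: run one episode and report success (random placeholder)."""
    return rng.random() < 0.5

def evaluate_ml45(meta_policy, test_tasks, demos_per_task, n_eval=10, seed=0):
    """Few-shot evaluation: adapt on the given data for each held-out
    task, then report the success rate over n_eval fresh rollouts."""
    rng = random.Random(seed)
    results = {}
    for task, demos in zip(test_tasks, demos_per_task):
        policy = adapt(meta_policy, demos)  # K-shot adaptation step
        successes = sum(rollout(policy, task, rng) for _ in range(n_eval))
        results[task] = successes / n_eval
    return results

scores = evaluate_ml45("meta-policy", ["door-open", "hammer"],
                       [["demo-1"], ["demo-2"]])
```

The same harness shape applies to a real-robot few-shot evaluation; only `rollout` changes, which is exactly where visual perception enters and where state-trained policies break down.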
Claru's manipulation and egocentric datasets provide this real-world multi-task signal — diverse manipulation tasks across varied environments with full visual observations, showing how skills adapt across contexts.
Key Papers
- [1] Yu et al. “Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning.” CoRL 2019.
- [2] Kalashnikov et al. “QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation.” CoRL 2018.
- [3] Xu et al. “Multi-Task Reinforcement Learning with Soft Modularization.” NeurIPS 2020.
Frequently Asked Questions
Why does Meta-World use state observations instead of images?
Meta-World was designed to isolate the multi-task learning problem from visual perception. State observations let researchers study task transfer without confounding visual complexity. However, real robots must perceive state from cameras, making the transition to visual observations a critical research gap.
Can policies trained in Meta-World transfer to a real Sawyer?
Not directly. Meta-World policies receive perfect state (exact object positions) while real Sawyers must estimate state from camera images. The MuJoCo Sawyer model also ignores real joint dynamics. Fine-tuning with real-world visual data is needed for transfer.
Why is real-world data needed if Meta-World already has 50 tasks?
Real multi-task manipulation involves dramatically different visual scenes, object categories, and workspaces. Meta-World tasks share the same robot, similar geometry, and parametrically varied objects. Real-world multi-task data provides the visual and physical diversity needed for policies that generalize broadly.
What does the ML45 evaluation measure?
ML45 meta-trains an agent on 45 of Meta-World's 50 tasks, then tests few-shot adaptation to 5 held-out tasks. The agent receives only a small amount of experience on each new task and must achieve high success rates. This evaluates whether manipulation knowledge transfers — whether skills learned on training tasks accelerate learning of novel tasks.
Is Meta-World still relevant given newer benchmarks?
Meta-World remains the standard benchmark for multi-task RL and meta-learning research because its clean state-based evaluation isolates the task transfer problem from visual perception. Newer benchmarks add visual complexity but confound task transfer with visual generalization. Meta-World's simplicity makes it valuable for studying the fundamental structure of multi-task manipulation learning.
Get Real Multi-Task Manipulation Data
Discuss diverse, visually rich manipulation data for validating multi-task robot learning.