Real-World Data for FurnitureBench

FurnitureBench provides standardized evaluation for robot learning. Real-world data validates whether simulation performance transfers to physical hardware.

FurnitureBench at a Glance

Environment: Sim
Tasks: Multi
Evaluation: Standard
Community: Active

Benchmark Profile

FurnitureBench is a real-world furniture assembly benchmark that provides standardized tasks, a standardized hardware setup, and evaluation protocols for contact-rich, long-horizon manipulation. Created by researchers at KAIST, it defines assembly tasks using real IKEA-style furniture parts with a Franka Panda robot.

Task Set

Three assembly tasks of increasing difficulty: one-leg table assembly (4 steps), round table assembly (8 steps), and cabinet assembly (12 steps). Each requires precise alignment, insertion, and fastening.

Observation Space

RGB images from 2 cameras (front and wrist), robot proprioception (joint positions, velocities, gripper width), force/torque at end-effector.
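As a rough illustration, the observation described above could be held in a container like the one below. This is a sketch under stated assumptions, not the benchmark's own data format: the class name, field names, units, and the byte-string image encoding are all hypothetical. The only fixed number is the joint count, since the Franka Panda arm has 7 joints.

```python
from dataclasses import dataclass
from typing import List

NUM_ARM_JOINTS = 7   # the Franka Panda arm has 7 joints
WRENCH_DIM = 6       # [Fx, Fy, Fz, Tx, Ty, Tz]


@dataclass
class Observation:
    """One timestep of observations (hypothetical container, not the official API)."""
    front_rgb: bytes                # encoded front-camera frame; resolution left unspecified
    wrist_rgb: bytes                # encoded wrist-camera frame
    joint_positions: List[float]    # assumed radians, one value per arm joint
    joint_velocities: List[float]   # assumed rad/s, one value per arm joint
    gripper_width: float            # assumed metres
    ee_wrench: List[float]          # force/torque measured at the end-effector

    def validate(self) -> None:
        """Raise ValueError if any vector has an unexpected length."""
        if len(self.joint_positions) != NUM_ARM_JOINTS:
            raise ValueError("expected 7 joint positions")
        if len(self.joint_velocities) != NUM_ARM_JOINTS:
            raise ValueError("expected 7 joint velocities")
        if len(self.ee_wrench) != WRENCH_DIM:
            raise ValueError("expected a 6-D force/torque reading")
```

Validating lengths at load time catches mismatched recordings early, before they reach a training pipeline.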

Action Space

7-DOF end-effector delta poses (position, orientation, and gripper), commanded at a 10 Hz control frequency.
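One way to picture this action is as a 7-element vector: three position deltas, three rotation deltas, and one gripper command. The sketch below assumes an axis-angle rotation delta and a gripper command in [-1, 1]; both are illustrative assumptions, and the benchmark's own implementation may represent orientation differently. The clipping limits are hypothetical safety bounds, not values from the benchmark.

```python
from dataclasses import dataclass
from typing import List, Tuple

CONTROL_HZ = 10                       # control frequency stated above
CONTROL_PERIOD_S = 1.0 / CONTROL_HZ   # 0.1 s between commands


@dataclass
class DeltaPoseAction:
    """Hypothetical delta-pose action; field names are illustrative."""
    d_pos: Tuple[float, float, float]   # end-effector translation delta
    d_rot: Tuple[float, float, float]   # rotation delta, assumed axis-angle
    gripper: float                      # assumed range [-1, 1]

    def to_vector(self) -> List[float]:
        """Flatten to the 7-element vector a policy would output."""
        return [*self.d_pos, *self.d_rot, self.gripper]


def _clip(value: float, limit: float) -> float:
    """Clamp value to the symmetric interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def clip_action(action: DeltaPoseAction, max_pos: float, max_rot: float) -> DeltaPoseAction:
    """Bound each component so one command cannot request a large jump."""
    return DeltaPoseAction(
        d_pos=tuple(_clip(v, max_pos) for v in action.d_pos),
        d_rot=tuple(_clip(v, max_rot) for v in action.d_rot),
        gripper=_clip(action.gripper, 1.0),
    )
```

Because the commands are deltas, per-step clipping bounds how far the end-effector can be asked to move within a single 0.1 s control period.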

Evaluation Protocol

Success rate for complete assembly. Per-step success rate for partial credit. Time to completion as secondary metric. Real robot evaluation required.
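The three metrics above can be computed from per-episode records. The following is a minimal sketch, assuming each episode logs how many assembly steps were completed in order and how long the run took; the `Episode` fields and the `summarize` helper are hypothetical names, not part of the benchmark's tooling.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Episode:
    steps_completed: int   # assembly steps finished, in order
    total_steps: int       # steps in the full task, e.g. 4 for the one-leg task
    duration_s: float      # wall-clock time of the run


def summarize(episodes: List[Episode]) -> Dict[str, object]:
    """Return full-assembly success rate, per-step success rates, and mean time."""
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    total = episodes[0].total_steps
    successes = [e for e in episodes if e.steps_completed == e.total_steps]
    # Fraction of episodes that reached at least step k, for k = 1..total.
    per_step = [
        sum(e.steps_completed >= k for e in episodes) / n
        for k in range(1, total + 1)
    ]
    # Time to completion is only defined for episodes that finished the assembly.
    mean_time = mean(e.duration_s for e in successes) if successes else None
    return {
        "success_rate": len(successes) / n,
        "per_step_success": per_step,
        "mean_time_to_completion_s": mean_time,
    }
```

Reporting per-step rates alongside the full-assembly rate shows where in the sequence a policy tends to fail, which the single success number hides.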

The Sim-to-Real Gap

FurnitureBench is a real-world benchmark by design, but teams that pre-train in simulation face gaps in insertion dynamics (real dowel-hole friction vs simulated), part alignment tolerances (real parts have manufacturing variance), and force feedback during tight-fit assembly.

Real-World Data Needed

Additional real-world assembly demonstrations beyond the provided dataset. Diverse demonstrator skill levels. Assembly demonstrations with force/torque data for contact-rich insertion steps.

Complementary Claru Datasets

Manipulation Trajectory Dataset

Real assembly manipulation recordings complement FurnitureBench demonstrations with diverse contact-rich interactions.

Force-Torque Manipulation Dataset

Force data during insertion and fastening captures the contact dynamics critical for furniture assembly policies.

Custom Assembly Collection

Additional demonstrations on FurnitureBench hardware expand the training distribution for the benchmark's specific tasks.

Bridging the Gap: Technical Analysis

FurnitureBench stands out among manipulation benchmarks because it is explicitly designed as a real-world benchmark. While most benchmarks exist in simulation with optional real-world transfer, FurnitureBench defines the hardware setup, furniture parts, and evaluation protocol for reproducible real-world experiments.

The assembly tasks test long-horizon planning and contact-rich manipulation simultaneously. The 12-step cabinet assembly requires maintaining a plan over minutes of execution while handling the precise alignment and insertion that furniture assembly demands. A single failed insertion step invalidates the entire assembly, making robustness to contact uncertainty critical.

The benchmark provides a small set of human teleoperation demonstrations, but teams often need more data, especially for the harder tasks. The provided demonstrations come from expert teleoperators; data from demonstrators of varying skill levels would enable research on learning from suboptimal demonstrations.

Force/torque data during assembly is particularly valuable for the insertion steps. Real dowel-into-hole insertion requires force modulation: enough force to seat the dowel but not so much that the table frame is damaged or pushed out of alignment. The force profile of a successful insertion differs from a failed one in ways that vision alone cannot capture.
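The force window described above can be expressed as a simple check on the peak force recorded during an insertion. This is an illustrative sketch only: the function name, the labels, and both thresholds are hypothetical, and real thresholds would have to be measured on the actual parts.

```python
from typing import Sequence


def classify_insertion(
    axial_forces_n: Sequence[float],
    seat_force_n: float,
    max_safe_force_n: float,
) -> str:
    """Label an insertion attempt from its peak force along the insertion axis.

    seat_force_n and max_safe_force_n are assumed, user-supplied thresholds.
    """
    if not axial_forces_n:
        raise ValueError("no force samples recorded")
    if seat_force_n > max_safe_force_n:
        raise ValueError("seat force must not exceed the safe limit")
    peak = max(axial_forces_n)
    if peak > max_safe_force_n:
        return "excessive"   # risk of damaging or shifting the frame
    if peak < seat_force_n:
        return "unseated"    # never pushed hard enough to seat the dowel
    return "seated"
```

A peak-force rule is deliberately crude. A learned policy would more likely consume the full force/torque time series, but the same window idea explains why the signal carries information that camera images do not.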

FurnitureBench's real-world nature makes it an ideal validation target for sim-to-real transfer research. Teams that pre-train in simulation can evaluate directly on the benchmark's standardized hardware, providing rigorous comparison of transfer methods.

Key Papers

[1] Heo et al. “FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation.” RSS 2023.
[2] Zhao et al. “Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware.” RSS 2023.

Frequently Asked Questions

Claru coordinates data collection on specific robot platforms and in specific environments, which enables direct comparison between simulated and real performance on FurnitureBench tasks.

Get Real-World Data for FurnitureBench

Discuss purpose-collected data to validate and improve your FurnitureBench-trained policies on physical hardware.
