Real-World Data for FurnitureBench
FurnitureBench standardizes real-world furniture assembly evaluation with reproducible 3D-printed kits. Diverse real-world assembly demonstrations help train policies that handle the precision insertion and force control that simulation models poorly.
FurnitureBench at a Glance
Assembly Task Breakdown
Each FurnitureBench task requires ordered subtask completion. Failure at any step prevents subsequent steps.
| Task | Assembly Steps | Key Skill Tested | Difficulty |
|---|---|---|---|
| One-Leg Table | 4 (pick leg, pre-align, insert, verify) | Precision peg insertion | Medium |
| Round Table | 8 (multi-leg + tabletop attachment) | Sequential precision + spatial planning | Hard |
| Lamp Assembly | 11 (base, pole, shade, bulb, wiring) | Delicate component handling + multi-stage insertion | Very Hard |
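The ordered-completion rule can be sketched as a small scoring helper. This is an illustrative sketch, not FurnitureBench's evaluation code: the function names and the per-step boolean outcomes are assumptions made for the example.

```python
from typing import Sequence


def completed_steps(step_outcomes: Sequence[bool]) -> int:
    """Count steps completed in order, stopping at the first failure."""
    count = 0
    for succeeded in step_outcomes:
        if not succeeded:
            break  # a failed step blocks every later step
        count += 1
    return count


def is_full_assembly(step_outcomes: Sequence[bool]) -> bool:
    """An episode is a full assembly only if every step succeeded."""
    return len(step_outcomes) > 0 and completed_steps(step_outcomes) == len(step_outcomes)


# Hypothetical one-leg table episode: pick leg, pre-align, insert, verify.
episode = [True, True, False, True]
print(completed_steps(episode))   # 2
print(is_full_assembly(episode))  # False
```

Note that the final `True` in the example does not count: once insertion fails, the verify step cannot contribute to the score.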
FurnitureBench vs. Related Assembly Benchmarks
| Feature | FurnitureBench | IKEA Furniture (Lee) | Factory (Isaac) | RLBench Assembly |
|---|---|---|---|---|
| Real hardware eval | Yes (standardized kits) | Simulation only | Simulation only | Simulation only |
| Reproducible | 3D-printed STL files | N/A (sim only) | N/A (sim only) | N/A (sim only) |
| Force feedback | 6-axis F/T sensor | Simulated contact | Simulated contact | Simulated contact |
| Task complexity | 4-11 ordered steps | Multi-step assembly | Single insertion | 1-3 steps |
Benchmark Profile
FurnitureBench is a real-world furniture assembly benchmark created by Heo et al. at CMU and KAIST, presented at RSS 2023. It uses real IKEA-style 3D-printed furniture kits with standardized assembly sequences, providing both a physical evaluation protocol with reproducible hardware and a matched simulation environment in Isaac Gym for training. It is one of the few benchmarks with standardized real-world evaluation that any lab can replicate.
The Sim-to-Real Gap
FurnitureBench provides both simulation (Isaac Gym) and real evaluation, enabling direct sim-to-real comparison. The main gap is insertion physics: peg-hole alignment with sub-millimeter tolerances requires force-sensitive control, which simulation models imprecisely. Real 3D-printed parts have slight compliance, layer-line texture, and manufacturing variation, so printed copies of the same part are not identical. Visual appearance also differs between the simulation renderer and real parts under lab lighting.
Real-World Data Needed
Assembly demonstrations on real furniture kits with 6-axis force/torque feedback during insertion phases. Diverse assembly strategies showing different approach angles, grip choices, and recovery from misalignment. Data from assembly in varied lighting conditions and workspace configurations. Multi-modal demonstrations (teleoperation and human hand) to capture both robot-executable and human-expert strategies.
Complementary Claru Datasets
Manipulation Trajectory Dataset
Contact-rich manipulation data with force measurements provides pretraining for the precise insertion and alignment skills that FurnitureBench's peg-in-hole assembly demands.
Custom Assembly Task Collection
Purpose-collected assembly demonstrations with diverse furniture kits and assembly strategies provide direct training data for multi-step assembly sequencing and error recovery.
Egocentric Activity Dataset
Human furniture assembly video from 100+ environments provides visual pretraining for understanding multi-step assembly sequences with natural error recovery and strategy adaptation.
Bridging the Gap: Technical Analysis
FurnitureBench is distinctive because it provides standardized physical evaluation — real 3D-printed furniture kits that any lab can reproduce from published STL files. This eliminates the hardware variability that makes comparing real-world results across labs difficult. Any researcher with a Franka Panda and a 3D printer can reproduce the exact evaluation conditions.
The assembly tasks test manipulation skills that most benchmarks ignore: precise alignment under sub-millimeter tolerances, multi-step sequencing with strict ordering dependencies (you cannot attach the tabletop before the legs), and force-sensitive insertion where visual observation alone is insufficient. These skills are directly relevant to manufacturing robotics and industrial assembly.
The insertion phase presents the hardest sim-to-real challenge. Peg-in-hole insertion with tight tolerances requires compliant, force-sensitive control — the robot must feel when the peg contacts the hole edge and adjust its approach angle and force vector. Isaac Gym models contact with simplified penalty-based methods that miss the stick-slip friction, part compliance, and alignment sensitivity of real insertion. A policy that learns to 'jam and push' in simulation will damage parts on real hardware.
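The "feel and adjust" behaviour described above can be sketched as a single admittance-style control tick. This is a minimal sketch under stated assumptions, not the controller used in FurnitureBench: the function name, the gains, the force limit, and the sign convention (the measured force is the force the environment exerts on the peg) are all hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def compliant_insertion_step(
    measured_force: Vec3,            # newtons, force exerted on the peg
    insertion_axis: Vec3,            # direction the peg should travel
    advance_step: float = 0.0005,    # metres per tick (assumed)
    lateral_compliance: float = 0.0001,  # metres per newton (assumed)
    max_axial_force: float = 10.0,   # newtons (assumed)
) -> Vec3:
    """Return a small displacement command for one control tick."""
    norm = dot(insertion_axis, insertion_axis) ** 0.5
    axis = tuple(c / norm for c in insertion_axis)

    # Split the measured force into axial and lateral parts.
    axial_component = dot(measured_force, axis)
    lateral_force = tuple(f - axial_component * a for f, a in zip(measured_force, axis))

    # Resistance to insertion points against the insertion axis.
    axial_reaction = -axial_component
    # Stop advancing instead of pushing harder when resistance is too high.
    advance = advance_step if axial_reaction < max_axial_force else 0.0

    # Advance along the axis and yield sideways to the contact force.
    return tuple(advance * a + lateral_compliance * l for a, l in zip(axis, lateral_force))


# Inserting downward while the hole edge pushes the peg toward +x.
print(compliant_insertion_step((2.0, 0.0, 4.0), (0.0, 0.0, -1.0)))
```

The design choice here is that the command never increases push force in response to resistance; a "jam and push" policy does the opposite, which is what risks damaging real parts.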
The multi-step nature compounds the challenge. The lamp assembly requires 11 sequential steps, and failure at any step prevents completion. If each step succeeds independently 85% of the time, the probability of completing all 4 steps of the one-leg table drops to ~52% (0.85^4). For the 11-step lamp, even 95% per-step success yields only ~57% full assembly completion (0.95^11).
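The compounding arithmetic above can be checked in a few lines. The sketch assumes every step succeeds independently with the same probability, which is a simplification; the function name is illustrative.

```python
def full_task_success(per_step_success: float, num_steps: int) -> float:
    """Probability that all ordered steps succeed, assuming independence."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be a probability in [0, 1]")
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    return per_step_success ** num_steps


for p, n in [(0.85, 4), (0.95, 11)]:
    print(f"{p:.0%} per step over {n} steps -> {full_task_success(p, n):.0%}")
# 85% per step over 4 steps -> 52%
# 95% per step over 11 steps -> 57%
```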
Real-world assembly data with force measurements during insertion provides the training signal simulation cannot generate. Demonstrations showing how force profiles change during successful versus failed insertions, and how experienced assemblers adapt their approach angle based on tactile feedback, provide the compliant control strategies that are absent from simulation-only training.
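One way to use such force profiles is to label where an insertion jams: axial force keeps rising while the peg stops making progress. The sketch below is a hypothetical labeling heuristic, not part of any FurnitureBench tooling; the thresholds, window size, and sample trace are made-up values for illustration.

```python
from typing import Optional, Sequence


def detect_jam(
    axial_force: Sequence[float],   # newtons, one sample per tick
    depth: Sequence[float],         # metres of insertion depth per tick
    force_limit: float = 15.0,      # assumed jam force threshold
    min_progress: float = 0.0005,   # assumed minimum depth gain per window
    window: int = 5,                # assumed number of ticks to look back
) -> Optional[int]:
    """Return the first sample index where a jam is detected, else None."""
    if len(axial_force) != len(depth):
        raise ValueError("force and depth traces must have equal length")
    for i in range(window, len(depth)):
        progress = depth[i] - depth[i - window]
        # High force with almost no depth gain indicates a jam.
        if axial_force[i] > force_limit and progress < min_progress:
            return i
    return None


# Hypothetical failed insertion: depth stalls at 3 mm while force climbs.
depth = [0.000, 0.001, 0.002, 0.003, 0.003, 0.003, 0.003, 0.003, 0.003, 0.003]
force = [2.0, 3.0, 4.0, 8.0, 12.0, 16.0, 18.0, 20.0, 20.0, 20.0]
print(detect_jam(force, depth))  # 8
```

A successful trace, where depth keeps increasing at moderate force, returns `None`, so the same function can separate the two kinds of demonstration segments.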
Key Papers
- [1] Heo et al. “FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation.” RSS 2023.
- [2] Lee et al. “IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks.” ICRA 2021.
- [3] Chi et al. “Diffusion Policy: Visuomotor Policy Learning via Action Diffusion.” RSS 2023.
- [4] Zhao et al. “Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware.” RSS 2023.
Frequently Asked Questions
FurnitureBench provides standardized physical furniture kits (3D-printed from published STL files) that any lab can reproduce. This enables direct comparison of real-world results across institutions — something most benchmarks cannot offer because of hardware and object variability. It is one of very few benchmarks with reproducible real-world evaluation.
Assembly tests manipulation skills most benchmarks ignore: precision alignment under sub-millimeter tolerances, multi-step sequencing with strict ordering dependencies, and force-sensitive insertion. These skills are directly relevant to manufacturing robotics, and the long-horizon nature (up to 11 steps) exposes compounding errors that single-step benchmarks miss.
Insertion phases require compliant, force-sensitive control — detecting when a peg contacts the hole edge and adjusting approach angle and force vector. Simulation models contact with penalty-based methods that produce qualitatively wrong force profiles. Real force measurements during insertion provide the ground truth for learning compliant control strategies that avoid jamming or part damage.
Assembly steps must complete in order — step N requires steps 1 through N-1. Success rates multiply across steps. Even 90% per-step reliability yields only ~31% success on the 11-step lamp assembly. This compounding math means small improvements in per-step precision have outsized effects on full-task completion, making FurnitureBench a sensitive test of manipulation reliability.
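The same multiplication can be inverted to ask how reliable each step must be to reach a target full-task success rate. This is a sketch that assumes independent steps with equal success probability; the function name is illustrative.

```python
def required_per_step_success(target_full_success: float, num_steps: int) -> float:
    """Per-step success needed so that all steps succeed at the target rate."""
    if not 0.0 < target_full_success <= 1.0:
        raise ValueError("target_full_success must be in (0, 1]")
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    # Solve p ** num_steps = target for p.
    return target_full_success ** (1.0 / num_steps)


# 90% per step on the 11-step lamp.
print(f"{0.90 ** 11:.0%}")  # 31%

# Per-step reliability needed for 90% full-task success on 11 steps.
p = required_per_step_success(0.90, 11)
print(f"{p:.1%}")  # 99.0%
```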
Published results show significant sim-to-real drops, particularly on insertion steps. Simulation-only approaches achieve reasonable transport performance (moving parts to the workspace) but fail on precision insertion where contact dynamics, part compliance, and manufacturing variation are critical. Hybrid approaches combining simulation pre-training with real-world fine-tuning on force-annotated demonstrations perform best.