Egocentric Assembly Line Video Dataset

First-person video of real manufacturing assembly tasks (part insertion, fastening, wiring, and inspection), captured across diverse production facilities and annotated at the step level for training industrial cobots and quality-monitoring AI.

Why Assembly Line Video Data Matters for Robotics

Manufacturing assembly is the largest near-term market for collaborative robots. Automotive, electronics, appliance, and aerospace manufacturers spend billions on manual assembly tasks that cobots could perform or assist with. Training these cobots requires data captured on real production lines, with the specific parts, tooling, fixtures, and environmental conditions of actual manufacturing environments rather than simplified lab setups.

The egocentric perspective captures what an assembly operator actually sees during work: parts at close range, hands navigating tight spaces, tooling engagement from the operator's viewpoint, and the visual cues that experienced workers use to verify assembly quality. This perspective maps directly to the camera viewpoint of a collaborative robot working alongside or replacing a human operator.

Assembly tasks demand a level of sequential precision that distinguishes them from general manipulation. Each step must be completed in order, with specific torque values, alignment tolerances, and verification checks. Training data must capture this sequential structure with step-level annotations that preserve the process logic, not just the physical motions.

Dataset at a Glance

Video clips: 70K+
Hours recorded: 480+
Facility types: 25+
Annotation layers: 14+

Collection Methodology

Claru collectors wear head-mounted cameras while performing genuine assembly tasks on production lines. Collection covers automotive component assembly, electronics PCB population, appliance assembly, aerospace sub-assembly, and light manufacturing. Collectors follow real work instructions and use production tooling, generating data with authentic cycle times, quality checks, and error recovery procedures.

Each collection session captures 30-120 minutes of continuous assembly work across multiple production cycles. Facilities span automotive OEMs and Tier 1 suppliers, consumer electronics manufacturers, appliance plants, and aerospace sub-assembly shops across North America, Europe, and Asia. This diversity ensures coverage of different tooling standards, ergonomic setups, and quality systems.

Raw video is captured at 1080p and 30 fps, with optional depth from wrist-mounted RealSense sensors. Each session includes the work instruction document, the bill of materials for the assembled components, and process parameters (torque specs, alignment tolerances). This metadata lets researchers correlate visual data with engineering specifications.

Annotation Layers

Assembly Step Segments

Start/end timestamps for every discrete assembly operation mapped to work instruction steps. Enables training models that track process completion and detect step omissions.
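
As a rough illustration of how step segments could support omission detection, the sketch below compares annotated segments against the step order in a work instruction. The `StepSegment` fields and the step names are hypothetical and are not the dataset's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepSegment:
    """One annotated assembly operation (hypothetical field names)."""
    step_id: str    # work-instruction step this segment maps to
    start_s: float  # segment start, seconds from clip start
    end_s: float    # segment end, seconds from clip start


def find_omitted_steps(expected_steps, segments):
    """Return expected step IDs that never appear in the annotations."""
    observed = {seg.step_id for seg in segments}
    return [step for step in expected_steps if step not in observed]


def find_out_of_order_steps(expected_steps, segments):
    """Return step IDs that start after a later expected step has begun."""
    rank = {step: i for i, step in enumerate(expected_steps)}
    highest_rank_seen = -1
    late_steps = []
    for seg in sorted(segments, key=lambda s: s.start_s):
        r = rank.get(seg.step_id)
        if r is None:
            continue  # segment is not part of this work instruction
        if r < highest_rank_seen:
            late_steps.append(seg.step_id)
        else:
            highest_rank_seen = r
    return late_steps


# Hypothetical work instruction and one annotated production cycle.
expected = ["insert_bracket", "fasten_bolt", "route_cable", "inspect"]
cycle = [
    StepSegment("insert_bracket", 0.0, 4.5),
    StepSegment("route_cable", 4.5, 9.0),
    StepSegment("inspect", 9.0, 12.0),
]
print(find_omitted_steps(expected, cycle))  # ['fasten_bolt']
```

A learned model would predict the segments from video; the comparison against the expected sequence stays the same either way.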

Part Identity Tracking

Bounding boxes and identity labels for every component handled during assembly, tracked through insertion and fastening operations. Supports pick-and-place policy training.

Tool-Use Classification

Per-frame labels for tool type and engagement state: torque driver active, rivet gun firing, test probe contacting. Essential for training tool-aware manipulation policies.
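
Per-frame labels are often easier to work with once collapsed into time intervals. The sketch below shows one way to do that, assuming each frame carries either `None` or a `(tool, state)` tuple; that label encoding is an assumption for illustration, while the 30 fps default comes from the capture rate stated above.

```python
from itertools import groupby

FPS = 30  # capture rate stated for the raw video


def labels_to_intervals(frame_labels, fps=FPS):
    """Collapse per-frame labels into (label, start_s, end_s) intervals.

    Frames labeled None (no tool engaged) produce no interval.
    """
    intervals = []
    frame = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        if label is not None:
            intervals.append((label, frame / fps, (frame + length) / fps))
        frame += length
    return intervals


# Hypothetical clip: 0.5 s idle, 2 s of torque driving, 0.5 s idle.
active = ("torque_driver", "active")
labels = [None] * 15 + [active] * 60 + [None] * 15
print(labels_to_intervals(labels))
# [(('torque_driver', 'active'), 0.5, 2.5)]
```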

Quality Verification Points

Annotations marking visual inspection moments, go/no-go checks, and measurement verification. Trains automated quality monitoring systems.

Comparison with Public Assembly Datasets

How Claru's assembly data compares to publicly available alternatives.

Dataset | Clips | Hours | Facilities | Annotations
IndustReal | ~5K | ~20 | 1 lab | Keypoints, success/fail
Assembly101 | ~100K | 513 | 1 lab setup | Actions, objects
IKEA ASM | ~17K | 35 | 3 setups | Actions, poses
Claru Assembly Line | 70K+ | 480+ | 25+ | Steps, parts, tools, quality, process params

Use Cases and Model Training

Cobot manufacturers use this data to train manipulation policies for assembly tasks, so that the policies learn the step sequences, tool usage patterns, and force application strategies that experienced assembly operators demonstrate. The step-level annotations provide the process structure that lets a cobot track where it is in an assembly sequence and what comes next.

Quality inspection AI systems train on the quality verification annotations to learn which visual features distinguish good assemblies from defective ones. The diversity of facility types helps these models generalize across different product lines, lighting conditions, and quality standards rather than overfitting to a single production environment.

Digital twin platforms for manufacturing use the process timing data to calibrate simulation models against real-world cycle times. The correlation between visual data and process parameters (torque values, alignment measurements) provides ground truth for validating simulation fidelity.
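
A minimal sketch of the calibration idea: derive per-step durations from annotated segments, then compare a simulator's step times against the observed median. The tuple layout, step names, and simulated values are hypothetical.

```python
from statistics import median


def step_durations(segments):
    """Group observed durations (seconds) by step ID.

    segments: iterable of (step_id, start_s, end_s) tuples.
    """
    durations = {}
    for step_id, start_s, end_s in segments:
        durations.setdefault(step_id, []).append(end_s - start_s)
    return durations


def calibration_error(observed, simulated_s):
    """Relative error of each simulated step time against the observed median."""
    errors = {}
    for step_id, values in observed.items():
        if step_id in simulated_s:
            reference = median(values)
            errors[step_id] = (simulated_s[step_id] - reference) / reference
    return errors


# Three observed cycles of one hypothetical step, and a simulator estimate.
observed = step_durations([
    ("fasten_bolt", 0.0, 4.0),
    ("fasten_bolt", 10.0, 15.0),
    ("fasten_bolt", 20.0, 26.0),
])
print(calibration_error(observed, {"fasten_bolt": 5.5}))
```

The median is used here because error recovery can produce occasional long cycles that would skew a mean; a real calibration would likely compare full distributions.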

Frequently Asked Questions

What types of manufacturing does the dataset cover?

The dataset covers automotive component assembly, consumer electronics manufacturing, appliance production, aerospace sub-assembly, and general light manufacturing across 25+ facilities in North America, Europe, and Asia. Each facility type includes multiple product lines and assembly stations.

Does the dataset include work instructions and process parameters?

Yes. Each collection session includes the associated work instruction document, bill of materials, and relevant process parameters such as torque specifications and alignment tolerances. This enables researchers to correlate visual data with engineering requirements.

Can the data be delivered in a standard robotics format?

Yes. Claru delivers assembly line data in standard robotics formats, including RLDS, HDF5, WebDataset, Zarr, and LeRobot. We handle all format conversion as part of the delivery pipeline.

Request an Assembly Line Sample Pack

Get a curated sample of egocentric assembly video with full annotations for your industrial robotics or quality inspection project.