Robotics

Workplace Egocentric Video Data for General-Purpose Robotics


Challenge: Robotics training datasets are overwhelmingly collected in controlled lab environments or staged settings that fail to represent real-world work conditions — cluttered spaces, time pressure, improvised tool use, and the contextual decision-making that workers perform unconsciously.

Solution: We embedded data capture directly into real-world business operations across multiple countries and 10 workplace categories.

Result: The program established a fundamentally new data source for robotics research: active workplaces as scalable, cost-efficient contributors of egocentric training data.

// THE CHALLENGE

Robotics training datasets are overwhelmingly collected in controlled lab environments or staged settings that fail to represent real-world work conditions — cluttered spaces, time pressure, improvised tool use, and the contextual decision-making that workers perform unconsciously. Existing egocentric video datasets capture household tasks performed by research participants, but the distribution gap between a researcher's kitchen and a commercial kitchen during service is enormous. The lab needed a partner who could embed data collection into actual business operations without disrupting workflows, while meeting the quality and diversity requirements of frontier robotics research across multiple industries and geographies.

// OUR APPROACH

We embedded data capture directly into real-world business operations across multiple countries and 10 workplace categories. Business owners and workers were onboarded as contributors through a lightweight side-revenue model that kept participation voluntary and minimally disruptive to normal workflow.

Workplace categories spanned food service (barista, cooking), skilled trades (carpentry, tailoring, screen printing), repair services (phone repair, tool repair), textile work (clothing shop, ironing), and assembly (furniture assembly). Tasks were designed for handheld smartphone recording at 4K 60fps — no specialized hardware required. Research-level specifications for camera angle, framing, and activity coverage were translated into practical instructions that respected workplace realities: space constraints, safety requirements, hygiene protocols, and varying levels of technical comfort.

Activity coverage and task diversity were tracked continuously through a real-time monitoring dashboard. We balanced collection across workplace types and across task complexity levels within each type. QA validation focused on the characteristics that distinguish genuine workplace data from staged alternatives: natural pacing, contextual tool selection, environmental adaptation, and multi-step task sequencing under real constraints.
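One of the QA signals described above — natural pacing — can be approximated with a simple statistical heuristic. The sketch below is illustrative only: the function name, threshold, and sample timings are hypothetical, not the production QA pipeline. The idea is that staged recordings tend to show unnaturally uniform step durations, while genuine workplace clips vary widely due to interruptions and improvisation.

```python
import statistics

def flag_suspiciously_uniform(step_durations_s, min_cv=0.15):
    """Return True when step durations are so uniform that the clip may
    be staged. Uses the coefficient of variation (stdev / mean); the
    min_cv threshold of 0.15 is an illustrative assumption."""
    if len(step_durations_s) < 3:
        return False  # too few steps to judge pacing either way
    mean = statistics.mean(step_durations_s)
    cv = statistics.stdev(step_durations_s) / mean
    return cv < min_cv

# Hypothetical per-step timings (seconds) for a barista clip:
natural = [4.2, 11.7, 3.1, 22.5, 6.8]  # varied pacing, passes
staged = [5.0, 5.1, 4.9, 5.0, 5.1]     # metronomic pacing, flagged
```

In practice a check like this would be one signal among several (framing, coverage, sequencing), not a sole pass/fail gate.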

01 Partner: Onboard businesses across 10 workplace categories as data sources
02 Instruct: Translate research specs into smartphone task guides per category
03 Capture: Workers record 4K/60fps egocentric video during normal workflow
04 Validate: QA on natural pacing, framing, and task coverage
05 Deliver: Formatted datasets with workplace-type metadata and activity labels
// RESULTS
10 distinct workplace categories captured on-site
4K/60fps capture resolution via standard smartphones
Multi-country geographic coverage across global locations
<48h contributor onboarding time per business
// IMPACT

The program established a fundamentally new data source for robotics research: active workplaces as scalable, cost-efficient contributors of egocentric training data. Ten distinct workplace categories — from barista stations to carpentry workshops to screen printing studios — demonstrated that the approach generalizes across industries, not just food service. The behaviors captured — improvisation under time pressure, adaptation to cluttered and constrained spaces, contextual tool selection — represent precisely the distribution that lab-collected datasets lack. The operational model (side-revenue for businesses, smartphone capture, lightweight onboarding) is economically viable for sustained collection at scale.

// SAMPLE DATA

Representative record from the annotation pipeline.

workplace_video_sample.json
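The record referenced above is not reproduced on this page, so the sketch below shows only a plausible shape for such a record. Every field name and value here is a hypothetical illustration, not the actual `workplace_video_sample.json` schema.

```python
import json

# Illustrative record shape only; fields and values are assumptions.
sample_record = {
    "clip_id": "wkp-000001",
    "workplace_category": "barista",
    "industry_group": "food_service",
    "capture": {
        "resolution": "3840x2160",
        "frame_rate_fps": 60,
        "hardware": "smartphone",
        "perspective": "egocentric",
    },
    "duration_s": 412,  # within the stated 5-15 min session range
    "activity_labels": [
        {"label": "espresso_prep", "start_s": 0.0, "end_s": 88.4},
        {"label": "milk_steaming", "start_s": 88.4, "end_s": 153.0},
        {"label": "order_assembly", "start_s": 153.0, "end_s": 412.0},
    ],
    "qa": {"natural_pacing": True, "framing_ok": True},
}

print(json.dumps(sample_record, indent=2))
```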
// VIDEO SAMPLE
// WORKPLACE CATEGORIES (10)
Barista (Food Service): Espresso prep, milk steaming, order assembly
Cooking (Food Service): Ingredient prep, stove work, plating
Carpentry (Skilled Trades): Sawing, sanding, joint assembly
Tailoring (Skilled Trades): Cutting, stitching, fitting
Screen Printing (Skilled Trades): Screen prep, ink application, drying
Phone Repair (Repair Services): Disassembly, component swap, testing
Tool Repair (Repair Services): Diagnosis, part replacement, calibration
Clothing Shop (Textile Work): Fabric handling, folding, display
Ironing (Textile Work): Steam pressing, garment finishing
Furniture Assembly (Assembly): Part alignment, fastening, hardware install

// INDUSTRY GROUPS
Food Service (2)
Skilled Trades (3)
Repair Services (2)
Textile Work (2)
Assembly (1)
// CAPTURE SPECS
Resolution: 3840x2160
Frame Rate: 60 fps
Hardware: Smartphone
Perspective: Egocentric (1st person)
Session Length: 5-15 min
Onboarding: <48 hours

Ready to build your next dataset?

Tell us about your project and we will scope a plan within 48 hours.