Training Data for Berkshire Grey

Berkshire Grey is building advanced robotic systems. Here is how real-world data accelerates their path from development to production deployment.

About Berkshire Grey

Berkshire Grey builds AI-enabled robotic systems for warehouse automation, retail fulfillment, and package handling. Founded in 2013 by Tom Wagner, the company raised over $300 million, went public via SPAC in 2021, and was taken private by SoftBank Group in 2023. Their systems combine robotic picking, sorting, and packing with AI-powered perception and planning.

AI-powered robotic picking and sorting
Package singulation and handling
Multi-robot coordination for fulfillment
Vision-based item recognition
Autonomous warehouse workflow orchestration

Berkshire Grey at a Glance

Founded: 2013
Stage: Funded
Deployment: Global
Approach: AI-First

Known Data Requirements

Berkshire Grey's enterprise robotic systems handle diverse items across retail, grocery, and parcel sorting. Their AI perception needs training data covering the enormous SKU diversity of retail fulfillment — from small electronics to oversized items, transparent packaging, and deformable bags. Scaling across customer deployments requires visual data from diverse warehouse environments.

Diverse manipulation demonstrations

Source: Berkshire Grey product deployments and research publications

Multi-modal recordings of manipulation tasks across diverse objects, environments, and conditions relevant to Berkshire Grey's deployment contexts.

Real-world environment recordings

Source: Berkshire Grey deployment requirements

Visual and geometric recordings of target deployment environments capturing the specific layouts, lighting, and conditions Berkshire Grey's robots encounter.

Perception pretraining data

Source: Berkshire Grey AI architecture requirements

Diverse egocentric and multi-view video for pretraining visual representations that ground Berkshire Grey's AI in real-world physical understanding.

How Claru Data Addresses These Needs

| Lab Need | Claru Offering | Rationale |
| --- | --- | --- |
| Diverse manipulation demonstrations | Manipulation Trajectory Dataset + Custom Collection | Claru captures multi-modal manipulation recordings with dense annotations across diverse environments, matching the diversity Berkshire Grey needs for robust policy training. |
| Real-world environment recordings | Custom Environmental Recording Campaigns | Claru coordinates multi-sensor recordings across partner facilities in Berkshire Grey's target deployment environments, capturing authentic visual distributions. |
| Perception pretraining data | Egocentric Activity Dataset (386K+ clips) | Purpose-collected first-person video of human activities provides visual pretraining data that grounds Berkshire Grey's AI in real physical interactions. |

Technical Data Analysis

Berkshire Grey operates at the intersection of industrial robotics and AI, building complete workflow automation systems rather than single-robot point solutions. Their Intelligent Enterprise Robotic (IER) platform coordinates multiple robots for end-to-end fulfillment — picking items from bins, sorting them by order, and packing them for shipment.
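
The pick-sort-pack sequence described above can be sketched as a toy pipeline. This is a minimal illustration of the workflow structure only; the function name, data shapes, and order-completion logic are assumptions for this sketch, not Berkshire Grey's actual API.

```python
from collections import defaultdict

def fulfill(bins, orders):
    """Toy end-to-end flow mirroring the pick -> sort -> pack sequence:
    items are picked from bins, routed to per-order put walls, then
    complete orders are packed. All names here are hypothetical."""
    # Pick: pull each requested item that is actually available in a bin
    picked = [(order_id, sku) for order_id, skus in orders.items()
              for sku in skus if sku in bins]
    # Sort: route picked items to per-order put walls
    walls = defaultdict(list)
    for order_id, sku in picked:
        walls[order_id].append(sku)
    # Pack: close out only the orders whose item lists are complete
    return {oid: sorted(skus) for oid, skus in walls.items()
            if sorted(skus) == sorted(orders[oid])}

bins = {"soap", "cereal", "batteries"}
orders = {"A1": ["soap", "cereal"], "B2": ["batteries"], "C3": ["tv"]}
packed = fulfill(bins, orders)
```

Order C3 requests an item no bin contains, so it is picked around and never packed, which is the kind of partial-fulfillment edge case a real orchestration layer must handle explicitly.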

The perception challenge is extreme item diversity. A grocery fulfillment center handles fresh produce, frozen items, glass bottles, flexible packaging, and heavy items all in the same workflow. The AI must recognize items it has never seen before, determine optimal grasp strategies, and handle fragile items appropriately. This long-tail recognition problem demands training data far broader than any single customer deployment generates.
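
One standard way to keep a long-tail SKU distribution from drowning out rare items during training is to reweight samples by the inverse "effective number of samples" per class (the class-balancing heuristic of Cui et al.). The sketch below is illustrative only; the SKU names and the choice of beta are assumptions, not values from Berkshire Grey's pipeline.

```python
from collections import Counter

def class_balanced_weights(labels, beta=0.999):
    """Per-sample training weights via inverse 'effective number of
    samples', a common long-tail rebalancing heuristic that up-weights
    rare classes (here, rare SKUs). Normalized to mean 1.0."""
    counts = Counter(labels)
    # Effective number of samples per class: (1 - beta^n) / (1 - beta)
    eff = {c: (1.0 - beta ** n) / (1.0 - beta) for c, n in counts.items()}
    raw = {c: 1.0 / e for c, e in eff.items()}
    mean = sum(raw[l] for l in labels) / len(labels)
    return [raw[l] / mean for l in labels]

# Toy long tail: a common SKU and a rare, hard-to-grasp one (hypothetical)
labels = ["tape_roll"] * 90 + ["glass_jar"] * 10
weights = class_balanced_weights(labels)
```

With this weighting the rare glass-jar samples receive several times the weight of the common tape rolls, which is the training signal a broad, purpose-collected dataset makes possible in the first place.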

Package singulation — separating individual packages from a pile on a conveyor — requires understanding the physics of how packages stack, slide, and interact. Simulating these dynamics for soft packages, irregular shapes, and mixed-size loads remains unreliable. Real-world data of package singulation across diverse package types provides the training signal that makes singulation policies robust.
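
To make the singulation problem concrete, here is a minimal pick-selection heuristic over depth-derived detections: prefer the topmost, least-occluded package near the workspace center. Every field, weight, and name below is a hypothetical assumption for illustration; real singulation policies are learned from exactly the kind of pile-interaction data described above.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """A detected package on the conveyor (all fields hypothetical)."""
    height_mm: float       # top-surface height from depth sensing
    overlap: float         # fraction covered by neighboring packages, 0..1
    center_dist_mm: float  # distance from the gripper's workspace center

def pick_score(d: Detection, w_h=1.0, w_o=200.0, w_c=0.1) -> float:
    """Higher is better: tall (likely on top), unoccluded, and central.
    Weights are illustrative, not tuned values."""
    return w_h * d.height_mm - w_o * d.overlap - w_c * d.center_dist_mm

def select_pick(detections):
    return max(detections, key=pick_score)

pile = [
    Detection(height_mm=120, overlap=0.6, center_dist_mm=50),   # buried
    Detection(height_mm=180, overlap=0.1, center_dist_mm=200),  # on top
    Detection(height_mm=90,  overlap=0.0, center_dist_mm=20),   # flat, clear
]
best = select_pick(pile)
```

A hand-tuned scorer like this breaks down precisely where the text says simulation does: soft packages deform, stacks shift mid-pick, and occlusion estimates become unreliable, which is why real pile data matters.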

Multi-robot coordination introduces data requirements beyond single-robot perception. The IER platform must predict handoff timing, manage shared workspace conflicts, and recover from individual robot failures without stopping the line. Training these coordination policies requires data from real multi-robot operations where timing jitter, communication latency, and mechanical variance create real coordination challenges.
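
The effect of timing jitter on handoffs can be made concrete with a small Monte Carlo sketch: estimate how often a downstream robot receives a handoff inside the picker's window when communication latency is noisy. The window, latency, and jitter values are invented for illustration, not measured figures.

```python
import random

def handoff_success_rate(trials=10000, seed=7, window_s=0.5,
                         mean_latency_s=0.3, jitter_s=0.15):
    """Monte Carlo estimate of handoffs completed within the picker's
    window under Gaussian latency jitter. All timings are illustrative
    assumptions, not measured deployment values."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        latency = max(0.0, rng.gauss(mean_latency_s, jitter_s))
        if latency <= window_s:
            ok += 1
    return ok / trials

rate = handoff_success_rate()
```

Even with mean latency well inside the window, jitter pushes a meaningful fraction of handoffs past the deadline; learning to anticipate and recover from these misses is what requires data from real multi-robot operations rather than idealized simulation.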

Frequently Asked Questions

What training data do Berkshire Grey's systems need?

Their enterprise robotic systems handle items across retail, grocery, and parcel sorting, so perception models must cover the enormous SKU diversity of fulfillment: small electronics, oversized items, transparent packaging, and deformable bags. Scaling across customer deployments also requires visual data from diverse warehouse environments.

Why isn't simulation enough?

Simulation cannot faithfully model the contact dynamics, material properties, and environmental conditions that Berkshire Grey's robots encounter in deployment. Real-world data provides the distributional coverage that fills simulation gaps.

Can Claru collect data in specific target environments?

Yes. Claru operates a global network of 10,000+ data collectors across 100+ cities who can capture teleoperated demonstrations, egocentric video, and sensor data in target environments using standardized recording protocols.

Accelerate Berkshire Grey's Data Pipeline

Talk to our team about purpose-built datasets for Berkshire Grey's robotic systems.