Egocentric Retail Video Dataset

First-person video of real retail environments — grocery stores, pharmacies, department stores — with product interaction annotations for training retail automation AI.

Dataset at a Glance

- 70K+ video clips
- 500+ hours recorded
- 25+ store types
- 6+ annotation layers

Comparison with Public Datasets

How Claru's dataset compares to publicly available alternatives.

| Dataset | Clips | Hours | Modalities | Environments | Annotations |
|---|---|---|---|---|---|
| Metrabs Retail | 5K | 15 | RGB-D | Lab store | Pose, shelves |
| EgoProceL | 62 | 8 | RGB | Mixed | Procedure steps |
| Claru Retail | 70K+ | 500+ | RGB, Depth | 25+ store types | Products, shelves, paths, hands, navigation |

Use Cases

Shelf Monitoring Robots

Autonomous shelf scanning and out-of-stock detection using mobile platforms. Example systems: Simbe Tally, Badger Technologies, BossaNova.

Shopping Assistant Robots

Customer interaction and in-store navigation assistance. Example systems: Fellow Robots, SoftBank Pepper, LG CLOi.

Visual Retail Analytics

Understanding customer behavior and product interaction patterns. Example systems: RetailNext, Trax, Standard AI.

Key References

  1. Ragusa et al., "The MECCANO Dataset: Understanding Human-Object Interactions from Egocentric Videos." WACV 2021.
  2. Grauman et al., "Ego4D: Around the World in 3,000 Hours of Egocentric Video." CVPR 2022.
  3. Grauman et al., "Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives." CVPR 2024.

How Claru Delivers This Data

Claru collectors capture first-person video in real retail environments. Unlike ceiling-mounted CCTV datasets, Claru's egocentric perspective matches how shelf-scanning robots perceive their environment. Product-level interaction annotations enable training models that understand shopping behavior at a granular level.

Frequently Asked Questions

What types of retail environments are included?

Grocery stores, pharmacies, convenience stores, department stores, home improvement stores, and specialty retail, ranging from 2,000 to 100,000+ square feet.

How are product interactions annotated?

Every hand-product contact event is labeled with timestamps, product category, interaction type (pick up, examine, return), and shelf location.
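An interaction event of this kind could be represented as a simple record. The sketch below is purely illustrative: the field names and values are assumptions for demonstration, not Claru's published annotation schema.

```python
# Hypothetical hand-product interaction record.
# Field names are illustrative, NOT Claru's actual annotation format.
from dataclasses import dataclass


@dataclass
class InteractionEvent:
    clip_id: str            # which video clip the event belongs to
    start_s: float          # timestamp of first hand-product contact (seconds)
    end_s: float            # timestamp when contact ends (seconds)
    product_category: str   # e.g. "beverages/soda"
    interaction_type: str   # one of "pick_up", "examine", "return"
    shelf_location: str     # aisle/bay/shelf identifier

    def duration(self) -> float:
        return self.end_s - self.start_s


event = InteractionEvent(
    clip_id="clip_000123",
    start_s=12.4,
    end_s=15.9,
    product_category="beverages/soda",
    interaction_type="examine",
    shelf_location="aisle_07/bay_3/shelf_2",
)
print(f"{event.interaction_type} lasting {event.duration():.1f}s")
```

A structured record like this makes it straightforward to filter events by interaction type or aggregate contact durations per shelf location.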

Can the data be used to train out-of-stock detection?

Yes. Shelf-state annotations label visible products, positions, and gaps, training models to detect out-of-stock conditions and planogram deviations.
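A consumer of such shelf-state labels might flag out-of-stock conditions and planogram deviations by comparing observed slots against an expected layout. This is a minimal sketch under assumed data shapes (slot-to-product mappings); it does not reflect Claru's actual annotation files.

```python
# Minimal sketch: derive out-of-stock and planogram-deviation issues from
# shelf-state labels. Data shapes are hypothetical, not Claru's schema.

def find_shelf_issues(planogram, shelf_state):
    """Compare expected vs. observed slot contents.

    planogram:   {slot_id: expected product id}
    shelf_state: {slot_id: observed product id, or None for a labeled gap}
    Returns a list of (slot_id, issue_type, product_id) tuples.
    """
    issues = []
    for slot, expected in planogram.items():
        observed = shelf_state.get(slot)
        if observed is None:
            issues.append((slot, "out_of_stock", expected))
        elif observed != expected:
            issues.append((slot, "planogram_deviation", observed))
    return issues


planogram = {"A1": "cola_12oz", "A2": "cola_12oz", "A3": "lemon_soda"}
shelf_state = {"A1": "cola_12oz", "A2": None, "A3": "cola_12oz"}
print(find_shelf_issues(planogram, shelf_state))
```

Running this reports slot A2 as out of stock and slot A3 as a planogram deviation, the two conditions the shelf-state layer is designed to surface.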

Request a Sample Pack

Get a curated sample of egocentric retail video data with full annotations to evaluate for your project.