Egocentric Outdoor Urban Video Dataset
First-person video of urban pedestrian environments — sidewalks, crosswalks, plazas — captured across 30+ cities with navigation annotations for training delivery robots and outdoor autonomous systems.
Comparison with Public Datasets
How Claru's dataset compares to publicly available alternatives.
| Dataset | Clips | Hours | Modalities | Environments | Annotations |
|---|---|---|---|---|---|
| Cityscapes | 25K (annotated images) | ~50 | RGB, Stereo | 50 cities (vehicle) | Semantic segmentation |
| nuScenes | 40K (annotated keyframes) | 5.5 | RGB, LiDAR, Radar | 2 cities (vehicle) | 3D boxes, maps |
| Claru Urban | 95K+ | 700+ | RGB, Depth, IMU | 30+ cities (pedestrian) | Pedestrians, surfaces, obstacles, weather |
Use Cases
Sidewalk Delivery Robots
Navigating pedestrian environments with dynamic foot traffic and urban obstacles. Example companies: Serve Robotics, Nuro, Coco.
Legged Robot Navigation
Outdoor locomotion and path planning in unstructured urban terrain. Example platforms: Boston Dynamics Spot, ANYbotics ANYmal, Ghost Robotics.
Urban Scene Understanding
Scene parsing for identifying sidewalks, road surfaces, curb cuts, and construction zones. Example models: SegFormer, Mask2Former, OneFormer.
How Claru Delivers This Data
Claru's collector network spans 100+ cities, and this dataset draws on pedestrian-perspective captures from 30+ of them. Unlike vehicle-mounted datasets, Claru's data shows the world from sidewalk height, the perspective that delivery robots and legged robots actually operate from.
Frequently Asked Questions
Driving datasets are captured from vehicle height with vehicle-centric annotations. Claru's urban dataset is captured from pedestrian and robot height (1 to 1.5 m), showing curb details, ground textures, and leg-level obstacles that vehicle datasets miss.
The dataset spans clear, overcast, rainy, and snowy conditions across all four seasons, with night captures under artificial lighting. Each clip carries weather metadata for filtering.
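As a rough illustration of filtering on per-clip weather metadata, here is a minimal Python sketch. The metadata schema is not documented here, so the `ClipMeta` record, its field names, and the label values are all hypothetical stand-ins rather than Claru's actual format.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class ClipMeta:
    """Hypothetical per-clip metadata record; field names are assumptions."""
    clip_id: str
    weather: str   # assumed labels: "clear", "overcast", "rain", "snow"
    season: str    # assumed labels: "spring", "summer", "autumn", "winter"
    night: bool    # True for night captures under artificial lighting


def filter_clips(
    clips: Iterable[ClipMeta],
    weather: Optional[Set[str]] = None,
    season: Optional[Set[str]] = None,
    night: Optional[bool] = None,
) -> List[ClipMeta]:
    """Keep clips that match every criterion that is not None."""
    selected = []
    for clip in clips:
        if weather is not None and clip.weather not in weather:
            continue
        if season is not None and clip.season not in season:
            continue
        if night is not None and clip.night != night:
            continue
        selected.append(clip)
    return selected


# Made-up catalog entries, for illustration only.
catalog = [
    ClipMeta("clip_0001", "clear", "summer", False),
    ClipMeta("clip_0002", "rain", "autumn", True),
    ClipMeta("clip_0003", "snow", "winter", False),
    ClipMeta("clip_0004", "rain", "spring", False),
]

# Select daytime rain clips, e.g. to build a wet-surface evaluation split.
rainy_daytime = filter_clips(catalog, weather={"rain"}, night=False)
print([c.clip_id for c in rainy_daytime])
```

Leaving a criterion as `None` means "do not filter on this field", so the same helper can cover single-condition and combined queries.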
Every outdoor clip includes GPS traces at 1 Hz for spatial indexing and geographic diversity verification.
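One simple way to use GPS traces for spatial indexing is to bucket fixes into a latitude/longitude grid. The sketch below assumes each trace is available as a list of `(lat, lon)` pairs keyed by clip ID. That layout, the function names, and the 0.01 degree cell size are illustrative assumptions, not a description of Claru's tooling.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Fix = Tuple[float, float]   # (latitude, longitude) in decimal degrees
Cell = Tuple[int, int]      # integer grid coordinates


def cell_key(lat: float, lon: float, cell_deg: float = 0.01) -> Cell:
    """Map a GPS fix to a grid cell roughly 1 km tall at 0.01 degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def build_grid_index(
    traces: Dict[str, List[Fix]], cell_deg: float = 0.01
) -> Dict[Cell, Set[str]]:
    """Index clip IDs by every grid cell their trace passes through."""
    index: Dict[Cell, Set[str]] = defaultdict(set)
    for clip_id, fixes in traces.items():
        for lat, lon in fixes:
            index[cell_key(lat, lon, cell_deg)].add(clip_id)
    return index


# Made-up traces: two clips in the same New York cell, one in London.
traces = {
    "clip_a": [(40.7128, -74.0060), (40.7129, -74.0061)],
    "clip_b": [(51.5074, -0.1278)],
    "clip_c": [(40.7125, -74.0065)],
}

index = build_grid_index(traces)

# The number of distinct occupied cells is a crude geographic diversity measure.
print(len(index), "occupied cells")

# Look up every clip that passes through the cell containing a query point.
print(sorted(index[cell_key(40.7128, -74.0060)]))
```

A fixed-degree grid is the simplest choice. Cell width shrinks with latitude, so an equal-area scheme or a geohash library would be a better fit where uniform cell size matters.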
Request a Sample Pack
Get a curated sample of egocentric outdoor urban video data with full annotations to evaluate for your project.