Urban LiDAR Point Cloud Dataset
Dense LiDAR scans of real urban environments — streets, intersections, parking structures, pedestrian zones — captured across diverse cities with 3D bounding boxes, semantic segmentation, and lane-level annotations for training autonomous navigation and urban mapping systems.
Why Urban LiDAR Data Matters
LiDAR provides the geometric backbone for autonomous navigation in urban environments. Unlike cameras, LiDAR delivers precise distance measurements unaffected by lighting conditions — working equally well at night, in direct sunlight, and through light rain. Urban autonomous systems including self-driving vehicles, delivery robots, and mobile mapping platforms depend on LiDAR for accurate 3D scene understanding.
Urban environments present unique challenges for LiDAR perception: dense pedestrian traffic, varied vehicle types (cars, trucks, bicycles, scooters), complex intersection geometry, construction zones, and multi-story buildings whose vertical structure creates occlusion shadows in scans. Training data must capture this full urban complexity across diverse city layouts and traffic patterns.
The gap between available academic LiDAR datasets and real-world requirements is significant. Most public datasets cover a small number of cities with limited weather and lighting variation. Models trained on these narrow distributions fail when deployed in cities with different road widths, intersection designs, vegetation patterns, and traffic densities.
Dataset at a Glance
60K+ annotated LiDAR scans collected across 40+ cities in North America, Europe, and Asia, with four annotation layers: 3D bounding boxes, per-point semantic segmentation, lane markings, and static map elements.
Collection Methodology
Claru deploys vehicle-mounted LiDAR rigs (Ouster OS1-128, Velodyne Alpha Prime, or equivalent) synchronized with GPS/IMU and multi-camera arrays across urban environments. Collection routes cover arterial roads, residential streets, downtown cores, industrial zones, and highway on/off ramps to ensure comprehensive coverage of urban driving scenarios.
Each collection vehicle captures 10-20 Hz LiDAR point clouds synchronized with 30 fps camera imagery and centimeter-accurate RTK GPS positioning. Sessions span 30-90 minutes of continuous driving across varied traffic conditions including rush hour, midday, nighttime, and weekend traffic patterns.
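As one illustration of how per-sensor streams can be aligned offline, the sketch below pairs each LiDAR sweep with its nearest camera frame by timestamp. The function name and tolerance are illustrative, not part of any delivered tooling.

```python
import numpy as np

def match_nearest(lidar_ts: np.ndarray, camera_ts: np.ndarray,
                  tolerance_s: float = 0.025) -> list[tuple[int, int]]:
    """Pair each LiDAR sweep with the closest camera frame in time.

    lidar_ts / camera_ts: 1-D arrays of timestamps in seconds.
    tolerance_s: reject pairs further apart than this (25 ms here,
    i.e. under one 30 fps frame interval).
    """
    pairs = []
    for i, t in enumerate(lidar_ts):
        j = int(np.argmin(np.abs(camera_ts - t)))
        if abs(camera_ts[j] - t) <= tolerance_s:
            pairs.append((i, j))
    return pairs
```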
Geographic diversity spans 40+ cities across North America, Europe, and Asia, covering different road design standards, signage systems, lane configurations, and urban density profiles. Seasonal collection captures leaf-on and leaf-off vegetation, wet and dry road surfaces, and varying sun angles that affect LiDAR return intensity.
Annotation Layers
3D Bounding Boxes
Oriented 3D bounding boxes for all dynamic objects: vehicles, pedestrians, cyclists, scooters, strollers. Includes velocity vectors and track IDs for object tracking.
Semantic Segmentation
Per-point semantic labels: road surface, sidewalk, building, vegetation, pole, sign, vehicle, pedestrian, bicycle. 20+ categories aligned with standard autonomous driving taxonomies.
Lane Markings
3D polyline annotations for lane boundaries, stop lines, crosswalks, and turn arrows extracted from LiDAR intensity returns and verified against camera imagery.
Static Map Elements
Traffic lights, stop signs, speed limit signs, traffic cones, and barriers annotated with 3D positions and state labels for HD map construction.
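Delivery formats vary by project, but a per-frame record combining the four layers above might be organized along the lines of this sketch. All field names here are hypothetical, not the dataset's actual schema.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Box3D:
    center: np.ndarray    # (x, y, z) in meters, sensor frame
    size: np.ndarray      # (length, width, height) in meters
    yaw: float            # heading about the z-axis, radians
    velocity: np.ndarray  # (vx, vy) in m/s
    track_id: int         # stable across frames for tracking
    category: str         # e.g. "vehicle", "pedestrian", "cyclist"

@dataclass
class FrameAnnotations:
    points: np.ndarray        # (N, 4): x, y, z, intensity
    point_labels: np.ndarray  # (N,) semantic class indices
    boxes: list[Box3D] = field(default_factory=list)
    lane_polylines: list[np.ndarray] = field(default_factory=list)  # each (M, 3)
```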
Comparison with Public LiDAR Datasets
| Dataset | Scans | Cities | Classes | Annotation Type |
|---|---|---|---|---|
| nuScenes | 400K | 2 | 23 | 3D boxes, semantic |
| Waymo Open | 230K | 6 | 4 | 3D boxes, tracking |
| KITTI | 15K | 1 | 8 | 3D boxes |
| Claru Urban LiDAR | 60K+ | 40+ | 20+ | 3D boxes, semantic, lanes, maps |
Use Cases and Model Training
3D object detection models for autonomous vehicles train on the annotated LiDAR scans to detect and classify vehicles, pedestrians, and cyclists in point cloud data. The geographic diversity across 40+ cities ensures models learn features that generalize beyond the specific road geometries and traffic patterns of any single city.
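As a minimal sketch of how such training data might be consumed, the loader below assumes one .npy point cloud and one .json box file per frame; this layout and the box fields are hypothetical, not the delivered format.

```python
import json
from pathlib import Path
import numpy as np
import torch
from torch.utils.data import Dataset

class LidarDetectionDataset(Dataset):
    """Illustrative loader: one .npy scan plus one .json box file per frame."""

    def __init__(self, root: str):
        self.scan_paths = sorted(Path(root).glob("*.npy"))

    def __len__(self):
        return len(self.scan_paths)

    def __getitem__(self, idx):
        scan = np.load(self.scan_paths[idx])        # (N, 4): x, y, z, intensity
        label_path = self.scan_paths[idx].with_suffix(".json")
        boxes = json.loads(label_path.read_text())  # hypothetical list of box dicts
        # Pack each box as (cx, cy, cz, l, w, h, yaw) -> (B, 7) target tensor
        targets = torch.tensor([b["center"] + b["size"] + [b["yaw"]]
                                for b in boxes], dtype=torch.float32)
        return torch.from_numpy(scan).float(), targets
```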
Semantic segmentation networks for LiDAR process the per-point labels to build scene understanding models that classify every point in a scan. These models form the geometric backbone of autonomous driving perception stacks, distinguishing drivable surface from sidewalk, building from vegetation, and static from dynamic objects.
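Per-point labels also make standard evaluation straightforward. The snippet below computes per-class IoU, the usual segmentation metric, given predicted and ground-truth label arrays of equal length.

```python
import numpy as np

def per_class_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> np.ndarray:
    """Per-class intersection-over-union for per-point semantic labels."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:
            ious[c] = inter / union
    return ious  # np.nanmean(ious) gives mIoU over classes present
```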
HD map construction pipelines use the lane marking and static map element annotations as training data for automated map building systems. These systems must detect lane boundaries, traffic signs, and infrastructure elements from LiDAR scans to maintain high-definition maps at continental scale.
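A common first stage in such pipelines, which the intensity-derived lane annotations mirror, is filtering for bright, near-ground returns: retroreflective paint returns far more strongly than asphalt. The sketch below assumes (N, 4) arrays of x, y, z, intensity; the threshold values are illustrative.

```python
import numpy as np

def candidate_lane_points(points: np.ndarray,
                          ground_z: float = 0.0,
                          z_tol: float = 0.3,
                          intensity_pct: float = 98.0) -> np.ndarray:
    """Keep near-ground points whose intensity is in the top percentile.

    points: (N, 4) array of x, y, z, intensity. The survivors feed a
    later polyline-fitting stage, not shown here.
    """
    near_ground = np.abs(points[:, 2] - ground_z) < z_tol
    ground_pts = points[near_ground]
    threshold = np.percentile(ground_pts[:, 3], intensity_pct)
    return ground_pts[ground_pts[:, 3] >= threshold]
```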
Frequently Asked Questions
What LiDAR sensors are used for collection?
Primary collection uses Ouster OS1-128 and Velodyne Alpha Prime sensors providing 128-channel point clouds at 10-20 Hz. Each scan contains 100K-300K points with intensity and ambient returns. Sensor configurations can be customized for specific project requirements.
Which regions and cities does the dataset cover?
The dataset covers 40+ cities across North America, Europe, and Asia including major metropolitan areas, mid-size cities, and suburban environments. Each city contributes diverse road types, intersection configurations, and traffic patterns.
Does the dataset include camera imagery and GPS data?
Yes. All LiDAR collection includes synchronized multi-camera imagery (typically 6 cameras providing 360-degree coverage) and centimeter-accurate RTK GPS positioning. Camera-LiDAR extrinsic calibration is provided for sensor fusion applications.
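With the provided extrinsics and camera intrinsics, projecting LiDAR points into an image is a standard pinhole-camera computation. The sketch below assumes a 4x4 LiDAR-to-camera transform and a 3x3 intrinsic matrix; the argument names are illustrative.

```python
import numpy as np

def project_lidar_to_image(points: np.ndarray,
                           T_cam_lidar: np.ndarray,
                           K: np.ndarray) -> np.ndarray:
    """Project LiDAR points into pixel coordinates.

    points: (N, 3) in the LiDAR frame.
    T_cam_lidar: (4, 4) extrinsic transform, LiDAR frame -> camera frame.
    K: (3, 3) camera intrinsic matrix.
    Returns (M, 2) pixel coordinates for points in front of the camera.
    """
    homog = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    cam = (T_cam_lidar @ homog.T).T[:, :3]                      # (N, 3) camera frame
    cam = cam[cam[:, 2] > 0.1]                                  # keep points ahead of camera
    pix = (K @ cam.T).T
    return pix[:, :2] / pix[:, 2:3]                             # perspective divide
```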
Request an Urban LiDAR Sample Pack
Get sample urban LiDAR scans with full 3D annotations for your autonomous driving or urban mapping project.