Urban LiDAR Point Cloud Dataset
Dense LiDAR scans of real urban environments — streets, intersections, parking structures, pedestrian zones — captured across diverse cities with 3D bounding boxes, semantic segmentation, and lane-level annotations for training autonomous navigation and urban mapping systems.
Why Urban LiDAR Data Matters
LiDAR provides the geometric backbone for autonomous navigation in urban environments. Unlike cameras, LiDAR delivers precise distance measurements unaffected by lighting conditions — working equally well at night, in direct sunlight, and through light rain. Urban autonomous systems including self-driving vehicles, delivery robots, and mobile mapping platforms depend on LiDAR for accurate 3D scene understanding.
Urban environments present unique challenges for LiDAR perception: dense pedestrian traffic, varied vehicle types (cars, trucks, bicycles, scooters), complex intersection geometry, construction zones, and multi-story buildings whose vertical structure creates occlusion shadows in scans. Training data must capture this full urban complexity across diverse city layouts and traffic patterns.
The gap between available academic LiDAR datasets and real-world requirements is significant. Most public datasets cover a small number of cities with limited weather and lighting variation. Models trained on these narrow distributions fail when deployed in cities with different road widths, intersection designs, vegetation patterns, and traffic densities.
Dataset at a Glance
60K+ annotated LiDAR scans collected across 40+ cities in North America, Europe, and Asia, with four annotation layers: 3D bounding boxes, per-point semantic segmentation, lane markings, and static map elements.
Collection Methodology
Claru deploys vehicle-mounted LiDAR rigs (Ouster OS1-128, Velodyne Alpha Prime, or equivalent) synchronized with GPS/IMU and multi-camera arrays across urban environments. Collection routes cover arterial roads, residential streets, downtown cores, industrial zones, and highway on/off ramps to ensure comprehensive coverage of urban driving scenarios.
Each collection vehicle captures 10-20 Hz LiDAR point clouds synchronized with 30 fps camera imagery and centimeter-accurate RTK GPS positioning. Sessions span 30-90 minutes of continuous driving across varied traffic conditions including rush hour, midday, nighttime, and weekend traffic patterns.
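As one illustration of how per-sensor streams can be aligned offline, the sketch below pairs each LiDAR sweep with its nearest camera frame by timestamp. The function name and tolerance are illustrative, not part of any delivered tooling.

```python
import numpy as np

def match_nearest(lidar_ts: np.ndarray, camera_ts: np.ndarray,
                  tolerance_s: float = 0.025) -> list[tuple[int, int]]:
    """Pair each LiDAR sweep with the closest camera frame in time.

    lidar_ts / camera_ts: 1-D arrays of timestamps in seconds.
    tolerance_s: reject pairs further apart than this (25 ms here,
    i.e. under one 30 fps frame interval).
    """
    pairs = []
    for i, t in enumerate(lidar_ts):
        j = int(np.argmin(np.abs(camera_ts - t)))
        if abs(camera_ts[j] - t) <= tolerance_s:
            pairs.append((i, j))
    return pairs
```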
Geographic diversity spans 40+ cities across North America, Europe, and Asia, covering different road design standards, signage systems, lane configurations, and urban density profiles. Seasonal collection captures leaf-on and leaf-off vegetation, wet and dry road surfaces, and varying sun angles that affect LiDAR return intensity.
Annotation Layers
3D Bounding Boxes
Oriented 3D bounding boxes for all dynamic objects: vehicles, pedestrians, cyclists, scooters, strollers. Includes velocity vectors and track IDs for object tracking.
Semantic Segmentation
Per-point semantic labels: road surface, sidewalk, building, vegetation, pole, sign, vehicle, pedestrian, bicycle. 20+ categories aligned with standard autonomous driving taxonomies.
Lane Markings
3D polyline annotations for lane boundaries, stop lines, crosswalks, and turn arrows extracted from LiDAR intensity returns and verified against camera imagery.
Static Map Elements
Traffic lights, stop signs, speed limit signs, traffic cones, and barriers annotated with 3D positions and state labels for HD map construction.
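Delivery formats vary by project, but a per-frame record combining the four layers above might be organized along the lines of this sketch. All field names here are hypothetical, not the dataset's actual schema.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Box3D:
    center: np.ndarray    # (x, y, z) in meters, sensor frame
    size: np.ndarray      # (length, width, height) in meters
    yaw: float            # heading about the z-axis, radians
    velocity: np.ndarray  # (vx, vy) in m/s
    track_id: int         # stable across frames for tracking
    category: str         # e.g. "vehicle", "pedestrian", "cyclist"

@dataclass
class FrameAnnotations:
    points: np.ndarray        # (N, 4): x, y, z, intensity
    point_labels: np.ndarray  # (N,) semantic class indices
    boxes: list[Box3D] = field(default_factory=list)
    lane_polylines: list[np.ndarray] = field(default_factory=list)  # each (M, 3)
```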
Comparison with Public LiDAR Datasets
| Dataset | Scans | Cities | Classes | Annotation Type |
|---|---|---|---|---|
| nuScenes | 400K | 2 | 23 | 3D boxes, semantic |
| Waymo Open | 230K | 6 | 4 | 3D boxes, tracking |
| KITTI | 15K | 1 | 8 | 3D boxes |
| Claru Urban LiDAR | 60K+ | 40+ | 20+ | 3D boxes, semantic, lanes, maps |
Use Cases and Model Training
3D object detection models for autonomous vehicles train on the annotated LiDAR scans to detect and classify vehicles, pedestrians, and cyclists in point cloud data. The geographic diversity across 40+ cities ensures models learn features that generalize beyond the specific road geometries and traffic patterns of any single city.
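As a minimal sketch of how such training data might be consumed, the loader below assumes one .npy point cloud and one .json box file per frame; this layout and the box fields are hypothetical, not the delivered format.

```python
import json
from pathlib import Path
import numpy as np
import torch
from torch.utils.data import Dataset

class LidarDetectionDataset(Dataset):
    """Illustrative loader: one .npy scan plus one .json box file per frame."""

    def __init__(self, root: str):
        self.scan_paths = sorted(Path(root).glob("*.npy"))

    def __len__(self):
        return len(self.scan_paths)

    def __getitem__(self, idx):
        scan = np.load(self.scan_paths[idx])        # (N, 4): x, y, z, intensity
        label_path = self.scan_paths[idx].with_suffix(".json")
        boxes = json.loads(label_path.read_text())  # hypothetical list of box dicts
        # Pack each box as (cx, cy, cz, l, w, h, yaw) -> (B, 7) target tensor
        targets = torch.tensor([b["center"] + b["size"] + [b["yaw"]]
                                for b in boxes], dtype=torch.float32)
        return torch.from_numpy(scan).float(), targets
```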
Semantic segmentation networks for LiDAR process the per-point labels to build scene understanding models that classify every point in a scan. These models form the geometric backbone of autonomous driving perception stacks, distinguishing drivable surface from sidewalk, building from vegetation, and static from dynamic objects.
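Per-point labels also make standard evaluation straightforward. The snippet below computes per-class IoU, the usual segmentation metric, given predicted and ground-truth label arrays of equal length.

```python
import numpy as np

def per_class_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> np.ndarray:
    """Per-class intersection-over-union for per-point semantic labels."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:
            ious[c] = inter / union
    return ious  # np.nanmean(ious) gives mIoU over classes present
```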
HD map construction pipelines use the lane marking and static map element annotations as training data for automated map building systems. These systems must detect lane boundaries, traffic signs, and infrastructure elements from LiDAR scans to maintain high-definition maps at continental scale.
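A common first stage in such pipelines, which the intensity-derived lane annotations mirror, is filtering for bright, near-ground returns: retroreflective paint returns far more strongly than asphalt. The sketch below assumes (N, 4) arrays of x, y, z, intensity; the threshold values are illustrative.

```python
import numpy as np

def candidate_lane_points(points: np.ndarray,
                          ground_z: float = 0.0,
                          z_tol: float = 0.3,
                          intensity_pct: float = 98.0) -> np.ndarray:
    """Keep near-ground points whose intensity is in the top percentile.

    points: (N, 4) array of x, y, z, intensity. The survivors feed a
    later polyline-fitting stage, not shown here.
    """
    near_ground = np.abs(points[:, 2] - ground_z) < z_tol
    ground_pts = points[near_ground]
    threshold = np.percentile(ground_pts[:, 3], intensity_pct)
    return ground_pts[ground_pts[:, 3] >= threshold]
```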
Frequently Asked Questions
What LiDAR sensors are used for collection?
Primary collection uses Ouster OS1-128 and Velodyne Alpha Prime sensors providing 128-channel point clouds at 10-20 Hz. Each scan contains 100K-300K points with intensity and ambient returns. Sensor configurations can be customized for specific project requirements.
Which regions and cities does the dataset cover?
The dataset covers 40+ cities across North America, Europe, and Asia including major metropolitan areas, mid-size cities, and suburban environments. Each city contributes diverse road types, intersection configurations, and traffic patterns.
Does the dataset include camera imagery and GPS data?
Yes. All LiDAR collection includes synchronized multi-camera imagery (typically 6 cameras providing 360-degree coverage) and centimeter-accurate RTK GPS positioning. Camera-LiDAR extrinsic calibration is provided for sensor fusion applications.
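With the provided extrinsics and camera intrinsics, projecting LiDAR points into an image is a standard pinhole-camera computation. The sketch below assumes a 4x4 LiDAR-to-camera transform and a 3x3 intrinsic matrix; the argument names are illustrative.

```python
import numpy as np

def project_lidar_to_image(points: np.ndarray,
                           T_cam_lidar: np.ndarray,
                           K: np.ndarray) -> np.ndarray:
    """Project LiDAR points into pixel coordinates.

    points: (N, 3) in the LiDAR frame.
    T_cam_lidar: (4, 4) extrinsic transform, LiDAR frame -> camera frame.
    K: (3, 3) camera intrinsic matrix.
    Returns (M, 2) pixel coordinates for points in front of the camera.
    """
    homog = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    cam = (T_cam_lidar @ homog.T).T[:, :3]                      # (N, 3) camera frame
    cam = cam[cam[:, 2] > 0.1]                                  # keep points ahead of camera
    pix = (K @ cam.T).T
    return pix[:, :2] / pix[:, 2:3]                             # perspective divide
```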
Request an Urban LiDAR Sample Pack
Get sample urban LiDAR scans with full 3D annotations for your autonomous driving or urban mapping project.