Robotics Training Datasets
Purpose-built datasets for training robot manipulation policies, vision-language-action (VLA) models, world models, and embodied AI systems. Each dataset ships with dense annotations and is delivered in your preferred format.
40 datasets available
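To make "dense annotations" concrete, here is a sketch of what a per-frame annotation record for an egocentric manipulation clip might look like. All field names and values below are illustrative assumptions, not the actual delivery schema:

```python
# Hypothetical per-frame annotation record for an egocentric clip.
# Field names are illustrative assumptions, not an actual delivery schema.
sample_annotation = {
    "frame_id": 1042,
    "timestamp_s": 34.73,           # seconds from clip start
    "hands": [                      # dense hand/manipulation labels
        {"side": "right", "bbox_xyxy": [412, 220, 530, 340], "contact": True},
    ],
    "objects": [
        {"category": "saucepan", "bbox_xyxy": [380, 250, 610, 420], "state": "held"},
    ],
    "action": {"verb": "stir", "noun": "sauce"},
}

def contacted_objects(record):
    """Return the categories of objects in frame while any hand is in contact."""
    if any(hand["contact"] for hand in record["hands"]):
        return [obj["category"] for obj in record["objects"]]
    return []
```

A record like this lets a training pipeline filter for contact-rich frames or build verb-noun action vocabularies without re-parsing raw video.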
Egocentric Kitchen Video Dataset
First-person video of real kitchen activities — cooking, cleaning, organizing — captured across diverse home and commercial kitchen layouts with dense manipulation annotations for training robotic kitchen assistants and embodied AI systems.
Egocentric Warehouse Video Dataset
First-person video of real warehouse operations — picking, packing, sorting, and navigation — captured across diverse fulfillment center layouts with logistics-specific annotations for training warehouse robotics and AMR systems.
Egocentric Outdoor Urban Video Dataset
First-person video of urban pedestrian environments — sidewalks, crosswalks, plazas — captured across 30+ cities with navigation annotations for training delivery robots and outdoor autonomous systems.
Egocentric Retail Video Dataset
First-person video of real retail environments — grocery stores, pharmacies, department stores — with product interaction annotations for training retail automation AI.
Egocentric Office Video Dataset
First-person video of real office environments — desks, meeting rooms, corridors — with workplace activity annotations for training telepresence robots and office automation AI.
Teleoperation Kitchen Dataset
Robot teleoperation data from real kitchen environments — synchronized camera-action-force triplets for training cooking robot manipulation policies.
Teleoperation Warehouse Dataset
Robot teleoperation data from real warehouse environments — pick-and-place trajectories with force sensing for training logistics manipulation policies.
Teleoperation Tabletop Dataset
Robot teleoperation data for tabletop manipulation — sorting, stacking, tool use — with synchronized camera-action-force triplets for training general-purpose policies.
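The three teleoperation datasets above share a synchronized camera-action-force triplet format. Because each stream is typically sampled at a different rate, a common first step is aligning actions and forces to camera frames by nearest timestamp. The sketch below assumes illustrative rates (30 Hz camera, 100 Hz joint commands, 500 Hz force-torque) and shapes; real deliveries may already ship pre-synchronized:

```python
import numpy as np

# Align the three streams of a teleoperation triplet (camera frames,
# action commands, force readings) by nearest timestamp.
# Sampling rates and array shapes are assumptions for illustration.
rng = np.random.default_rng(0)

cam_ts = np.arange(0.0, 2.0, 1 / 30)        # 30 Hz camera timestamps
act_ts = np.arange(0.0, 2.0, 1 / 100)       # 100 Hz joint commands
frc_ts = np.arange(0.0, 2.0, 1 / 500)       # 500 Hz force-torque readings

actions = rng.normal(size=(len(act_ts), 7))  # e.g. 7-DoF joint targets
forces = rng.normal(size=(len(frc_ts), 6))   # wrench: Fx, Fy, Fz, Tx, Ty, Tz

def nearest(ts_query, ts_source):
    """Index of the nearest source timestamp for each query timestamp."""
    idx = np.searchsorted(ts_source, ts_query)
    idx = np.clip(idx, 1, len(ts_source) - 1)
    left, right = ts_source[idx - 1], ts_source[idx]
    return np.where(ts_query - left < right - ts_query, idx - 1, idx)

# One (action, force) sample per camera frame
act_per_frame = actions[nearest(cam_ts, act_ts)]
frc_per_frame = forces[nearest(cam_ts, frc_ts)]
```

The result is one action vector and one wrench per camera frame, the layout most behavior-cloning pipelines expect.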
Multi-View Manipulation Dataset
Synchronized multi-camera robot manipulation recordings — 3-5 calibrated viewpoints — with 3D annotations for training spatial manipulation policies.
RGB-D Kitchen Dataset
Paired RGB and depth video from real kitchen environments with registered depth maps and 3D annotations for training depth-aware kitchen robots.
RGB-D Manipulation Dataset
Paired RGB-D recordings of robot manipulation with 3D grasp annotations and force measurements for training depth-aware grasping policies.
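For the RGB-D datasets above, "registered depth maps" means each depth pixel aligns with its RGB pixel, so depth can be back-projected into a 3D point cloud with the camera intrinsics. A minimal sketch, using made-up intrinsics and a constant depth map (real calibration metadata ships with each dataset):

```python
import numpy as np

# Back-project a registered depth map to a 3D point cloud with pinhole
# intrinsics. Intrinsics and depth values here are illustrative assumptions.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0  # assumed camera intrinsics
depth = np.full((480, 640), 1.5)             # flat 1.5 m depth map (metres)

v, u = np.indices(depth.shape)               # pixel rows (v) and columns (u)
z = depth
x = (u - cx) * z / fx                        # pinhole back-projection
y = (v - cy) * z / fy
points = np.stack([x, y, z], axis=-1).reshape(-1, 3)  # (H*W, 3) point cloud
```

The pixel at the principal point maps to (0, 0, z), and 3D grasp annotations can then be expressed directly in this camera frame.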
Game Environment Dataset
High-fidelity video from game engines with pixel-perfect ground truth for pre-training vision models, world models, and sim-to-real transfer.
Synthetic Manipulation Dataset
Procedurally generated manipulation trajectories from physics simulators with perfect state information for scalable robot policy pre-training.
Dashcam Urban Dataset
Forward-facing dashcam video from urban driving environments with traffic annotations for training autonomous driving perception and world models.
Aerial Agricultural Dataset
Drone-captured agricultural imagery with crop health annotations for training agricultural robotics and precision farming AI.
Egocentric Construction Video Dataset
First-person construction site video for training construction robotics and safety monitoring AI. 55K+ clips across 20+ site types with PPE detection, tool usage, and structural progress annotations.
Egocentric Restaurant Video Dataset
First-person restaurant environment video for training food service robots and hospitality automation. 45K+ clips across 15+ restaurant types with food handling, plating, and service workflow annotations.
Egocentric Healthcare Video Dataset
First-person healthcare environment video for training medical assistance robots and clinical workflow AI. 35K+ clips from 10+ clinical settings with instrument tracking, procedure phase, and sterile field annotations.
Multi-View Assembly Dataset
Synchronized multi-camera recordings of assembly tasks for training 3D-aware manipulation policies. 30K+ trajectories across 15+ assembly configurations with part tracking, insertion state, and 3D pose annotations.
Thermal Industrial Dataset
Paired thermal-RGB imaging from industrial environments for training predictive maintenance robots and safety monitoring systems. 25K+ clips across 10+ facility types with thermal anomaly and equipment health annotations.
Egocentric Workshop Video Dataset
First-person workshop and maker-space video for training tool-use robots and craft manipulation AI. 40K+ clips across 20+ workshop types with tool grasp, material transformation, and assembly sequence annotations.
Point Cloud Indoor Dataset
Dense indoor point cloud scans with semantic annotations for training 3D scene understanding and indoor navigation. 15K+ scans across 500+ rooms with per-point semantic labels, instance segmentation, and room layout annotations.
Multi-Sensor Warehouse Dataset
Synchronized RGB, depth, LiDAR, and IMU data from warehouse environments for training autonomous mobile robots and pick-pack-ship automation. 50K+ clips across 25+ warehouse configurations.
Egocentric Agricultural Video Dataset
First-person video from agricultural settings for training harvesting robots and crop monitoring AI. 35K+ clips across 12+ farm types with dense manipulation and crop-state annotations.
Stereo Outdoor Dataset
Calibrated stereo camera pairs from outdoor environments for training depth estimation and terrain-aware navigation. 40K+ clips across 15+ terrain types with disparity maps, traversability labels, and obstacle annotations.
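The disparity maps in the stereo dataset above convert to metric depth via the standard relation Z = f·B/d, given the calibrated focal length f and baseline B. A short sketch with assumed calibration values (the real calibration is delivered with the data):

```python
import numpy as np

# Convert a disparity map (pixels) to metric depth via Z = f * B / d.
# Focal length and baseline below are assumed values for illustration.
focal_px = 700.0     # focal length in pixels (assumed)
baseline_m = 0.12    # stereo baseline in metres (assumed)

disparity = np.array([[70.0, 35.0],
                      [14.0,  0.0]])   # pixels; 0 marks invalid matches

with np.errstate(divide="ignore"):
    depth = np.where(disparity > 0,
                     focal_px * baseline_m / disparity,
                     np.inf)           # invalid pixels -> infinite depth
```

Note the inverse relationship: halving the disparity doubles the depth, which is why depth uncertainty grows quadratically with range for stereo rigs.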
Egocentric Lab Video Dataset
First-person video of real laboratory workflows — pipetting, centrifuging, microscopy, sample handling — captured across diverse wet labs, dry labs, and cleanrooms with dense manipulation annotations for training robotic lab assistants and scientific automation systems.
Egocentric Outdoor Sports Video Dataset
First-person video of real outdoor sports activities — cycling, climbing, skiing, running, kayaking — captured with wearable cameras across diverse terrain and weather conditions with dense action and body pose annotations.
Egocentric Assembly Line Video Dataset
First-person video of real manufacturing assembly tasks — part insertion, fastening, wiring, inspection — captured across diverse production facilities with step-level process annotations for training industrial cobots and quality monitoring AI.
Urban LiDAR Point Cloud Dataset
Dense LiDAR scans of real urban environments — streets, intersections, parking structures, pedestrian zones — captured across diverse cities with 3D bounding boxes, semantic segmentation, and lane-level annotations for training autonomous navigation and urban mapping systems.
Warehouse LiDAR Point Cloud Dataset
Dense LiDAR scans of real warehouse and logistics facilities — aisles, shelving units, pallet racks, loading docks — with 3D annotations for shelving geometry, pallet positions, obstacle detection, and navigable paths for training autonomous mobile robots.
Force-Torque Manipulation Dataset
Force-torque sensor recordings from real manipulation tasks — grasping, insertion, polishing, assembly — paired with synchronized RGB video for training contact-aware robot policies.
Event Camera Manipulation Dataset
High-temporal-resolution event camera recordings of manipulation tasks — fast grasping, dynamic catching, tool use — for training reactive robot policies that require microsecond-level visual feedback.
Multi-View Kitchen Dataset
Synchronized multi-camera recordings of real kitchen activities from 4-8 viewpoints — enabling 3D reconstruction, novel view synthesis, and multi-perspective manipulation training for kitchen robotics.
Thermal Outdoor Dataset
Thermal infrared video of outdoor environments — pedestrians, vehicles, wildlife, terrain — captured across day/night and seasonal conditions for training robust perception systems that operate beyond the visible spectrum.
Stereo Manipulation Dataset
Calibrated stereo camera recordings of manipulation tasks — pick-and-place, assembly, tool use — providing dense depth estimation ground truth for training depth-aware robot policies.
Outdoor Point Cloud Dataset
Dense 3D point clouds of outdoor environments — parks, construction sites, agricultural fields, forests — from LiDAR scanning and photogrammetry for training outdoor navigation and terrain analysis models.
Highway Dashcam Dataset
Continuous dashcam video from highway and freeway driving across diverse road conditions — multi-lane traffic, merging, construction zones, weather events — with lane-level annotations for training highway driving assistance and autonomous highway systems.
Aerial Inspection Dataset
Drone-captured video of infrastructure inspection — bridges, powerlines, solar farms, building facades, cell towers — with defect annotations and structural element labels for training automated aerial inspection AI.
Underwater Inspection Dataset
Underwater video from ROVs and divers inspecting subsea infrastructure — pipelines, offshore platforms, ship hulls, port structures — with corrosion, biofouling, and structural damage annotations.
Synthetic Household Dataset
Photorealistic synthetic renders of household environments — living rooms, bedrooms, bathrooms, garages — with perfect ground truth annotations for pre-training household robot perception before real-world fine-tuning.
Need a Custom Dataset?
Claru builds custom datasets for any robotics application. Tell us your model architecture, target environment, and data requirements.