Game Environment Dataset

High-fidelity video from game engines with pixel-perfect ground truth for pre-training vision models, world models, and sim-to-real transfer.

Dataset at a Glance

66K+ video clips
450+ hours recorded
50+ environments
6+ annotation layers

Comparison with Public Datasets

How Claru's dataset compares to publicly available alternatives.

Dataset                 | Clips | Hours | Modalities     | Environments      | Annotations
SYNTHIA                 | 13K   | ~5    | RGB-D (syn)    | Urban driving sim | Segmentation
Virtual KITTI           | 21K   | ~2    | RGB-D (syn)    | Driving           | Everything (GT)
Claru Game Environments | 66K+  | 450+  | RGB, Depth, PC | 50+ environments  | Perfect GT: depth, seg, flow, normals, poses

Use Cases

Vision Pre-Training

Large-scale supervised pre-training on pixel-perfect labels, followed by fine-tuning on real data. Example models: DINOv2, SigLIP, InternImage.

World Model Pre-Training

Physically plausible environments for initializing world models. Example models: Genie 2, UniSim, DIAMOND.

Sim-to-Real Transfer

Pre-training on synthetic data reduces real-data requirements by 50-80%. Example methods: Domain Randomization, RCAN, Transfer from Play.

Key References

  1. Ros et al. “The SYNTHIA Dataset for Semantic Segmentation.” CVPR 2016.
  2. Tobin et al. “Domain Randomization for Sim-to-Real Transfer.” IROS 2017.
  3. Bruce et al. “Genie 2: A Large-Scale Foundation World Model.” DeepMind, 2024.

How Claru Delivers This Data

Claru curates game data from 50+ virtual worlds rendered at UE5/Unity fidelity. All data includes pixel-perfect ground-truth annotations that would be impossible to produce manually.

Frequently Asked Questions

The data is rendered with Unreal Engine 5, Unity HDRP, and custom pipelines, covering 50+ environments with physically based rendering (PBR) materials.

Annotations are extracted directly from the rendering engine: depth from the Z-buffer, segmentation from object IDs, and optical flow from motion vectors.
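
As an illustration of the depth step, raw Z-buffer values are usually nonlinear and have to be converted to metric distance before use. The sketch below is not Claru's pipeline: it assumes an OpenGL-style perspective projection in which a buffer value of 0 lies on the near plane and 1 on the far plane, and the function names and the near/far values are hypothetical. Engines that use reversed-Z or an infinite far plane need a different formula.

```python
def linearize_depth(d, near, far):
    """Convert a nonlinear depth-buffer value d in [0, 1] to eye-space distance.

    Assumption: OpenGL-style perspective projection, with d = 0 on the near
    plane and d = 1 on the far plane. This is an illustrative convention,
    not necessarily the one used by any particular engine.
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("depth-buffer value must lie in [0, 1]")
    # Map the buffer value from [0, 1] to normalized device coordinates [-1, 1].
    z_ndc = 2.0 * d - 1.0
    # Invert the perspective depth mapping to recover linear distance.
    return 2.0 * near * far / (far + near - z_ndc * (far - near))


def linearize_depth_map(depth_buffer, near, far):
    """Apply linearize_depth to every pixel of a 2D list of buffer values."""
    return [[linearize_depth(d, near, far) for d in row] for row in depth_buffer]


if __name__ == "__main__":
    # Hypothetical 2x2 depth buffer with near = 0.1 and far = 100.0 scene units.
    buffer = [[0.0, 0.5], [0.9, 1.0]]
    for row in linearize_depth_map(buffer, near=0.1, far=100.0):
        print(row)
```

Under this convention most of the buffer's precision sits close to the camera: a buffer value of 0.5 already corresponds to a distance only about twice the near plane.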

The data works best for pre-training: combined with real-world fine-tuning, it reduces real-data needs by 50-80%.

Request a Sample Pack

Get a curated sample of game environment data with full annotations to evaluate for your project.