Building the Data Infrastructure for Physical AI
We believe the next generation of AI — robots that can see, move, and manipulate the physical world — needs fundamentally different training data. Claru exists to build it.
The Real-World Data Gap
Large language models were trained on internet text, and image models on web-scraped photos. But robots that need to pick up a coffee mug, navigate a warehouse, or fold laundry cannot learn from internet data. They need egocentric video of real humans performing real tasks in real environments, captured with depth, pose, and action labels at millisecond precision.
This data does not exist on the internet. It cannot be synthesized in simulation without a crippling domain gap. Projects such as Ego4D and Open X-Embodiment have demonstrated the value of large-scale real-world datasets, but every frontier robotics lab we have spoken to cites the same bottleneck: data, not compute or algorithms.
Claru closes that gap. We operate the capture infrastructure, enrichment pipelines, and annotation workforce to deliver training-ready datasets for robotics and physical AI — from brief to first delivery in days, not months.
Core Team
Claru was built from inside Moonvalley's data pipeline. The team that solved licensed data capture at scale for AI video generation is now doing the same for physical AI.
John Thomas
Co-founded Moonvalley, where he built the licensed data operations behind Marey. Previously co-founded ContentFly (YC W21).
Chad Birdsall
Previously led data and marketplace operations at Moonvalley and ContentFly. Scaled operations at Bungalow and Uber.
Oni Baballari
Previously at Moonvalley. Background in creative strategy, brand, and go-to-market.
The Broader Team
Our core engineering and research team includes ex-FAANG engineers and researchers with deep expertise in computer vision, robotics, and AI infrastructure. We have built large-scale data pipelines, trained production ML models, and shipped AI products used by millions. Our enrichment pipeline leverages foundation models including Depth Anything V2, ViTPose, and SAM3 to produce embodied AI datasets with six enrichment layers on every clip.
Beyond the core team, Claru operates a global data collection network: 10,000+ trained collectors across 100+ cities on 6 continents. Every collector is equipped with wearable cameras and follows structured capture protocols designed for each project. This network gives us the geographic and environmental diversity that physical AI models need to generalize beyond controlled lab settings.
Engineering
Full-stack AI infrastructure: capture pipelines, multi-model enrichment (depth, pose, segmentation), vector search, and delivery systems built for petabyte-scale datasets.
Research
Deep expertise in computer vision, embodied AI, and robot learning. We design annotation schemas and quality metrics in collaboration with each client's ML team.
Operations
Global data collection at scale. Collector recruitment, training, quality assurance, and logistics across 100+ cities. Brief to first delivery in days.
Talk to Our Team
Whether you are building a robotics foundation model, training vision-language-action (VLA) models, or looking for custom real-world datasets, we would like to hear about it.