field notes

Physical AI
Training Data

more articles

  1. physical-ai

    Bezos Project Prometheus $10B Physical AI Infrastructure 2026

    Jeff Bezos's reported $10B Project Prometheus initiative targets the infrastructure layers — data, simulation, evaluation — that physical AI and robotics foundation models still lack, signaling a platform play valued at levels comparable to GPT-4's total training investment.

  2. physical-ai

    π₀.₇ Foundation Model: Steerable Emergent Robot Capabilities 2026

    Physical Intelligence's π₀.₇ achieves 82.1% success on trained tasks and 47.3% zero-shot generalization across seven robot embodiments and 50+ manipulation tasks, according to the team's technical report (arXiv:2604.15483), redefining what a single 7B-parameter generalist robotic foundation model can do without task-specific fine-tuning.

  3. humanoid robots

    Humanoid Robot Training Data Requirements in 2026

    Figure AI, 1X Technologies, and Agility Robotics all depend on multimodal training pipelines where a single misaligned sensor timestamp can break sim-to-real transfer — here are the actual specs, sync tolerances, and annotation schema decisions that determine whether humanoid robot training data produces working policies or silent failures.

  4. physical-ai

    Physical AI Training Data Provider: 2026 Decision Framework

    He et al. (arXiv:2510.21391v1) show that VLA policies trained on real manipulation data outperform sim-only baselines by 30–60% on contact-rich tasks — this framework helps ML engineers decide when to buy real-world physical AI training data versus generate synthetic.

  5. physical AI

    Physical AI Training Data Guide 2026

    Google DeepMind's RT-2 required 130K real-world robot episodes to generalize across 700+ manipulation instructions — this guide breaks down exact data specs, collection pipelines, and quality criteria by robot type.

  6. diffusion-policy

    Diffusion Policy Robotics: Training Data Specs 2026

    Chi et al.'s Diffusion Policy achieves 85.7% average success on Push-T with roughly 200 demonstrations (arXiv:2303.04137), but generalizing across objects, lighting, and embodiments demands 10–50× more data, with specific diversity constraints that most teams underestimate.

  7. training data

    Training Data for Robotics: The Full Pipeline in 2026

    Google DeepMind's RT-2 needed 130K+ real-world episodes before language-conditioned manipulation worked reliably. Here is the spec-level pipeline that makes datasets like that possible.

  8. training-data

    Gig Workers Training Humanoid Robots: Why Data Quality Beats Volume in 2026

    1X Technologies and Prosper Robotics have deployed hundreds of gig workers to collect teleop data at home, but the volume-first approach has a quality ceiling that determines whether humanoid policies actually generalize.

  9. VLA

    VLM vs VLA: What's the Actual Difference? (2026)

    VLMs generate text; VLAs generate motor commands. Here's exactly where the architectures diverge, what training data each needs, and why the distinction matters for robotics teams.

  10. VLA

    How Much Training Data Does a VLA Model Need? (2026)

    OpenVLA, pre-trained on 970K trajectories, fine-tunes in ~1.5 hours with 50–200 demos for simple tasks. Here are the concrete numbers for VLA data requirements across task complexity.

  11. sim-to-real

    The Sim-to-Real Gap Explained: Why It Happens and How to Close It (2026)

    Four specific causes of the sim-to-real gap — visual domain gap, physics approximation error, sensor noise mismatch, and long-tail scenario absence — and what real-world data addresses each.

  12. physical AI

    The Physical AI Stack: From Raw Sensor Data to Robot Action (2026)

    Layer-by-layer breakdown of how physical AI robots learn: perception (Depth Anything V2, ViTPose, SAM3), world modeling, policy learning (Diffusion Policy, ACT, π0), and language grounding.

  13. egocentric video

    7 Best Egocentric Video Data Providers for Robotics (2026)

    Side-by-side comparison of 7 egocentric video data providers for robotics and physical AI in 2026, covering Claru, Luel, Encord, Appen, Labelbox, Ego4D, and Scale AI.

  14. data enrichment

    Data Enrichment Pipeline for Physical AI (2026)

    How Claru's enrichment pipeline adds depth maps, pose estimation, semantic segmentation, and action labels to raw video to produce training-ready physical AI datasets.
