XDOF Raises $70M to Build Robot Training Data Infrastructure

The startup aims to solve AI's physical-world bottleneck by collecting, cleaning, and annotating the manipulation data that robotics models need to learn.

Omega Editorial · June 17, 2026 · 3 min read

A new startup is betting that the next major constraint in artificial intelligence won't be compute power or model architecture, but something more fundamental: the training data needed to teach robots how to interact with objects in the physical world.

XDOF emerged from stealth this week with $70 million in funding from Thrive Capital, Spark Capital, Andreessen Horowitz, Lux Capital, and WndrCo. The company is building data collection infrastructure, annotation systems, and teleoperation tools specifically designed for robotics applications—work that frontier AI labs are pursuing but struggling to execute at scale.

According to co-founder and CEO Philippe Wu, XDOF is already working with 20 customers, including several major AI research organizations, though he declined to name them. The timing aligns with OpenAI's recent announcement that it would restart its robotics program after shuttering it in 2021.

Why it matters

Large language models succeeded in part because they could train on vast quantities of existing text scraped from the internet. Robotics has no equivalent data reservoir. YouTube videos and footage from gig workers lack the precision and physical grounding needed to train manipulation models. This data scarcity creates an infrastructure opportunity that could determine which companies lead in physical AI—and XDOF is positioning itself as the picks-and-shovels provider for that race.

From academic research to commercial infrastructure

Wu encountered the data problem firsthand as a PhD student at UC Berkeley, where he focused on teaching robots to learn skills from large datasets. The challenge was circular: without substantial training data, researchers couldn't even begin building foundation models for robotics.

With co-founder and CTO Fred Shentu, Wu developed GELLO, a low-cost teleoperation system that allows human operators to control robotic arms and generate training trajectories. The project became influential in robotics research because it addressed a widespread bottleneck.

The founders launched XDOF in October 2024 with third co-founder and COO Nemo Jin, and the company now employs roughly 60 people. It is partnering with UC Berkeley's AI Research lab to release what it describes as the largest collection of high-quality robot training data ever assembled. The dataset, called ABC, contains 130,000 trajectories of robot manipulation data, 300 hours of simulation, and 100 hours of evaluations. Researchers have already used it to train robots on benchmark tasks including folding T-shirts, flattening boxes, and loading AirPods into cases.

A three-tier data strategy

XDOF plans to operate across three levels of data collection. The highest-value tier involves teleoperation data gathered on the specific robot being deployed in production. The second tier uses teleoperation systems like GELLO to collect more general manipulation data. The third tier captures "egocentric" data from humans performing everyday tasks, for which XDOF is developing its own wearable sensors.

Hardware design matters significantly in this work. Camera selection affects data quality, which in turn influences how well hand-tracking algorithms perform. Physical parameters must be carefully calibrated, and operators require proper training.

The company intends to hire and train large teams of teleoperators and data collectors globally—a labor-intensive model that raises questions about why major labs aren't handling this work internally. Wu's answer is operational: the infrastructure requires warehouses spanning hundreds of thousands of square feet, hundreds of robots, ongoing maintenance, calibration, and trained personnel. Most AI labs would prefer to outsource that complexity.

The company's name references "degrees of freedom," the robotics term for the number of independent motions a system can perform. A human arm has seven degrees of freedom from shoulder to wrist; Figure.AI's latest humanoid robot has 30. The X represents unlimited scope.

These details were first reported by TechCrunch.

This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
