AI Verification Gap Threatens Business Value of Automation
New MIT research shows companies can generate AI outputs faster than they can check them, creating hidden risks that could undermine economic gains.

Artificial intelligence systems can now produce thousands of lines of code in minutes, but a fundamental problem is emerging: organizations cannot verify outputs as quickly as AI generates them. This growing gap between production and validation threatens to limit the economic value companies can extract from increasingly capable AI systems.
New research from MIT Sloan School of Management examines what happens when automation outpaces an organization's ability to check whether the work is correct, safe, or complete. The findings suggest that verification capacity—not just deployment capability—will determine which companies succeed with AI.
Why it matters
As AI systems take on more complex, autonomous tasks, the inability to verify their outputs at scale creates accumulating risk that remains hidden until failures occur. Companies rushing to deploy AI without adequate verification infrastructure may be building what researchers describe as a "hollow economy"—one where output volume surges but quality and reliability don't keep pace. This dynamic will separate firms that can stand behind their AI systems from those that cannot.
The measurability problem
In their paper "Some Simple Economics of AGI," MIT research scientist Christian Catalini and co-authors Xiang Hui of Washington University and Jane Wu of UCLA frame the challenge as a measurability crisis. AI systems are becoming more capable faster than organizations can build verification processes to match.
Accuracy on AI coding benchmarks jumped from 4.4% to 71.7% in a single year, and the length of tasks that systems can complete keeps doubling over short periods. Human verification capacity, meanwhile, remains constrained by time and expertise.
Catalini describes the shift as moving from "software as a service" to "liability as a service." The companies that profit will be those that understand the risks their AI systems create and can underwrite them.
Why AI checking AI doesn't solve it
Some organizations are attempting to use AI systems to verify other AI outputs, but the researchers warn this creates a "false sense of confidence." When both systems share the same assumptions and training data, they can reinforce identical errors rather than catch them.
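A small probability sketch shows why shared assumptions matter. It assumes a generator that errs at some rate and a checker that misses a given error with some probability. The function name and all rates are hypothetical, chosen only to illustrate the argument, and do not come from the paper.

```python
def undetected_error_rate(error_rate, miss_rate_given_error):
    """Fraction of all outputs that are wrong AND still pass the checker.

    Hypothetical model: the checker only matters when the generator errs,
    so the undetected share is the product of the two rates.
    """
    return error_rate * miss_rate_given_error


if __name__ == "__main__":
    # Illustrative numbers, not taken from the research.
    error_rate = 0.10

    # A checker with different blind spots misses few of the generator's errors.
    independent = undetected_error_rate(error_rate, miss_rate_given_error=0.05)

    # A checker sharing the generator's assumptions misses most of them.
    correlated = undetected_error_rate(error_rate, miss_rate_given_error=0.80)

    print(f"{independent:.1%}")  # 0.5%
    print(f"{correlated:.1%}")   # 8.0%
```

In this toy setup both checkers approve most outputs, so the pass rate looks reassuring in either case. The difference sits in the errors that slip through, which is where the false sense of confidence comes from.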
Competitive pressure often drives companies to deploy systems before full human verification is complete, allowing risks to accumulate unnoticed. Catalini pointed to events like the 2010 flash crash as examples of what happens when complex automated systems fail in ways that weren't fully understood beforehand. "It is technical debt accumulating behind the scenes, and, at some point, it'll come due," he said.
The missing junior loop
Verification requires experienced judgment, but AI adoption is eroding the training ground where workers build that experience. As AI takes over entry-level work, fewer employees develop the skills needed to evaluate complex outputs later in their careers.
Research cited in the paper shows employment among younger workers in AI-exposed roles has already fallen by approximately 16%. Meanwhile, senior employees generate the training data for systems that may eventually replace them.
What organizations should do
The researchers recommend that companies scale automation only as fast as they can verify it. This means building organizational capacity to monitor outputs, understanding system limitations, and taking responsibility for outcomes at the leadership level.
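One way to picture this recommendation is as a simple throttle: automated work is released only up to the volume reviewers can check, so no unverified backlog builds up. The Python sketch below is illustrative only. The function names, the per-day figures, and the idea of a hard cap are assumptions made for the example, not something specified in the paper.

```python
def simulate_backlog(generated_per_day, verified_per_day, days):
    """Return the unverified backlog at the end of each day.

    Hypothetical model: every generated output must be verified once,
    and verification capacity is fixed per day.
    """
    backlog = 0
    history = []
    for _ in range(days):
        backlog += generated_per_day
        backlog -= min(backlog, verified_per_day)
        history.append(backlog)
    return history


def throttled_rate(generated_per_day, verified_per_day):
    """Cap generation at verification capacity so no backlog accumulates."""
    return min(generated_per_day, verified_per_day)


if __name__ == "__main__":
    # Illustrative numbers, not taken from the research.
    unthrottled = simulate_backlog(generated_per_day=1000, verified_per_day=200, days=5)
    capped = simulate_backlog(
        generated_per_day=throttled_rate(1000, 200), verified_per_day=200, days=5
    )
    print(unthrottled)  # [800, 1600, 2400, 3200, 4000]
    print(capped)       # [0, 0, 0, 0, 0]
```

Without the cap, the backlog of unchecked work grows by the same amount every day. With the cap, output is lower but everything released has been verified.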
For policymakers, the answer isn't restricting AI development but ensuring verification and safety mechanisms are built into deployment practices. For individual workers, the path forward involves moving from routine execution toward directing AI work and exercising judgment over results.
These findings were first reported by MIT Sloan and draw on research by Christian Catalini, Xiang Hui, and Jane Wu.
This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
