
Model Routing Emerges as Companies Rein In AI Spending

CFOs are pushing back on expensive frontier models for every task, threatening the revenue assumptions behind OpenAI and Anthropic valuations.

Omega Editorial · June 5, 2026 · 3 min read

A fundamental shift in how enterprises deploy artificial intelligence is underway, driven by CFOs and boards alarmed at AI spending that has outpaced budgets. The solution gaining traction could significantly alter the economics of the AI industry.

For two years, the default approach has been routing all queries to the most powerful—and expensive—AI models regardless of task complexity. That's changing fast as companies discover they're paying premium prices for routine work that cheaper alternatives handle just as well.

The routing revolution

Model routing matches tasks to appropriate AI models based on complexity. Simple queries go to faster, less expensive options, while genuinely difficult problems get sent to frontier models from companies like OpenAI and Anthropic.
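As a rough illustration of the idea, a router can be sketched in a few lines of Python. Everything here is an assumption made for the example: the tier names, the prices, the keyword list and the 0.5 threshold are hypothetical, and production routers typically rely on trained classifiers or evaluation data rather than a keyword heuristic.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # placeholder label, not a real product name
    cost_per_1k_tokens: float  # made-up price in USD, for illustration only


CHEAP = ModelTier("small-model", 0.0005)
FRONTIER = ModelTier("frontier-model", 0.015)

# Hypothetical keywords that hint at harder work.
HARD_HINTS = {"prove", "refactor", "debug", "architecture", "optimize"}


def estimate_complexity(prompt: str) -> float:
    """Return a crude score in [0, 1]; longer prompts and hint words score higher."""
    words = re.findall(r"[a-z]+", prompt.lower())
    length_score = min(len(words) / 200, 1.0)
    hint_score = 1.0 if HARD_HINTS.intersection(words) else 0.0
    return max(length_score, hint_score)


def route(prompt: str, threshold: float = 0.5) -> ModelTier:
    """Send prompts at or above the threshold to the frontier tier, the rest to the cheap tier."""
    return FRONTIER if estimate_complexity(prompt) >= threshold else CHEAP


print(route("Who was the third U.S. president?").name)                    # small-model
print(route("Refactor this service and debug the race condition.").name)  # frontier-model
```

The design choice that matters is the threshold: set it too low and routine work still lands on the expensive tier, set it too high and hard problems get weak answers.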

Scott Wu, CEO of Cognition, which produces the Devin coding agent, told CNBC that companies can achieve five to ten times better cost efficiency on routine work by using adequate rather than premium models. The example he offered: asking any model to name the third U.S. president will yield "Thomas Jefferson," regardless of whether that query costs pennies or dollars.

Glean CEO Arvind Jain estimates that roughly 95 percent of enterprise AI usage currently runs on the most expensive frontier models, even for tasks that cheaper alternatives could easily handle.
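To see how those two figures might combine, the sketch below computes a blended bill after some share of spending moves to cheaper models. The function name and the 60 percent routable share are assumptions for illustration only; neither executive gave a figure for how much work is actually routable. The efficiency factors of 5 and 10 come from Wu's range.

```python
def blended_cost(baseline: float, routable_share: float, efficiency: float) -> float:
    """Cost after moving `routable_share` of spend to models `efficiency` times cheaper."""
    if not 0.0 <= routable_share <= 1.0:
        raise ValueError("routable_share must be between 0 and 1")
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    kept_on_frontier = 1.0 - routable_share
    moved_to_cheaper = routable_share / efficiency
    return baseline * (kept_on_frontier + moved_to_cheaper)


BASELINE = 100.0       # index the current bill to 100
ROUTABLE_SHARE = 0.6   # hypothetical: 60% of spend is routine work

for factor in (5, 10):
    cost = blended_cost(BASELINE, ROUTABLE_SHARE, factor)
    print(f"{factor}x cheaper on routine work -> bill of about {cost:.0f}")
```

Under these assumed inputs the bill falls to roughly half its starting level, and most of the saving arrives at the 5x mark; going from 5x to 10x adds comparatively little because the remaining frontier share dominates.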

The cost crisis driving change

The pressure stems from AI expenses that have surprised even major technology companies. Jeetu Patel, Cisco's chief product officer, outlined the mathematics: at approximately $200 in token usage per employee each week, annual costs reach roughly $10,000 per person. Across Cisco's 90,000 employees, that comes to about $900 million a year.
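Patel's back-of-the-envelope math can be checked directly. The sketch below uses only the figures quoted above, plus one assumption: spending runs all 52 weeks of the year. The variable names are illustrative.

```python
WEEKLY_TOKEN_SPEND_USD = 200   # per employee, as quoted
WEEKS_PER_YEAR = 52            # assumption: spending continues every week
EMPLOYEES = 90_000             # as quoted

annual_per_employee = WEEKLY_TOKEN_SPEND_USD * WEEKS_PER_YEAR
rounded_per_employee = 10_000  # the rounded figure cited in the article

print(f"Per employee, unrounded: ${annual_per_employee:,}")                  # $10,400
print(f"Company-wide, rounded:   ${rounded_per_employee * EMPLOYEES:,}")     # $900,000,000
print(f"Company-wide, unrounded: ${annual_per_employee * EMPLOYEES:,}")      # $936,000,000
```

The $900 million figure follows from the rounded $10,000 per person; without rounding, the same inputs give a somewhat higher total.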

Cisco exceeded its own AI budget and has been forced to reallocate resources, with 30,000 engineers now building products largely with AI assistance. The company has prioritized token spending over other budget categories.

Vendor response and market implications

AI companies are responding to customer anxiety. Cognition announced an "AI productivity guarantee" backing its service with up to $10 million in funded usage if Devin fails to deliver promised engineering value. Wu emphasized that the metric should be actual human hours saved, not activity measures like tokens consumed or lines of code generated.

Why it matters

If enterprises systematically route high-volume, low-complexity work to cheaper models—including open-source alternatives—OpenAI and Anthropic lose revenue from the bulk of queries and retain only complex tasks. Both companies have built their businesses and IPO expectations on assumptions of enormous demand at premium pricing. While frontier models will still command premiums for difficult work, the proportion of the market requiring that capability versus routine tasks will substantially determine these companies' valuations. Pricing power is shifting from AI sellers to buyers.

Patel doesn't believe this development will sink frontier labs, but expects the pricing model to evolve toward greater efficiency rather than simply higher charges. The question is no longer whether companies will continue spending as AI bills climb, but how they'll spend more strategically.

These details were first reported by CNBC.


This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
