Model Routing Emerges as Companies Rein In AI Spending
CFOs are pushing back on expensive frontier models for every task, threatening the revenue assumptions behind OpenAI and Anthropic valuations.

A fundamental shift in how enterprises deploy artificial intelligence is underway, driven by CFOs and boards alarmed at AI spending that has outpaced budgets. The solution gaining traction could significantly alter the economics of the AI industry.
For two years, the default approach has been routing all queries to the most powerful—and expensive—AI models regardless of task complexity. That's changing fast as companies discover they're paying premium prices for routine work that cheaper alternatives handle just as well.
The routing revolution
Model routing matches tasks to appropriate AI models based on complexity. Simple queries go to faster, less expensive options, while genuinely difficult problems get sent to frontier models from companies like OpenAI and Anthropic.
Scott Wu, CEO of Cognition, which produces the Devin coding agent, told CNBC that companies can achieve five to ten times better cost efficiency on routine work by using adequate rather than premium models. The example he offered: asking any model to name the third U.S. president will yield "Thomas Jefferson," regardless of whether that query costs pennies or dollars.
Glean CEO Arvind Jain estimates that roughly 95 percent of enterprise AI usage currently runs on the most expensive frontier models, even for tasks that cheaper alternatives could easily handle.
The cost crisis driving change
The pressure stems from AI expenses that have surprised even major technology companies. Jeetu Patel, Cisco's chief product officer, outlined the mathematics: at approximately $200 in token usage per employee weekly, annual costs reach $10,000 per person. For Cisco's 90,000 employees, that totals $900 million annually.
Cisco exceeded its own AI budget and has been forced to reallocate resources, with 30,000 engineers now building products largely with AI assistance. The company has prioritized token spending over other budget categories.
Vendor response and market implications
AI companies are responding to customer anxiety. Cognition announced an "AI productivity guarantee" backing its service with up to $10 million in funded usage if Devin fails to deliver promised engineering value. Wu emphasized that the metric should be actual human hours saved, not activity measures like tokens consumed or code lines generated.
Why it matters
If enterprises systematically route high-volume, low-complexity work to cheaper models—including open-source alternatives—OpenAI and Anthropic lose revenue from the bulk of queries and retain only complex tasks. Both companies have built their businesses and IPO expectations on assumptions of enormous demand at premium pricing. While frontier models will still command premiums for difficult work, the proportion of the market requiring that capability versus routine tasks will substantially determine these companies' valuations. Pricing power is shifting from AI sellers to buyers.
Patel doesn't believe this development will sink frontier labs, but expects the pricing model to evolve toward greater efficiency rather than simply higher charges. The question is no longer whether companies will continue spending as AI bills climb, but how they'll spend more strategically.
These details were first reported by CNBC.
This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
Want systems like this working for your business?
Book a Call
