
Coinbase Cuts AI Costs in Half While Token Usage Hits Record High

CEO Brian Armstrong details five strategies the crypto exchange uses to maximize AI productivity without runaway spending.

Omega Editorial· June 29, 2026· 3 min read

Coinbase has cut its AI spending to roughly half its peak level while token usage has climbed to record highs, according to CEO Brian Armstrong.

In a Friday post on X, Armstrong outlined a five-part strategy the crypto exchange is using to control costs without throttling engineers' AI adoption — a balance many companies have struggled to strike as generative AI tools proliferate across the enterprise.

Experimenting with Chinese models as defaults

The cornerstone of Coinbase's approach involves shifting away from expensive frontier models as the default choice. Armstrong said the company is experimenting with open-weight Chinese LLMs — specifically GLM 5.2 from Z.ai and Kimi 2.7 from Moonshot AI — through its internal LLM gateway. These models cost significantly less than comparable offerings from Anthropic and OpenAI, though engineers remain free to select more powerful models when tasks demand them.

The second strategy centers on intelligent routing: matching each prompt to a model based on its complexity. Armstrong noted that frontier models excel at planning tasks but are overkill for execution. Eventually, he envisions AI systems automatically selecting the optimal model for each query, removing that burden from human users.
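To make the routing idea concrete, here is a minimal sketch of how a gateway might default to a cheaper model and escalate only for planning-style prompts. Armstrong's post does not describe how Coinbase's gateway actually decides, so everything here is an assumption for illustration: the tier names, the keyword list, and the `choose_model` function are hypothetical.

```python
import re

# Hypothetical tier labels; placeholders, not Coinbase's actual configuration.
CHEAP_MODEL = "low-cost-default"
FRONTIER_MODEL = "frontier"

# Assumed signal words for planning-style work. A real router would likely
# use a richer measure of complexity than keyword matching.
PLANNING_WORDS = {"plan", "design", "architecture", "strategy", "roadmap"}


def choose_model(prompt, override=None):
    """Return the model tier for a prompt.

    An explicit override always wins, mirroring the article's point that
    engineers remain free to pick a more powerful model.
    """
    if override is not None:
        return override
    # Match whole words so that "airplane" does not count as "plan".
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & PLANNING_WORDS:
        return FRONTIER_MODEL
    return CHEAP_MODEL
```

Under these assumptions, `choose_model("Plan the database migration in phases")` returns the frontier tier, while `choose_model("Rename this variable everywhere")` stays on the low-cost default.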

The remaining tactics focus on technical efficiency. Coinbase employs aggressive caching to reduce redundant inference costs, encourages engineers to start fresh sessions when switching contexts to keep prompts lean, and has built comprehensive visibility tools that let every employee track their token consumption in real time.
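The caching and visibility tactics can be sketched together. The post does not say what form of caching Coinbase uses, so this example uses exact-match response caching, the simplest variant, plus a per-user token counter. The `CachingGateway` class and the `call_model` callable are hypothetical names, not part of any real gateway.

```python
import hashlib
from collections import defaultdict


class CachingGateway:
    """Illustrative gateway: exact-match cache plus per-user token totals."""

    def __init__(self, call_model):
        # call_model is an assumed callable: (model, prompt) -> (text, tokens_used)
        self._call_model = call_model
        self._cache = {}
        self.cache_hits = 0
        self.tokens_by_user = defaultdict(int)

    def complete(self, user, model, prompt):
        # Key on both model and prompt so different models never share answers.
        raw = f"{model}\x00{prompt}".encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key in self._cache:
            # A repeated request costs no new inference and no new tokens.
            self.cache_hits += 1
            return self._cache[key]
        text, tokens = self._call_model(model, prompt)
        self._cache[key] = text
        # Only real inference is charged to the requesting user.
        self.tokens_by_user[user] += tokens
        return text
```

A dashboard could read `tokens_by_user` to show each employee their own consumption. Charging nothing on a cache hit is a design choice made for this sketch, not something the post specifies.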

Why it matters

Coinbase's approach represents a middle path between two extremes that have defined enterprise AI adoption. Earlier this year, "tokenmaxxing" — encouraging unlimited AI usage — briefly gained traction before companies realized the budget implications. Many organizations then swung to hard usage caps. Coinbase's model suggests a third option: transparency and optimization rather than restriction. The company expects higher AI spending from employees who deliver proportionally greater impact, creating accountability without artificial limits.

The timing is notable. Coinbase laid off 14% of its workforce less than two months ago, with Armstrong explicitly citing AI's role in changing productivity expectations. In May, he wrote that engineers now "ship in days what used to take a team weeks." The cost-control measures appear designed to sustain that acceleration without ballooning budgets as headcount shrinks.

Armstrong included a graph showing token usage climbing to historic highs while spending dropped approximately 50%, though he did not specify the exact timeframe. "The goal isn't to suppress usage," he wrote. "It's to build the infrastructure that makes exponential growth sustainable."

The details were first reported by Business Insider.


This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
