Coinbase Cuts AI Costs in Half While Token Usage Hits Record High

CEO Brian Armstrong details five strategies the crypto exchange uses to maximize AI productivity without runaway spending.

Omega Editorial · June 29, 2026

Coinbase has cut its AI spending to roughly half its peak level while token usage has climbed to record highs, according to CEO Brian Armstrong.

In a Friday post on X, Armstrong outlined a five-part strategy the crypto exchange is using to control costs without throttling engineers' AI adoption — a balance many companies have struggled to strike as generative AI tools proliferate across the enterprise.

Experimenting with Chinese models as defaults

The cornerstone of Coinbase's approach involves shifting away from expensive frontier models as the default choice. Armstrong said the company is experimenting with open-weight Chinese LLMs — specifically GLM 5.2 from Z.ai and Kimi 2.7 from Moonshot AI — through its internal LLM gateway. These models cost significantly less than comparable offerings from Anthropic and OpenAI, though engineers remain free to select more powerful models when tasks demand them.

The second strategy centers on intelligent routing: matching each prompt to a model based on the task's complexity. Armstrong noted that frontier models excel at planning but are overkill for execution. Eventually, he envisions AI systems selecting the optimal model for each query automatically, taking that decision off human users.
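To make the routing idea concrete, here is a minimal Python sketch of a complexity-based router with a cheap default and a manual override. It is an illustration only, not Coinbase's gateway: the tier names, the placeholder prices, the keyword list, and the `choose_tier` function are all assumptions introduced here, and a production router would likely use a far better signal than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    """A hypothetical model tier; names and prices are placeholders."""
    name: str
    cost_per_million_tokens: float  # illustrative value, not a real price


# Assumed tiers: a cheap default and an expensive frontier option.
DEFAULT_TIER = ModelTier("open-weight-default", 1.0)
FRONTIER_TIER = ModelTier("frontier", 10.0)

# Crude stand-in for "this looks like a planning task".
PLANNING_HINTS = ("plan", "design", "architecture", "trade-off", "strategy")


def choose_tier(prompt: str, force_frontier: bool = False) -> ModelTier:
    """Pick the cheap tier unless the prompt looks like planning work.

    force_frontier models the engineer's freedom to opt into a more
    powerful model when the task demands it.
    """
    if force_frontier:
        return FRONTIER_TIER
    lowered = prompt.lower()
    if any(hint in lowered for hint in PLANNING_HINTS):
        return FRONTIER_TIER
    return DEFAULT_TIER
```

In this sketch the inexpensive tier is the default and the frontier tier is the exception, which mirrors the shift in defaults Armstrong described.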

The remaining tactics focus on technical efficiency. Coinbase employs aggressive caching to reduce redundant inference costs, encourages engineers to start fresh sessions when switching contexts to keep prompts lean, and has built comprehensive visibility tools that let every employee track their token consumption in real time.
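Caching and per-user visibility can sit in the same gateway layer. The sketch below shows one possible form, assuming an exact-match response cache keyed on model and prompt plus a running token count per user. Armstrong's post does not say which kind of caching Coinbase uses, so the `CachedGateway` class, its `call_model` callback, and the token accounting are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Callable, Dict, Tuple


class CachedGateway:
    """Hypothetical gateway wrapper: caches responses and tallies tokens."""

    def __init__(self, call_model: Callable[[str, str], Tuple[str, int]]):
        # call_model(model, prompt) -> (response_text, tokens_used); assumed shape.
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.tokens_by_user: Dict[str, int] = defaultdict(int)

    def complete(self, user: str, model: str, prompt: str) -> str:
        # Identical (model, prompt) pairs map to the same cache key.
        key = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
        if key in self._cache:
            # Cache hit: no inference call, so no tokens are billed to the user.
            return self._cache[key]
        text, tokens_used = self._call_model(model, prompt)
        self._cache[key] = text
        self.tokens_by_user[user] += tokens_used
        return text
```

A dashboard could read `tokens_by_user` to show each employee their own consumption. Starting a fresh session when switching contexts helps in a related way, because a shorter prompt means fewer input tokens on every call.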

Why it matters

Coinbase's approach represents a middle path between two extremes that have defined enterprise AI adoption. Earlier this year, "tokenmaxxing" — encouraging unlimited AI usage — briefly gained traction before companies realized the budget implications. Many organizations then swung to hard usage caps. Coinbase's model suggests a third option: transparency and optimization rather than restriction. The company expects higher AI spending from employees who deliver proportionally greater impact, creating accountability without artificial limits.

The timing is notable. Coinbase laid off 14% of its workforce less than two months ago, with Armstrong explicitly citing AI's role in changing productivity expectations. In May, he wrote that engineers now "ship in days what used to take a team weeks." The cost-control measures appear designed to sustain that acceleration without ballooning budgets as headcount shrinks.

Armstrong included a graph showing token usage climbing to historic highs while spending dropped approximately 50%, though he did not specify the exact timeframe. "The goal isn't to suppress usage," he wrote. "It's to build the infrastructure that makes exponential growth sustainable."

The details were first reported by Business Insider.

This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
