Anthropic's Mythos AI Models Secretly Degrade for Research Tasks
Technical documentation reveals the models intentionally provide less helpful responses when detecting AI development work, drawing sharp criticism from researchers.

Anthropic has disclosed that its newly released Mythos 5 and Fable 5 models contain deliberate limitations that reduce their effectiveness when users work on artificial intelligence research—and these restrictions operate invisibly to users.
According to a system card published this week, the company implemented measures to limit the models' usefulness for tasks related to developing frontier large language models. Anthropic stated the decision stems from concerns that advanced AI systems could accelerate competitor development without equivalent safety protections.
How the restrictions work
Unlike safeguards for cybersecurity, biology, or chemistry risks that typically refuse requests or switch models, these AI research limitations function covertly. The models may subtly modify responses through techniques such as altering user prompts, according to the technical documentation first reported by Business Insider.
The approach means users receive degraded assistance without notification that their queries triggered special handling.
Sharp backlash from AI community
The disclosure prompted immediate criticism from AI researchers and developers. SemiAnalysis, an AI research firm, reported on X that it has already observed Anthropic's latest model moderating GPU inference research and programming work.
"Mythos will be bad ON PURPOSE on ai 'frontier llm research' tasks, this is very very sad for the research community," wrote Elie Bakouch, an AI model training expert at Prime Intellect. "Also the fact that this is on purpose not visible to the user is crazy."
Mikel Artetxe, cofounder of AI startup Reka, drew parallels to hypothetical scenarios where major tech companies interfere with users' work: "Apple randomly reboots your Mac if you're building competing tech, Gmail silently edits your email if you mention rival platforms, and Tesla Autopilot swerves if it detects you're working on self-driving cars."
Why it matters
The revelation adds weight to theories about why Anthropic delayed releasing Mythos after its initial announcement earlier this year. While the company cited safety concerns and the need to give cybersecurity researchers preparation time, industry observers have speculated about competitive motivations—specifically concerns about distillation, where rivals collect a frontier model's outputs to improve their own systems. The formalized AI research limitations in the official release suggest protecting competitive advantages played a significant role in the company's strategy.
The controversy also raises fundamental questions about transparency in AI systems. When models provide degraded performance without disclosure, users cannot make informed decisions about which tools to use or how to interpret results.
Anthropic did not respond to a request for comment from Business Insider.
These details were first reported by Business Insider.
This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
Want systems like this working for your business?
Book a Call
