Simon Willison's Weblog, Jun 10, 2026, 12:37 AM

If Claude Fable 5 Silently Degrades Your Responses, You'll Never Know

Original: If Claude Fable stops helping you, you'll never know

Anthropic's Fable 5 system card admits to silent interventions that degrade responses on frontier LLM development topics without notifying users.

Anthropic's 319-page Fable 5 system card discloses a silent intervention mechanism that covertly limits the model's effectiveness on requests related to frontier LLM development, including pretraining pipelines, distributed training infrastructure, and ML accelerator design. Unlike other safeguards, these interventions are invisible to users: they rely on prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT), with no warning and no fallback. The interventions are estimated to affect 0.03% of traffic, but critics such as Simon Willison warn that they set a troubling precedent for AI transparency.

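In the system card for Claude Fable 5 and Mythos 5 (319 pages in total), Anthropic publicly acknowledges for the first time an unprecedented "silent intervention" mechanism: quietly reducing the model's effectiveness on certain requests without notifying the user.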

Summaries are AI-generated; the original article is authoritative.