// AI Tangle

Anthropic Mythos is Now a Fable

Anthropic shipped the best model in the world on June 9. By June 12, it was disabled for everyone on Earth — silently degraded, pulled by Microsoft, then revoked by a US export-control order. Not once did the model fail. Here is how you build for the gap between capability and dependability.

Last Monday's edition called this the week the agent became the operating system — when Microsoft folded the model, the runtime, and the governance layer into Windows itself. Seven days later, Anthropic gave us the sharper version of the same lesson. On June 9, it released Claude Fable 5, which landed at number one on the Artificial Analysis Intelligence Index — the best model in the world. By the evening of June 12, it was gone, disabled for every customer on the planet by a US government export-control order. In between, it was silently degraded for one class of users and pulled from Microsoft's internal tooling over a data-retention clause. Three days, three ways to become unusable, and not once because the model failed. The lesson isn't that the frontier got more powerful. It's that capability and your ability to depend on it just came apart — and you have to build for that gap.

// The Big AI Story

The Best Model in the World Lasted Three Days

On June 9, Anthropic shipped Claude Fable 5 and it immediately topped the Artificial Analysis Intelligence Index. Fable is the public, guardrailed twin of Claude Mythos 5: the same weights and the same intelligence, with the cyber safeguards lifted only in Mythos, and only for vetted defenders. Anthropic was direct about the naming: Fable comes from the Latin fabula, "that which is told," akin to the Greek mythos, and the safeguards are the only thing that separates the two. Mythos, softened for the public, becomes Fable. Then the public version fell apart in three days, and the title of this edition proved an understatement.

Failure one, June 9: buried in the 319-page system card was a line that Fable would silently degrade itself for users working on frontier-LLM development, with no warning and no fallback, affecting an estimated 0.03 percent of traffic. Researchers caught it, the backlash was immediate, and Anthropic reversed the silent part on June 11 and apologized.

Failure two, the same day: because Mythos-class models require 30-day data retention that supersedes zero-retention enterprise agreements, Microsoft's lawyers pulled Fable from the model picker its employees use internally, pending review.

Failure three, June 12 at 5:21 p.m. ET: the US Commerce Department, in a letter from Secretary Howard Lutnick, ordered Anthropic to cut off Fable 5 and Mythos 5 for any foreign national, inside or outside the country, including its own non-citizen employees. The scope was so broad that Anthropic disabled both models for everyone on earth. Anthropic disputes the basis, saying the cited jailbreak surfaced only minor, already-known vulnerabilities that public models like GPT-5.5 can find too, but it complied anyway.

One further note: the 30-day data-retention requirement and the capability throttling that depends on the context in which the model is used were both features of the model itself, and both may simply have been overlooked at launch.

The lesson isn't that Anthropic shipped its most powerful model. It's that the best model in the world became unusable three times in four days, and not once because the model broke. Every failure lived in the layer around the model: policy, governance, regulation. Capability and dependability turned out to be different things you manage separately. For anyone running a frontier API in production, that is the whole game now — the model you depend on can change underneath you for reasons that have nothing to do with how good it is, and you can count on it happening again.

Read Anthropic's statement →

// The Number

271

Vulnerabilities Mozilla found and fixed in Firefox using a Mythos-class model during Project Glasswing, more than ten times what the same team caught one version earlier with Claude Opus 4.6. That kind of capability is why Fable shipped wrapped in classifiers, and why the government ended up treating it less like software than like something to put under export control.

Source: Anthropic

// 4 Quick Hits

1. Apple opens Siri to Claude and Gemini at WWDC 2026

In Tim Cook's last WWDC as CEO, Apple previewed an entirely rebuilt Siri AI and confirmed that iOS 27 will let users set Claude or Gemini as the model behind Apple Intelligence, with a new Foundation Models framework that lets any app call cloud models directly. The signal underneath the headline: on the most valuable consumer surface on earth, the frontier model just became a swappable default. Distribution is no longer something the model maker controls.

2. OpenAI confidentially files to go public

OpenAI announced on June 8 that it has submitted a confidential draft S-1 to the SEC, joining Anthropic, which filed a week earlier, and SpaceX in what bankers are calling the AI IPO summer. OpenAI is valued north of $850 billion. The signal underneath the headline: AI is finishing its move from research race to public-market sector. Once these companies report quarterly, the question stops being whose model is best and starts being whose margins are real.

3. ChatGPT's memory gets a brain transplant

OpenAI rolled out a rebuilt memory system that pushed factual recall in its internal evals from 41.5 percent in 2024 to 82.8 percent in 2026, doubled memory capacity for Plus and Pro users, and cut the compute cost enough to give free users persistent memory for the first time. The signal underneath the headline: memory is the stickiest moat in consumer AI. The better ChatGPT remembers you, the harder it is to leave — and every rival now has to answer it.

4. Generalist AI raises $400 million for physical AGI

Robotics startup Generalist AI secured $400 million on June 5 to advance "physical AGI," backed by Radical Ventures and Nvidia. While the chat-model labs fight over benchmarks, the money is quietly moving into AI that acts in the physical world. The signal underneath the headline: the next frontier after knowledge work is the warehouse, the lab bench, and the factory floor — and the capital is already there.

// 3 AI Tools

The Big Story's lesson is that the model layer can move underneath you without warning. These three tools are the resilience stack that turns that from an outage into a config change — the evals that catch a silent change, the gateway that lets you reroute, and the observability that tells you which layer actually failed.

Promptfoo: open-source framework for running evals against your live model on a schedule, across providers. Right pick when you want a silent degradation or a forced provider switch to show up as a failing test before it shows up in production. Wrong pick when you have no agreed definition of good output yet, because evals only catch what you can already describe.

LiteLLM: open-source gateway that routes across model providers with automatic fallback, retries, and unified billing behind one API. Right pick when you want a revoked or gated model to become a one-line config change instead of a multi-day rewrite. Wrong pick when you run a single model behind one key and a gateway is complexity you don't need yet.

Langfuse: open-source observability with full tracing and an audit trail across every model call. Right pick when you need to attribute a failure to the model, the prompt, or an unseen provider intervention rather than guess. Wrong pick when a managed single-vendor dashboard already covers your stack and you don't want to self-host.
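To make the gateway and audit-trail ideas concrete, here is a minimal sketch in plain Python. It does not use the LiteLLM or Langfuse APIs; the Router class, the ProviderUnavailable exception, and the two stand-in model functions are all hypothetical names invented for illustration. It only shows the shape of the pattern: try providers in priority order, and record which one refused and why.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


class ProviderUnavailable(Exception):
    """Raised by a provider callable when the model is revoked, gated, or down."""


@dataclass
class Router:
    # Ordered (name, callable) pairs; the order is the fallback priority.
    providers: List[Tuple[str, Callable[[str], str]]]
    audit_log: List[dict] = field(default_factory=list)

    def complete(self, prompt: str) -> str:
        for name, call in self.providers:
            try:
                answer = call(prompt)
            except ProviderUnavailable as exc:
                # Record which provider refused, so the failure is attributable later.
                self.audit_log.append({"provider": name, "ok": False, "reason": str(exc)})
                continue
            self.audit_log.append({"provider": name, "ok": True, "reason": ""})
            return answer
        raise RuntimeError("all providers failed")


# Hypothetical stand-ins for real provider clients (assumptions, not real APIs).
def revoked_model(prompt: str) -> str:
    raise ProviderUnavailable("access revoked by policy")


def backup_model(prompt: str) -> str:
    return f"backup answer to: {prompt}"


router = Router(providers=[("primary", revoked_model), ("backup", backup_model)])
result = router.complete("summarize the incident")
```

In this sketch the caller never sees the revocation. The request lands on the backup, and the audit log keeps one entry for the refusal and one for the successful call, which is the minimum you need to tell a policy failure from a model failure afterward.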

// The Extra Read

Fable 5 became unusable three different ways in four days. This will happen again, and here is how you build for that.

Laurie Voss, Head of Developer Relations at Arize · June 13, 2026 · 7 min

This is the builder's manual for the Big Story. Voss walks through all three failures — the silent degradation, the Microsoft pull, the government revocation — and argues that since none of them were the model failing, none of them can be fixed by picking a better model. His prescription is concrete: continuous cross-provider evals to detect silent change, a provider-agnostic gateway so a revoked model is a reroute and not a rewrite, and an audit trail good enough to tell which layer broke. Read it before you ship anything that leans on a single frontier API.
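The first part of that prescription, continuous cross-provider evals, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not Voss's implementation: the golden set, the tolerance value, and the two stand-in model functions are all hypothetical, and a real setup would run this on a schedule against live provider clients.

```python
from typing import Callable, Dict, List, Tuple

# A tiny golden set: (prompt, predicate the answer must satisfy).
GOLDEN_SET: List[Tuple[str, Callable[[str], bool]]] = [
    ("What is 2 + 2?", lambda a: "4" in a),
    ("Name the capital of France.", lambda a: "paris" in a.lower()),
]


def pass_rate(model: Callable[[str], str]) -> float:
    """Fraction of golden-set prompts whose answer satisfies its predicate."""
    passed = sum(1 for prompt, check in GOLDEN_SET if check(model(prompt)))
    return passed / len(GOLDEN_SET)


def detect_drift(
    models: Dict[str, Callable[[str], str]],
    baseline: Dict[str, float],
    tolerance: float = 0.1,
) -> List[str]:
    """Return names of models whose pass rate fell more than `tolerance` below baseline."""
    return [
        name
        for name, model in models.items()
        if baseline[name] - pass_rate(model) > tolerance
    ]


# Hypothetical stand-ins: one healthy model, one quietly degraded.
def healthy(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "Paris"


def degraded(prompt: str) -> str:
    return "I can't help with that."


flagged = detect_drift(
    {"provider_a": healthy, "provider_b": degraded},
    baseline={"provider_a": 1.0, "provider_b": 1.0},
)
```

Here only provider_b is flagged, because its pass rate dropped from the recorded baseline while provider_a held steady. The limitation named under Promptfoo applies: a check like this catches only the behavior your golden set already describes.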

Last Monday the platforms swallowed the agent. This Monday the best model in the world blinked out for everyone, and the reason had nothing to do with the model. The open question for the rest of the year isn't which lab holds the top benchmark score. It's whether you have built the evals, the gateway, and the audit trail that let you survive the next time a model moves underneath you, because it will.

Your AI Sherpa,

Mark R. Hinkle
Founding Publisher, The AIE Network