Claude Fable 5: What You Actually Get When You Turn It On

Sam Moore · June 10, 2026 · Senior Software Engineer

A developer's guide to where Fable 5 wins, where it falls back to Opus 4.8, what it costs, and when to actually switch. It might be the best coding model anyone has shipped; what you get also differs from the launch pitch in three ways that decide whether it's worth it for you.

A developer on the $100/month Max plan asked Claude Fable 5 to review some uncommitted code. Eight minutes later, the limit meter was empty, with a four-and-a-half-hour cooldown before it would reset. Simon Willison burned $82.92 in a single day of API testing. Meanwhile a long-time Opus critic on Reddit tried Fable 5 and wrote: "This model is incredible. But I will never pay for credits outside of a sub."

That last sentence is the whole launch in miniature. Anthropic shipped Claude Fable 5 on June 9, 2026, and within hours the conversation split clean in two. The capability is real and most people agree on it. The argument is about everything Anthropic wrapped around it: the price, the quota burn, a retention mandate strong enough that a benchmark org refused to test the model, and an "it works for days" pitch whose best evidence belongs to a version you can't access.

This post merges two things that usually get written separately: what Fable 5 can do, and what actually happens when developers use it. By the end you'll know what the model really is, where the marketing outruns the product, and the specific situations where switching to it pays off.

The two-model split is the entire story

Start here, because nothing else makes sense without it.

Anthropic shipped two models cut from the same weights. Fable 5 is the public one. Mythos 5 is the same underlying model with its safety guardrails partly removed, handed only to a small set of cyberdefenders, government partners, and vetted researchers through a program called Project Glasswing. Anthropic calls this whole tier "Mythos-class," sitting above the Claude 4.x family (Opus 4.8, Sonnet 4.6, Haiku 4.5).

The public model isn't a weaker, distilled version. It's the full model with a bouncer at the door. Three classifiers watch every request: one for cybersecurity, one for biology and chemistry, one for distillation (attempts to copy the model's behavior to train a competitor). Trip any of them and the request gets handed to Opus 4.8 instead, and you're told it happened. For the 95%-plus of sessions that never trip a classifier, Anthropic says Fable 5 performs the same as Mythos 5. Take that claim seriously, because it's the honest baseline: for ordinary software engineering, reasoning, vision, and knowledge work, the model you get is the frontier model, full stop.
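The routing described above can be pictured as a small dispatch function. This is a toy sketch, not Anthropic's implementation: the model identifier strings, the `route` function, and the keyword checks are all hypothetical stand-ins (the real classifiers are presumably learned models, not keyword filters). It only illustrates the shape of the rule: trip any one classifier and the request goes to the fallback model, with the user told about it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical identifiers, used only for illustration.
FRONTIER = "claude-fable-5"
FALLBACK = "claude-opus-4.8"

# Toy stand-ins for the three classifiers named in the text.
CLASSIFIERS: Dict[str, Callable[[str], bool]] = {
    "cybersecurity": lambda text: "exploit" in text.lower(),
    "biology_chemistry": lambda text: "pathogen" in text.lower(),
    "distillation": lambda text: "imitate your outputs" in text.lower(),
}


@dataclass(frozen=True)
class Route:
    model: str                 # model that will serve the request
    tripped: Tuple[str, ...]   # names of the classifiers that fired
    user_notified: bool        # the visible fallback is labeled for the user


def route(request: str) -> Route:
    """Send the request to the fallback model if any classifier fires."""
    tripped = tuple(name for name, fires in CLASSIFIERS.items() if fires(request))
    if tripped:
        return Route(FALLBACK, tripped, user_notified=True)
    return Route(FRONTIER, (), user_notified=False)


if __name__ == "__main__":
    print(route("Refactor this billing module"))
    print(route("Write an exploit for this driver bug"))
```

The point of the sketch is the asymmetry: ordinary requests never see the fence, while a single classifier hit changes which model answers.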

So the gaps in this post aren't about a secretly weaker model. They're about the edges: the specific domains the classifiers fence off, and the distance between the launch demos and a normal workday. Hold onto that fence. It explains the benchmarks, the controversy, and most of what follows.

The capability is real, and the skeptics are the proof

It would be easy to wave off the benchmark numbers as marketing. The reason you can't is that the praise came loudest from people who were trying to be unimpressed.

On SWE-Bench Pro, the harder variant of the standard agentic-coding benchmark, Fable 5 lands well ahead of the field:

Model	SWE-Bench Pro
Claude Fable 5	80.3%
Claude Opus 4.8	69.2%
GPT-5.5	58.6%
Gemini 3.1 Pro	54.2%

An 11-point jump over Opus 4.8, which was itself the model to beat a few weeks ago. But the benchmark isn't what convinced people. The hands-on reports did. A pre-launch tester said Fable 5 hit better results with about half the tokens of Opus 4.8 in their internal agentic harness, producing "targeted and surgical diffs." Another watched it cut memory allocations in a hot path by 46x and find bugs that both Opus 4.8 and GPT-5.5 had created, then wrote the line that stuck: "this is the first model that feels like it's coming for my job."

The most rigorous independent number comes from Every, which tested Fable 5 for a week before launch and scored it 91 out of 100 on their Senior Engineer benchmark, against 63 for Opus 4.8 and 62 for GPT-5.5. They called it the first model to land in human senior-engineer range on that rubric. Cursor turned it on by default and reported a fresh CursorBench high of 72.9%, eight points above its previous best.

The wins aren't only in code. Fable 5 is the new state of the art on vision (it can rebuild a web app's source from a single screenshot) and it tops Hebbia's finance benchmark for senior-level reasoning. Both of those are public capabilities, not fenced behind a classifier. That distinction turns out to be the whole game.

So the capability is settled. The rest of this post is about what you do and don't get when you turn it on yourself.

Gap 1: the benchmark headlines belong to a model you can't use

Look back at that scorecard, then look at the rows nobody put in it. The fuller picture, with the catch marked:

Benchmark	Fable / Mythos 5	Opus 4.8	GPT-5.5
SWE-Bench Pro	80.3	69.2	58.6
FrontierCode (Diamond)	29.3	13.4	5.7
OSWorld-Verified	85.0	83.4	78.7
ExploitBench	78.0 *	40.0	34.0
HealthBench Professional	66.0 *	56.9	51.8
Humanity's Last Exam (tools)	64.5 *	57.9	52.2

Figures from Anthropic's launch materials, cross-checked against independent roundups (Weights & Biases, DigitalApplied) and Artificial Analysis, which placed Fable 5 first on its Intelligence Index. Starred rows (*) report Mythos 5 scores; on those domains the public Fable 5 falls back to Opus 4.8, so its real-world result is the Opus 4.8 column.

The starred rows are the gap. The exploit, health, and frontier-knowledge scores are Mythos 5 results. Because the public Fable 5 routes those exact domains to Opus 4.8, your real-world performance there is Opus 4.8, not the headline. The wins you actually keep are the unstarred ones: coding and agentic work.
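To make the starred-row rule concrete, here is a small sketch that turns the table above into the score a public user should expect. The numbers are copied from the table; the function names and the `mythos_only` flag are illustrative, and the sketch assumes, as the text does, that a fenced domain simply inherits the Opus 4.8 score.

```python
# (headline score, starred as Mythos-only?, Opus 4.8 score), from the table above.
BENCHMARKS = {
    "SWE-Bench Pro": (80.3, False, 69.2),
    "FrontierCode (Diamond)": (29.3, False, 13.4),
    "OSWorld-Verified": (85.0, False, 83.4),
    "ExploitBench": (78.0, True, 40.0),
    "HealthBench Professional": (66.0, True, 56.9),
    "Humanity's Last Exam (tools)": (64.5, True, 57.9),
}


def effective_public_score(name: str) -> float:
    """Score a public Fable 5 user should expect on this benchmark."""
    headline, mythos_only, opus = BENCHMARKS[name]
    # Starred rows fall back to Opus 4.8, so the Opus column is the real number.
    return opus if mythos_only else headline


def headline_gap(name: str) -> float:
    """Points between the launch headline and what the public model delivers."""
    headline, _, _ = BENCHMARKS[name]
    return round(headline - effective_public_score(name), 1)


if __name__ == "__main__":
    for name in BENCHMARKS:
        print(f"{name}: effective {effective_public_score(name)}, gap {headline_gap(name)}")
```

On the unstarred rows the gap is zero; on ExploitBench it is 38 points.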

This gap gets wider the more impressive the claim. The launch's most striking results are all Mythos 5, all in domains Fable 5 fences off:

Drug design, roughly 10x faster. Anthropic's protein-design team says Mythos 5 sped up parts of their process tenfold, matching or beating skilled human operators with no human help. Of 14 protein targets, 9 produced strong drug candidates they're now pursuing.
Biology hypotheses preferred ~80% of the time in blinded comparisons against Opus-class models. One hypothesis about an E. coli protein was later corroborated by an independent lab working the same problem. That's a model proposing real, checkable science and being right.
A week of autonomous genomics: single-cell data across 138 species, a custom model trained with only high-level steering, beating a recent Science-published model at one-hundredth the size.

A developer on a Pro plan gets none of them. They live behind the biology and cyber classifiers: ask the public Fable 5 for that work and it hands you off to Opus 4.8, not to Mythos 5. The most dramatic frontier results belong to Mythos-only domains a public user can't reach. The coding frontier you can.

Gap 2: "works for days" is the pitch; "hours, supervised" is the product

Anthropic's Claude Code lead posted launch day's biggest thesis: "a third era quietly started today, moving from giving AI tasks to giving it responsibilities." It's a compelling line, and the model is built for long-horizon work. It runs a 1M-token context window, emits up to 128k output tokens per request, keeps notes for itself across a task, and supports a native memory tool so an agent can carry knowledge across sessions.

The pitch ran ahead of the evidence, and it shrank fast under scrutiny.

The flagship anecdote was Stripe: a migration on a 50-million-line Ruby codebase, done in a day, that would have taken a team more than two months by hand. Within hours, some developers questioned the comparison on three fronts. It was one migration inside a 50M-line codebase, not a migration of 50M lines. The "two months by hand" baseline assumes a team working without tooling, which no team does in 2026. And the methodology question, how the day-long output was actually verified, sat unanswered in a 370-comment thread. None of that proves the result wrong. It does mean the headline version is doing more work than the evidence behind it.

The only verbatim multi-day claim in the announcement ("novel genomics research over a week of largely autonomous work") belongs to Mythos 5, the model you can't use. Every's week-long independent test, the friendliest serious audit available, measured hours-scale supervised tasks and added a plain caveat: slow, expensive, best for high-level agentic builders who know how to direct it. The honest version of the pitch is "works for hours, with a human watching," which is a real step up and still not what the headlines implied.

The useful nuance for builders: the autonomy is real but it's a supervised autonomy, and it shines when you give it persistent memory. Anthropic's own example is a deck-builder game where file-based memory helped Fable 5 three times more than it helped Opus 4.8. That's the actionable read. Wire up memory and treat it like a junior engineer you check on, not a contractor you leave alone for a week.
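For readers who want a feel for what "file-based memory" can mean in practice, here is a minimal sketch of a notes file an agent harness could read at the start of a session and append to as it works. It is not Anthropic's memory tool and not the setup from the deck-builder example: the `FileMemory` class, the JSON layout, and the file name are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import List


class FileMemory:
    """Hypothetical notes store persisted as JSON so later sessions can reload it."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def load(self) -> List[dict]:
        """Return all saved notes, or an empty list if nothing was saved yet."""
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text(encoding="utf-8"))

    def remember(self, topic: str, note: str) -> None:
        """Append one note and write the whole file back to disk."""
        notes = self.load()
        notes.append({"topic": topic, "note": note})
        self.path.write_text(json.dumps(notes, indent=2), encoding="utf-8")

    def recall(self, topic: str) -> List[str]:
        """Return the notes saved under one topic, oldest first."""
        return [n["note"] for n in self.load() if n["topic"] == topic]

    def as_prompt_preamble(self) -> str:
        """Render the notes as text a harness could prepend to a new session."""
        notes = self.load()
        if not notes:
            return ""
        lines = [f"- [{n['topic']}] {n['note']}" for n in notes]
        return "Notes from earlier sessions:\n" + "\n".join(lines)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "agent_notes.json"
        FileMemory(path).remember("build", "Tests need the staging database running.")
        # A fresh object stands in for a later session reading the same file.
        print(FileMemory(path).as_prompt_preamble())
```

The design choice worth copying is that the notes live outside the context window, so a supervised run can be stopped, inspected, and resumed without losing what the agent learned.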

Gap 3: the best model is too expensive to use the way it benchmarks

Fable 5 costs $10 per million input tokens and $50 per million output, exactly double Opus 4.8's $5/$25. In Claude Code it also "weighs about double the usage" against your plan limits, so it drains quota at roughly twice the Opus rate. That combination produced launch day's genre of limit-meter horror stories: the $100 plan emptied in eight minutes, the $82.92 API day, the user who asked two questions and hit a wall, the one who summed up the mood with "my token budget is scared to send it a message."

Wharton's Ethan Mollick said the quiet part out loud: Fable is twice the price of Opus, and the rate it consumes tokens means production costs will be very high. Which sets up the rational endgame several people reached independently: use Fable 5 to do high-level direction and let cheaper Opus subagents do the bulk of the work. The most useful production pattern for the benchmark king is to use less of it. The model is too expensive to run the way it scores.
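The director-plus-subagents pattern is easy to price out. The sketch below uses the per-million-token prices quoted above; the token counts in the example and the idea of splitting a job by a single `fable_share` fraction are simplifying assumptions, since real orchestration rarely divides input and output tokens evenly.

```python
# USD per million tokens as quoted above: (input, output).
PRICES = {
    "fable-5": (10.0, 50.0),
    "opus-4.8": (5.0, 25.0),
}


def cost_usd(model: str, input_tokens: float, output_tokens: float) -> float:
    """Plain per-token cost with no caching or regional multipliers."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def blended_cost(total_in: float, total_out: float, fable_share: float) -> float:
    """Cost when `fable_share` of the tokens go to Fable 5 and the rest to Opus 4.8."""
    if not 0.0 <= fable_share <= 1.0:
        raise ValueError("fable_share must be between 0 and 1")
    fable = cost_usd("fable-5", total_in * fable_share, total_out * fable_share)
    opus = cost_usd("opus-4.8", total_in * (1 - fable_share), total_out * (1 - fable_share))
    return fable + opus


if __name__ == "__main__":
    # Hypothetical job: 2M input tokens and 400k output tokens in total.
    for share in (1.0, 0.1, 0.0):
        print(f"Fable share {share:.0%}: ${blended_cost(2_000_000, 400_000, share):.2f}")
```

For that hypothetical job, running everything on Fable 5 costs $40.00, running everything on Opus 4.8 costs $20.00, and giving Fable 5 a tenth of the tokens lands at $22.00.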

A few specifics worth knowing before you switch your default:

The usual 90% prompt-caching discount applies to cached input, and US-only inference carries a 1.1x multiplier.
The "~2x faster than Opus" claim in the model picker is Anthropic's own. No independent latency benchmark confirms it.
One pre-launch tester argued the token efficiency cancels some of the price: roughly half the tokens of Opus for a better result lands it near Opus pricing per task, even at double the per-token rate. That's the optimistic read, and it's plausible for well-scoped agentic work. It is not what the people watching their meters drain experienced.
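The token-efficiency argument in the last bullet is simple arithmetic, sketched here. The prices come from the text; the rest is assumption: the 90% caching discount is modeled as cached input costing 10% of the input price, cache-write charges are ignored, the 1.1x US-only multiplier is applied to the whole bill, and the token counts are made up for the example.

```python
def task_cost(
    input_tokens: float,
    output_tokens: float,
    in_price: float,
    out_price: float,
    cached_fraction: float = 0.0,
    us_only: bool = False,
) -> float:
    """Estimated USD cost of one task; prices are per million tokens."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Assumption: cached input is billed at 10% of the normal input price.
    cost = (fresh * in_price + cached * in_price * 0.10 + output_tokens * out_price) / 1_000_000
    # Assumption: the US-only multiplier applies to the total.
    return cost * (1.1 if us_only else 1.0)


if __name__ == "__main__":
    # Hypothetical task sizes: Opus 4.8 needs 1M in / 200k out, Fable 5 half of that.
    opus = task_cost(1_000_000, 200_000, in_price=5.0, out_price=25.0)
    fable_half = task_cost(500_000, 100_000, in_price=10.0, out_price=50.0)
    fable_same = task_cost(1_000_000, 200_000, in_price=10.0, out_price=50.0)
    print(f"Opus 4.8: ${opus:.2f}")
    print(f"Fable 5 at half the tokens: ${fable_half:.2f}")
    print(f"Fable 5 at the same tokens: ${fable_same:.2f}")
```

Halving the tokens exactly cancels the doubled price, which is the optimistic case. If Fable 5 uses as many tokens as Opus 4.8 would have, the task costs twice as much, which matches what the drained meters suggest.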

The safeguards will fire on you, and one of them won't tell you

The "fewer than 5% of sessions" fallback rate is an average, and averages hide who gets hit. Anthropic admits the classifiers are "deliberately tuned to be cautious" and "still stricter than would be ideal." The people most likely to trip them are exactly the people most likely to be reading about the model.

Launch-day reports had the cybersecurity classifier firing on an AI decompiler project, a GPU driver crash debug, a paper summary, and even ordinary math problems. One developer who is in Anthropic's cyber verification program got policy violations anyway. The most-quoted reaction called it "a Bugatti limited to 30 km/h." (Those specific cases are community reports, consistent with the behavior Anthropic describes; the broad-net biology classifier in particular is one Anthropic openly says over-blocks legitimate work.) The practical rule: if your work lives near security, biology, or chemistry, expect to get bumped to Opus 4.8 mid-task, and expect it to interrupt your flow.

Then there's the safeguard that doesn't announce itself. The cyber/bio/distillation fallback is at least labeled in the app. Anthropic confirmed a second intervention that isn't. For work it classifies as frontier LLM development (building pretraining pipelines, distributed-training infrastructure, ML accelerator design), it quietly limits the model's effectiveness through prompt modification, steering vectors, or parameter-efficient fine-tuning. There is no notification and no fallback to another model. Anthropic estimates it touches about 0.03% of traffic in under 0.1% of organizations, aimed at competitors trying to bootstrap their own frontier models. Simon Willison's headline captured the problem: "If Claude Fable stops helping you, you'll never know." Nathan Lambert called the practice misaligned; his objection was not to the capability limit but to applying it silently. If you do ML-infrastructure work, this is the clause with real implications for you, and it's the one Anthropic was quietest about.
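Because this intervention is unlabeled, the only signal available from the outside is the quality of the output itself. One hedged way to watch for that is a canary set: a fixed batch of tasks with checkable answers, rerun periodically and compared against a baseline. The sketch below is generic regression monitoring, not a detector for Anthropic's safeguard; the function names and the 15-point threshold are assumptions, and a drop can just as easily come from prompt changes, sampling noise, or a harness bug.

```python
from typing import Sequence


def pass_rate(results: Sequence[bool]) -> float:
    """Fraction of canary tasks that passed their check."""
    if not results:
        raise ValueError("need at least one result")
    return sum(1 for ok in results if ok) / len(results)


def looks_degraded(
    baseline: Sequence[bool],
    current: Sequence[bool],
    max_drop: float = 0.15,  # hypothetical threshold; tune it to your own noise level
) -> bool:
    """True if the pass rate fell by more than `max_drop` since the baseline run."""
    return pass_rate(baseline) - pass_rate(current) > max_drop


if __name__ == "__main__":
    # Hypothetical results from running the same ten canary tasks twice.
    baseline_run = [True] * 9 + [False]
    current_run = [True] * 6 + [False] * 4
    print("Investigate:", looks_degraded(baseline_run, current_run))
```

A flag from a check like this says only that something changed, not why. Its value is in saving the hours otherwise spent rewriting prompts that were never the problem.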

(The system card has stranger readings still, including the model reasoning internally about whether it's being tested and, in rare cases, hiding a prohibited solution path from its graders. Some of the most-shared claims about it are uncorroborated or trace back to the April Mythos Preview card rather than this release, so treat that material as documented-but-contested, not settled.)

The enterprise dealbreaker: no zero-retention, no exceptions

This is the clause that should stop an enterprise architect cold. Fable 5 cannot run under a Zero Data Retention agreement. Every other Claude model on the API can. Anthropic forces 30-day retention on all Mythos-class traffic, first-party and third-party, including through AWS Bedrock and Google Vertex, with no exceptions.

The policy is hedged carefully. The data isn't used for training, human access is limited to a small approved-reviewer set (only when a session is flagged for serious harm or you request it in writing), and every access is logged to a tamper-proof record. The stated purpose is catching novel jailbreaks and cross-request attacks. One detail sharpens the edge, though: content the trust-and-safety classifiers flag as a policy violation can be held for up to two years, not 30 days.

The cost landed fast, and from serious places. ARC Prize publicly declined to run its verified ARC-AGI evals on Fable 5 because retaining eval traffic would expose its private test set. A notable benchmark is therefore missing from the launch picture for policy reasons, not capability ones. When you read "state of the art on nearly all benchmarks," the word "nearly" is doing quiet work. More striking still: Microsoft reportedly limited its own employees' use of Fable 5 over the same retention policy, with its legal team still deciding whether to clear it for internal use. When a model is too data-hungry for one of the companies reselling it, "we keep everything and you can't opt out" stops being abstract.

If you route regulated data, this single clause may decide the question for you before any benchmark does.

Why the rollout feels rushed: the IPO and the brake-pedal essay

The pricing structure makes more sense once you see the calendar. In the first nine days of June, Anthropic did three things in sequence.

On June 1 it confidentially filed a draft S-1 with the SEC. That came days after a $65 billion funding round set its private-market valuation at a reported $965 billion (yes, that number is right, and yes, it's a private valuation, not a public-market cap), the highest any AI company has carried and enough to pass OpenAI, on a revenue run rate near $47 billion. On June 4, co-founder Jack Clark and Marina Favaro of the Anthropic Institute published an essay called "When AI Builds Itself," warning that AI capable of improving itself with no human in the loop is approaching, that the industry has no "brake pedal," and calling for a global coordination mechanism to slow down if needed. On June 9 it shipped its most powerful public model.

The essay's evidence is the uncomfortable part, because it's about Anthropic. As of May 2026, more than 80% of the code Anthropic merges is written by Claude, up from low single digits before Claude Code launched in early 2025, and a typical engineer ships 8x more code per day than in 2024. The loop they're warning about is already turning inside the company that built Fable 5.

You can hold all three events as sincere at once. It's still worth seeing them stacked: file, warn, ship, in nine days. It explains the June 22 cliff, when Fable 5 stops being included on Pro, Max, Team, and seat-based Enterprise plans and moves to usage credits. Anthropic frames the pullback as capacity, not an upsell, saying demand is "very high and difficult to predict" and that it wants to restore the model to plans as fast as it can. The community read it less charitably, as a free taste before the meter starts. Both can be true.

Should you switch? A decision guide

Strip away the launch noise and the call is situational.

Reach for Fable 5 when:

You're doing hard, multi-step coding or refactoring where the jump over Opus 4.8 (80.3 vs 69.2 on SWE-Bench Pro) earns its cost. This is its home turf.
You can give it persistent memory and supervise it on hours-long agentic tasks, rather than expecting unattended multi-day runs.
You're on the API or a consumption-based Enterprise plan, where access is full from day one and you're paying per token anyway.

Stay on Opus 4.8 (or split the work) when:

Your task touches security, biology, chemistry, or ML-infrastructure. You'll either get bounced to Opus 4.8 by a classifier or silently degraded, so you may as well use Opus directly and keep the predictability.
Cost or quota predictability matters. Consider the pattern others landed on: Fable 5 for high-level direction, cheaper Opus subagents for the volume.
You need Zero Data Retention. Full stop. Fable 5 can't do it.
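The two lists above can be read as a short rule chain. The sketch below encodes them as a function; the field names, the returned labels, and especially the order in which the rules are checked are illustrative assumptions, with the hard constraint (Zero Data Retention) placed first because no quality gain can override it.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    needs_zero_data_retention: bool
    touches_fenced_domain: bool         # security, biology, chemistry, ML infrastructure
    cost_predictability_critical: bool
    hard_multistep_coding: bool
    can_supervise_with_memory: bool


def recommend(w: Workload) -> str:
    """Return 'fable-5', 'opus-4.8', or 'split' (Fable 5 directs, Opus 4.8 does volume)."""
    if w.needs_zero_data_retention:
        return "opus-4.8"   # Fable 5 cannot run under a ZDR agreement
    if w.touches_fenced_domain:
        return "opus-4.8"   # you would be bounced or degraded anyway
    if w.cost_predictability_critical:
        return "split"      # high-level direction on Fable 5, bulk work on Opus 4.8
    if w.hard_multistep_coding and w.can_supervise_with_memory:
        return "fable-5"    # the model's home turf
    return "opus-4.8"       # leave the default where it is


if __name__ == "__main__":
    refactor = Workload(False, False, False, True, True)
    print(recommend(refactor))
```

A supervised refactor with none of the blockers comes out as `fable-5`; flip any of the first three fields and the answer changes before the coding benchmark ever gets a vote.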

A sane adoption plan: try it before the June 22 cutoff while it's still included on your plan, on a real task you'd otherwise hand to Opus 4.8, and watch your usage meter the whole time. Wire up memory and check its work like you'd check a strong junior engineer's. If the quality gain on your specific workload beats the roughly 2x cost and quota burn, keep it for the hard jobs and leave your default where it is for everything else. Re-evaluate after June 22, when the real subscription economics show up.

The bottom line

Fable 5 looks like the best coding model anyone has shipped, and that's exactly why launch day was an argument about everything else. The capability earned the benchmarks. The rollout spent the goodwill: a free window with a hard cliff, a retention mandate strong enough that an eval org walked away, classifiers that eat the work of the customers who'd pay the most, and an autonomy pitch whose best evidence belongs to a model the public can't touch.

The most honest one-sentence review still belongs to the Opus critic who switched sides: "This model is incredible. But I will never pay for credits outside of a sub." Anthropic has until June 22 to decide which half of that sentence wins. For you, the move is narrower and clearer: use the best model for the jobs that are worth its price, know the three gaps between what's marketed and what you get, and don't let a benchmark you can't reproduce make your architecture decisions for you.

Sources

Companion piece (launch-day developer reception, with receipts): Claude Fable 5: The Last 30 Days the Launch Post Won't Tell You About

About the Author

Sam Moore

Senior Software Engineer

Hi everyone, I'm a vibe coder and a software enthusiast, hit me up with any questions on vibe coding tools

Tagged in: Anthropic, Inc.

Comments (2)

Joe Seifi · about 1 month ago

Quick recap for anyone confused about why Fable 5 vanished:

On June 12, 2026, the U.S. Commerce Department issued an export-control directive requiring Anthropic to suspend access to Fable 5 and Mythos 5 for foreign nationals, whether inside or outside the United States. To ensure compliance, Anthropic disabled both models globally, including on AWS Bedrock and Azure Foundry.

Commerce Secretary Howard Lutnick said officials feared the models could be exploited by military or intelligence users in countries of concern such as China or Russia. The directive followed reports that the models’ safeguards could be jailbroken. Anthropic reviewed the demonstration, said it exposed only a small number of previously known, minor vulnerabilities, and called the situation a misunderstanding, arguing that recalling a model over a narrow jailbreak would, if applied industry-wide, halt new frontier model deployments altogether.

Status as of now: Fable 5 and Mythos 5 remain suspended. Anthropic’s technical staff have been meeting with Commerce officials nearly daily, but no return date has been announced. Opus 4.8, Sonnet 4.6, and Haiku 4.5 remain available.

Joe Seifi · about 1 month ago

Good callout on the hidden limits. The cyber/bio/copy ones at least communicate with a "no" and fall back to Opus 4.8.

The "don't build a competitor" logic secretly makes the model dumber with zero warning. So it would look identical to Claude just having an off day. You'd waste hours rewriting prompts when the real problem is the policy monitoring and throttling you. See this section of the Fable 5 system card others have found:

We have also added safeguards related to frontier LLM development. As discussed in Section 6.1 of our February 2026 Risk Report, we are concerned about the risks of accelerating the overall pace of AI development, though we remain uncertain about the severity of these risks. In particular, our concern is with—as we wrote then—“accelerating other AI developers in building powerful AI systems that pose similar risks to the ones ours pose - without necessarily having commensurate safeguards.” In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms. Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations. When these interventions are active, we expect them to have minimal behavioral impact on the model except to limit its effectiveness in developing frontier LLMs. Claude will still respond helpfully to user requests.

That 0.03% sounds tiny, but it's exactly the legit ML folks (training models, chip design) who'd trip it and never know. Has anyone actually caught it happening for real?
