๐Ÿ”ฌ Science & Tech

AI Found Thousands of Zero-Days. We're Not Safer Yet

by Lud3ns · April 16, 2026


TL;DR

  • Anthropic's unreleased Claude Mythos model autonomously discovered thousands of zero-day vulnerabilities across every major OS and browser.
  • Project Glasswing, a coalition of Anthropic and 11 founding partners, is using the model defensively before similar capabilities reach attackers.
  • AI collapses the security timeline, but patching still moves at human speed. Speed of discovery alone doesn't equal safety.
  • The offense-defense balance in cybersecurity just shifted โ€” and the implications go far beyond one model.

Last week, Anthropic published a technical assessment that read like a cyberthriller plot. Their unreleased AI model, Claude Mythos Preview, had autonomously found and exploited zero-day vulnerabilities in every major operating system and browser. Not dozens. Thousands.

The Discovery: What Claude Mythos Actually Did

A zero-day vulnerability is a flaw in software that the developers don't know about. No patch exists. No defense is ready. Zero-days are the most valuable currency in cybersecurity โ€” whoever finds one first controls whether it becomes a weapon or a fix.

During roughly a month of internal testing, Anthropic's security team documented what Mythos Preview could do. The results were staggering.

  • Scope: vulnerabilities found in every major OS and every major browser
  • Firefox JS exploit rate: 72.4% success (vs. near-zero for previous AI models)
  • Historic bugs found: a 27-year-old DoS flaw in OpenBSD; a 17-year-old RCE in FreeBSD (CVE-2026-4747)
  • Cost efficiency: found the OpenBSD bug across ~1,000 scaffold runs for under $20,000
  • Human expertise needed: engineers with "no formal security training" directed the model

The Firefox JavaScript exploit rate deserves emphasis. Where Claude Opus 4.6 managed a success rate of just over zero percent, Mythos Preview generated a working exploit 72.4% of the time on Firefox JS targets. This isn't a marginal improvement. It's a capability discontinuity โ€” the kind that changes what's possible.

Previously, finding zero-days required years of specialized training, deep system knowledge, and often luck. Mythos compressed all of that into a prompt that non-specialist engineers could operate.

Why This Wasn't Intentional

Here's what makes the discovery unsettling. Anthropic explicitly stated: "We did not explicitly train Mythos Preview to have these capabilities. Rather, they emerged as a downstream consequence of general improvements in code, reasoning, and autonomy."

The model wasn't built to break software. It just got good enough at understanding code that hacking became a side effect โ€” like a chess grandmaster who accidentally becomes excellent at poker.

The Paradox: Why More Bugs Found Doesn't Mean More Security

Cybersecurity has a structural asymmetry that AI makes worse before it makes better.

The attacker-defender imbalance:

  • Success condition: the attacker needs one exploitable flaw; the defender must patch all of them
  • Speed constraint: the attacker is limited only by compute; the defender by testing, compliance, and deployment
  • Cost of failure: the attacker simply tries again; the defender faces data breach, revenue loss, and reputation damage
  • Adoption speed: attacks spread in days to weeks; full patching takes months to years

AI supercharges the attacker's side of every factor above, discovery most of all, while the defender's side still runs on human timelines. Enterprises typically take 60 to 150 days to deploy a critical patch; AI finds new vulnerabilities in hours.

This is the offense-defense paradox. Faster discovery benefits whoever moves first. Defenders are structurally slower.

The Patch Gap Problem

Consider the patching pipeline after a zero-day is found:

  1. Discovery โ†’ AI finds the flaw (hours)
  2. Disclosure โ†’ Researcher notifies the vendor (days)
  3. Triage โ†’ Vendor assesses severity (days to weeks)
  4. Development โ†’ Engineers write and test a fix (weeks)
  5. Distribution โ†’ Patch released to users (weeks)
  6. Adoption โ†’ Users actually install the update (months)
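The arithmetic of that pipeline can be sketched in a few lines. The durations below are rough midpoints of the ranges given above, chosen for illustration only; they are assumptions, not measured figures.

```python
# Illustrative model of the patching pipeline described above.
# Durations are rough midpoints (in days) of the ranges in the text;
# they are assumptions for illustration, not measured figures.

PIPELINE_DAYS = {
    "discovery": 90,      # human researcher: months
    "disclosure": 3,      # days
    "triage": 10,         # days to weeks
    "development": 21,    # weeks
    "distribution": 14,   # weeks
    "adoption": 120,      # months
}

def total_days(steps):
    """Sum the end-to-end exposure window across all pipeline steps."""
    return sum(steps.values())

before_ai = total_days(PIPELINE_DAYS)

# AI compresses only step 1 (discovery) from months to a few hours.
ai_pipeline = dict(PIPELINE_DAYS, discovery=0.2)
after_ai = total_days(ai_pipeline)

print(f"human-paced pipeline: ~{before_ai:.0f} days")
print(f"AI-paced discovery:   ~{after_ai:.0f} days")
```

Even with discovery reduced to near zero, the total window barely moves, because steps 2 through 6 dominate the sum. That is the patch gap in one calculation.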

AI compressed step 1 from months to hours. Steps 2-6 remain unchanged. According to Tom's Hardware reporting on the disclosure, fewer than 1% of the bugs Mythos uncovered have been fully patched โ€” which is why detailed disclosure would be irresponsible.

The offense scales with compute. The defense scales with committees, change control processes, and user update habits. That gap is where the danger lives.

Vulnerability Chaining: The Multiplier Effect

Individual vulnerabilities are concerning. But what distinguishes frontier AI models is their capacity for vulnerability chaining โ€” connecting a series of individually minor flaws into a single devastating attack path.

A low-severity information disclosure bug plus a medium-severity privilege escalation plus a moderate memory corruption flaw can combine into full remote code execution. Human researchers do this, but it takes weeks of manual analysis. AI models can explore thousands of chain combinations in hours, finding attack paths that no individual researcher would have time to construct.
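The chaining idea can be sketched as a search over a tiny state graph. Everything here is hypothetical: each flaw is modeled as an edge that moves an attacker from one privilege state to another, and the flaw names, states, and severities are invented for illustration.

```python
from itertools import permutations

# Toy model of vulnerability chaining: each (hypothetical) flaw is an
# edge from one attacker state to another. Each is minor on its own,
# but a path from "remote" to "code_exec" is a full compromise.
FLAWS = [
    ("info_leak",      "remote",       "knows_layout"),  # low severity
    ("priv_esc",       "local_user",   "admin"),         # medium severity
    ("mem_corrupt",    "knows_layout", "local_user"),    # moderate severity
    ("sandbox_escape", "admin",        "code_exec"),     # moderate severity
]

def find_chains(flaws, start, goal):
    """Enumerate flaw orderings that walk the attacker from start to goal."""
    chains = []
    for order in permutations(flaws):
        state, used = start, []
        for name, pre, post in order:
            if pre == state:      # flaw applies only from its precondition
                used.append(name)
                state = post
        if state == goal:
            chains.append(used)
    return chains

chains = find_chains(FLAWS, "remote", "code_exec")
print(chains[0])
# Only one ordering of these four flaws reaches full code execution.
```

A real model explores a vastly larger graph, but the principle is the same: the number of candidate orderings grows factorially with the number of flaws, which is exactly the search space AI can cover and humans cannot.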

This is why the raw count of zero-days understates the real risk. The threat isn't just more bugs โ€” it's exponentially more ways to combine them.

The Response: What Is Project Glasswing?

Anthropic's response was to restrict the model and form a coalition. Project Glasswing โ€” named after the nearly transparent glasswing butterfly โ€” pairs Mythos Preview with a group of founding partners for defensive use only.

The founding coalition:

Anthropic, Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks

Alongside these founding members, over 40 additional organizations are participating. This isn't just a press release. It represents an unprecedented approach: using a capability too dangerous to release as a shared defensive resource. Coalition members get early access to vulnerability reports. The public gets patches โ€” eventually.

Three Commitments

The Glasswing framework rests on three pillars:

  • 90-day transparency: Publish lessons learned and practical security recommendations
  • Coordinated patching: Prioritize the most critical vulnerabilities across member organizations
  • Standards development: Create guidance for vulnerability disclosure, patching automation, and supply-chain security

The Trust Question

Not everyone is convinced. Security researcher Bruce Schneier was blunt, calling the announcement "very much a PR play by Anthropic" and noting that reporters repeated the company's talking points without critical engagement. He also pointed out that another security firm replicated some findings using older, cheaper models โ€” suggesting the capability threshold may already be crossed.

This is responsible disclosure in action. But it's also an unprecedented concentration of power over vulnerability information. The 90-day transparency commitment is the mechanism meant to build trust โ€” whether it's sufficient remains an open question.

The Bigger Picture: Emergent Capabilities and the Arms Race

The Mythos story is bigger than one model. It reveals a pattern that will repeat as AI capabilities advance.

The emergent capability jump:

  • Earlier AI models: near-zero exploit success
  • Claude Opus 4.6: just over 0%
  • Claude Mythos Preview: 72.4% on Firefox JS exploits

That jump from ~0% to 72% happened because the model got better at understanding code โ€” not because anyone trained it to hack. Researchers call this an emergent capability: an ability that appears suddenly at scale, without being explicitly designed.

The implication is profound. Every AI lab pushing the frontier of code generation is simultaneously โ€” and unintentionally โ€” building more powerful offensive tools. The next breakthrough in AI coding assistance will also be the next breakthrough in AI offensive capability. These capabilities are inseparable.

This means Glasswing isn't a one-time fix. It's the opening move in what will become a permanent AI-vs-AI security arms race.

Defenders do have structural advantages. They control the landscape โ€” they choose what code to deploy, what systems to build, and what architecture to use. They can also use the same AI models to review their own code before shipping it, catching vulnerabilities before they reach production. Georgetown's CSET research notes that the ubiquity of defenders creates a large market for defensive innovations, enabling economies of scale that attackers can't match.

But those advantages only hold if organizations modernize their patching infrastructure to match the speed of AI-powered discovery. Right now, most don't.

What This Means for You

Here's the practical picture at three time horizons:

Right now:

  • These vulnerabilities existed for decades before AI found them โ€” some were 27 years old
  • Glasswing members are patching the most critical ones first, with the highest-severity flaws receiving priority
  • Your action: Enable automatic updates on every device. Review your password manager and enable two-factor authentication where available. Avoid delaying OS and browser updates โ€” the window between patch release and attacker exploitation is shrinking rapidly

Next 6-12 months:

  • Other AI labs will develop similar capabilities โ€” this genie doesn't fit back in the bottle. Palo Alto Networks' Wendi Whitmore warned that similar capabilities are "weeks or months from proliferation"
  • Expect faster patch cycles as companies adopt AI-assisted code review on the defensive side
  • Traditional bug bounty programs will be transformed, with AI handling volume discovery and human researchers focusing on novel attack surfaces

Next 1-3 years:

  • Security will become an AI-vs-AI arms race, with both sides operating at machine speed
  • The advantage shifts toward defenders if patching infrastructure modernizes to match discovery speed
  • Organizations that still take months to deploy patches will face exponentially higher risk
  • Companies that integrate AI into their development pipeline for pre-release vulnerability scanning will have a meaningful edge over those relying solely on post-release patching
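A pre-release vulnerability gate of the kind described in that last point can be sketched simply. The findings below are hard-coded stand-ins for scanner output; the IDs, severity scores, and threshold are all hypothetical.

```python
# Minimal sketch of a pre-release gate: given vulnerability findings
# (hard-coded here; in practice produced by an AI-assisted scanner),
# block the release if any sufficiently severe open finding remains.
# IDs, severities, and the threshold are illustrative assumptions.

FINDINGS = [
    {"id": "VULN-1", "severity": 9.1, "status": "open"},
    {"id": "VULN-2", "severity": 4.3, "status": "open"},
    {"id": "VULN-3", "severity": 8.0, "status": "patched"},
]

def release_blocked(findings, threshold=7.0):
    """Return the open findings severe enough to block a release."""
    return [f for f in findings
            if f["status"] == "open" and f["severity"] >= threshold]

blockers = release_blocked(FINDINGS)
print("release blocked" if blockers else "release ok", blockers)
```

The design choice that matters is where the gate sits: running it before code ships turns discovery speed into a defensive asset, while running it only after release leaves the patch gap intact.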

Integrated Insight

The real story of Claude Mythos isn't about one model finding thousands of bugs. It's about a permanent shift in how security works.

For decades, cybersecurity relied on a simple fact: finding vulnerabilities was hard. That difficulty was itself a defense. AI just removed that constraint. The question is no longer whether flaws will be found, but who finds them first and how fast the fix arrives.

Project Glasswing is a bet that organized defense can outpace decentralized offense. It's the right bet. But it's a bet โ€” not a guarantee.



๋ฐ˜์‘ํ˜•