Anthropic Rolls Out Its Most Powerful Cyber AI Model — Days After Leaking Its Own Source Code
The launch of Claude Mythos Preview and Project Glasswing, mere days after Anthropic accidentally exposed 512,000 lines of its core product’s source code to the world, is either the most audacious act of strategic redirection in Silicon Valley history — or the most revealing window yet into the contradictions at the heart of frontier AI development.
There is a particular species of Silicon Valley irony that only manifests at the very frontier of technological ambition. On March 31st, 2026, an Anthropic employee made a mistake so elementary it would embarrass a first-year computer science undergraduate: a debug source map file was accidentally bundled into a public software release, pointing to a cloud-hosted archive of the company’s most commercially prized product — the source code of Claude Code, its flagship agentic coding assistant. Within hours, 512,000 lines of proprietary TypeScript code, across 1,906 files, were mirrored, forked, and torrent-distributed across the internet, never to be recalled. The repository on GitHub was forked more than 41,500 times before Anthropic could blink. Then, seven days later, Anthropic announced the most capable AI model it has ever built — a cybersecurity behemoth called Claude Mythos Preview — and launched Project Glasswing, a sweeping initiative to secure the world’s critical digital infrastructure. The company publicly described it as a watershed for global security. A watching world could be forgiven for raising an eyebrow.
History rarely serves up irony quite this rich. The firm that accidentally handed a blueprint of its proprietary agent harness to thousands of developers, threat actors, and competitors — the firm that inadvertently revealed the internal codename of its most powerful unreleased model buried in that same code — emerged days later as the standard-bearer for a new era of AI-powered cyber defence. It is, depending on your interpretation, either a masterclass in narrative control or a deeply unsettling indicator of the structural tensions now embedded in the development of frontier AI.
I. A Double Embarrassment: The Anatomy of the Leak
The facts of the Anthropic source code leak are simultaneously mundane and extraordinary. On the morning of March 31st, 2026, Anthropic pushed version 2.1.88 of its @anthropic-ai/claude-code package to the npm public registry. Buried inside was a 59.8-megabyte JavaScript source map file — a developer debugging tool that, when followed to its reference URL on Anthropic’s own Cloudflare R2 storage bucket, yielded a downloadable zip archive of the complete, unobfuscated TypeScript source for Claude Code.
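The exposure mechanism is trivially detectable. As a minimal sketch (not Anthropic's tooling; the helper names are illustrative), the following checks shipped JavaScript for a `sourceMappingURL` pragma that points at an external host rather than a relative path, which is exactly the failure mode described above:

```typescript
// Matches the standard source-map pragma that bundlers append to build
// artifacts:  //# sourceMappingURL=<url>
const SOURCE_MAP_RE = /\/\/[#@]\s*sourceMappingURL=(\S+)/;

// Returns the referenced map URL, or null if the file carries no pragma.
function findSourceMapUrl(js: string): string | null {
  const match = js.match(SOURCE_MAP_RE);
  return match ? match[1] : null;
}

// Relative paths ("./cli.js.map") stay inside the package tarball.
// An absolute http(s) URL means the map, and any original sources it
// embeds, live on a publicly reachable host.
function isExternalMap(url: string): boolean {
  return /^https?:\/\//i.test(url);
}
```

A check like this, run over every file in the packed tarball as a pre-publish CI gate, would have flagged the offending reference before it ever reached the registry.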
Security researcher Chaofan Shou, an intern at Solayer Labs, spotted the exposure at 4:23 AM Eastern and posted a direct download link on X. It was, as The Register reported, “a mistake as bad as leaving a map file in a publish configuration” — a single missing .npmignore entry. A known bug in Bun, the JavaScript runtime Anthropic had acquired in late 2025, had been causing source maps to ship in production builds for twenty days before the incident. Nobody caught it.
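The fix is standard npm hygiene rather than anything exotic. As a generic illustration (the pattern below is illustrative, not Anthropic's actual configuration), a single `.npmignore` entry excludes debug maps from anything npm publishes:

```
# .npmignore: files matched here are excluded from the published tarball
*.map
```

Equivalently, the `files` allow-list in `package.json` inverts the default so that only explicitly listed paths ship. Either way, `npm pack --dry-run` prints exactly what the tarball will contain, which makes this class of leak catchable in CI before publication.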
This was, in fact, the second major accidental disclosure of the month. Days earlier, Fortune had reported on a separate leak of nearly 3,000 files from a misconfigured content management system — including a draft blog post describing a forthcoming model billed internally as “by far the most powerful AI model” Anthropic had ever developed. That model’s codename: Mythos. Also, apparently: Capybara.
The March–April 2026 Anthropic Disclosure Timeline
| Date | Event |
|---|---|
| ~Late March 2026 | Fortune reports on ~3,000 leaked CMS files; first public confirmation of the Mythos model’s existence and capabilities. |
| March 31, 2026 | Claude Code v2.1.88 ships to npm with embedded source map; 512,000 lines of TypeScript exposed within hours. GitHub repository forked 41,500+ times. |
| March 31 – April 6 | Anthropic issues DMCA takedowns; threat actors seed trojanized forks with backdoors and cryptominers. A supply-chain attack on the Axios JavaScript library unfolds in the same window. |
| April 7, 2026 | Anthropic officially announces Claude Mythos Preview and Project Glasswing. Partners include Apple, Microsoft, Google, Amazon, JPMorgan Chase, and others. |
What the leaked source revealed was considerable: 44 hidden feature flags for unshipped capabilities, a sophisticated three-layer memory architecture, the internal orchestration logic for autonomous “daemon mode” background agents, and — critically — confirmation that a model called Capybara was actively being readied for launch. The VentureBeat analysis noted that Claude Code had achieved an annualised recurring revenue run rate of $2.5 billion by March 2026, making the intellectual property exposure a genuinely material event for a company preparing to go public.
II. Claude Mythos Preview and Project Glasswing: A Technical Step-Change
To understand why the timing of the Mythos announcement matters, one must first grasp the scale of what Anthropic is claiming. Claude Mythos Preview is not a marginal improvement on its predecessors. It occupies, in Anthropic’s internal taxonomy, a fourth tier entirely above the existing Haiku–Sonnet–Opus range — a tier the company internally designates “Capybara.” According to SecurityWeek, it represents “not an incremental improvement but a step change in performance.”
The headline claim is breathtaking in its scope. In the weeks prior to the public announcement, Anthropic ran Mythos against real open-source codebases and, according to its own Project Glasswing announcement, the model identified thousands of zero-day vulnerabilities — flaws previously unknown to software maintainers — across every major operating system and every major web browser. The oldest vulnerability it uncovered was a 27-year-old bug in OpenBSD, a system famous for its security record. A 16-year-old flaw in video processing software survived five million automated test attempts before Mythos found it in a matter of hours. The model autonomously chained together a series of Linux kernel vulnerabilities into a privilege escalation exploit — the kind of attack chain that would previously have required a sophisticated, nation-state-grade human research team.
> A single AI agent could scan for vulnerabilities and potentially take advantage of them faster and more persistently than hundreds of human hackers — and similar capabilities will be available across the industry in as little as six months.
The Axios reporting on the rollout puts the dual-use risk with uncomfortable clarity: Mythos is “extremely autonomous” and possesses the reasoning capabilities of an advanced security researcher, capable of finding “tens of thousands of vulnerabilities” that even elite human bug hunters would miss. This is precisely why Anthropic chose not to release it publicly. Instead, Project Glasswing gives curated preview access to 40-plus organisations responsible for critical software infrastructure — including Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks — backed by up to $100 million in usage credits and $4 million in direct donations to open-source security organisations including the Apache Software Foundation and OpenSSF.
The model is not cybersecurity-specific. CNBC noted that Mythos’s cyber prowess is a downstream consequence of its exceptional general-purpose coding and reasoning capabilities — a distinction with profound regulatory implications. You cannot restrict a model trained to think brilliantly about code from thinking brilliantly about vulnerabilities in that code.
III. The Deeper Meaning: Irony, Competence, and the New Security Paradigm
The central paradox demands direct engagement: Anthropic, a company whose founding proposition is responsible AI development, leaked its own product’s source code through a packaging error so elementary it required no sophistication to exploit. It then, within the same news cycle, announced an AI model so powerful its own CEO fears its public release — and positioned itself as the primary steward of global cyber defence. One is entitled to hold both thoughts simultaneously.
And yet the strategic coherence of the Mythos launch, viewed against the backdrop of the leak, is hard to dismiss entirely. Anthropic did not choose the timing. The Mythos project had been in development and partner testing for weeks before the Claude Code source code escaped its containment. But the company, having already suffered the reputational bruise of one accidental exposure too many, had an imperative to seize the narrative — to move from embarrassed leaker to principled guardian, rapidly. The result is a masterclass in what crisis communications professionals call “agenda replacement.”
The deeper issue, however, is structural and it transcends any single company. The Axios assessment is stark: Mythos is “the first AI model that officials believe is capable of bringing down a Fortune 100 company, crippling swaths of the internet or penetrating vital national defense systems.” Meanwhile, the head of Anthropic’s frontier red team, Logan Graham, told multiple outlets that comparable capabilities will be in the hands of the broader AI industry within six to eighteen months — from every nation with frontier ambitions, not just the United States. The window for getting ahead of this threat is not a decade. It is, at most, a year.
What the Mythos launch crystallises is a principle that the cybersecurity community has long understood but that corporate AI leaders and policymakers have been reluctant to internalise: the same model property that makes an AI system valuable for defence makes it catastrophically useful for offence. The technical writeup on Anthropic’s red team blog makes this explicit. Mythos can “reverse-engineer exploits on closed-source software” and turn known-but-unpatched vulnerabilities into working exploits. Gadi Evron, founder of AI security firm Knostic, told CNN that “attack capabilities are available to attackers and defenders both, and defenders must use them if they’re to keep up.” There is no asymmetry available — only the question of who moves first.
IV. The Geopolitical and Regulatory Reckoning
The implications of Mythos extend well beyond corporate strategy. The U.S.-China AI competition has already entered the domain of active cyber operations. A Chinese state-sponsored group, as Fortune reported, used an earlier Claude model to target approximately 30 organisations in a coordinated espionage campaign before Anthropic detected and curtailed the activity. If a Claude model that predates Mythos by several capability generations was sufficient to mount a significant intelligence operation, the implications of Mythos-class capability in hostile hands are genuinely alarming.
A source briefed on Mythos told Axios: “An enemy could reach out and touch us in a way they can’t or won’t with kinetic operations. For most Americans, a conventional conflict is ‘over there.’ With a cyberattack, it’s right here.” This framing matters. The doctrine of nuclear deterrence rested partly on the difficulty of acquisition. The doctrine of cyber deterrence in the Mythos era rests on nothing — the marginal cost of deploying AI-accelerated attack capability approaches zero for any state or non-state actor with API access to a comparable model.
Anthropic’s relationship with Washington is, to put it diplomatically, complicated. The company is simultaneously briefing the Cybersecurity and Infrastructure Security Agency, the Commerce Department, and senior officials across the federal government on Mythos’s capabilities — while locked in active litigation with the Pentagon, which has labelled Anthropic a supply-chain risk following the company’s refusal to permit autonomous targeting or battlefield surveillance applications. The AI safety firm that declined to arm American drones is now, in the same breath, offering American critical infrastructure a first-mover advantage against AI-powered adversaries. The philosophical coherence of this position is defensible; its political navigation will be considerably harder.
For regulators, the Mythos announcement poses a question for which existing frameworks have no satisfying answer. The EU AI Act’s tiered risk classifications were not designed for a model that is simultaneously a breakthrough productivity tool, a national security asset, and a potential weapon of mass cyber-disruption. The Project Glasswing model — voluntary, industry-led, access-gated — is a plausible short-term mechanism. It is not a durable regulatory framework. And as Logan Graham made clear, the window before other frontier labs — and the Chinese state — reach comparable capability is measured in months, not years.
V. Verdict: A Reckoning Dressed as a Launch
Editorial Assessment
The Mythos announcement is not primarily a product launch. It is a reckoning — one that Anthropic has had the narrative dexterity to package as a strategic initiative rather than a confession. The source code leak was, at the level of operational security, an embarrassment of the first order. But it was also, unintentionally, a proof of concept for the vulnerability landscape that Mythos was built to address. Anthropic’s own systems failed a test far simpler than any that Mythos could conceivably pose to a determined adversary.
That irony is not merely cosmetic. It is instructive. No organisation — not even a frontier AI lab whose entire value proposition rests on the responsible management of powerful systems — is immune to the mundane failure modes of human error, toolchain misconfiguration, and the accumulated technical debt of moving too fast. The question is not whether Anthropic can be trusted with Mythos. The question is whether any institution, in any country, is structurally capable of managing the governance of AI capabilities that are advancing faster than the legal and regulatory architectures designed to contain them.
Dario Amodei framed the Project Glasswing rollout as an opportunity to “create a fundamentally more secure internet and world than we had before the advent of AI-powered cyber capabilities.” This is not rhetorical excess. It is, technically, accurate: the same capability that can chain Linux kernel vulnerabilities into a privilege escalation exploit, or surface a 27-year-old bug in a system as hardened as OpenBSD, can, in the hands of defenders, systematically eliminate such flaws from the world’s most important software. The question is not whether this technology is transformative. It is whether the institutional infrastructure required to ensure that transformation benefits defenders more than attackers can be assembled in the time available.
Six months. Eighteen at the outside. That is the horizon Logan Graham has placed on the proliferation of Mythos-class capabilities across the industry. The global financial cost of cybercrime already runs to an estimated $500 billion annually, a figure that was compiled before any model approached Mythos’s level of autonomous vulnerability discovery. Policymakers in Washington, Brussels, and Beijing who are not currently treating this as an emergency are, as one source briefed on Mythos told Axios with commendable directness, “not remotely ready.”
Anthropic rolled out its most powerful cyber AI model days after leaking its own source code. The irony is real. So is the threat. And so, potentially, is the opportunity — if the institutions responsible for governing it can move at the speed the technology demands, rather than the speed at which governments customarily prefer to operate. History suggests that gap will be considerable. The Mythos timeline suggests that gap may, for once, be decisive.