
Someone Is Trying to Reverse-Engineer Anthropic's AI

Anthropic has a model it won't release. Someone on GitHub is trying to rebuild it from scratch anyway.

A developer named Kye Gomez published OpenMythos — an open-source reconstruction of what he believes Claude Mythos looks like architecturally — and it has accumulated over 10,000 GitHub stars and 2,700 forks in a matter of weeks. The repo ships with equations, academic citations, and a disclaimer that it has nothing to do with Anthropic. It is, by Gomez's own description, structured speculation in code form.

To understand why this matters, you need to understand what Mythos actually is.

What Claude Mythos Is and Why Anthropic Won't Release It

Claude Mythos leaked into public view in late March when Anthropic accidentally published draft materials describing it as the company's most capable model — a tier above Opus. What made it unreleasable wasn't its reasoning ability. It was what that reasoning ability did to computer systems.

During testing with Mozilla, Mythos autonomously found 271 vulnerabilities in Firefox. It became the first AI model to complete a 32-step corporate network attack simulation. Anthropic responded by locking it inside Project Glasswing — a vetted coalition of roughly 40 partners including Microsoft, Apple, Amazon, and the NSA. The general public never gets access. No API. No waitlist. No preview.

That decision — build the most capable offensive cybersecurity AI ever demonstrated, then restrict it to a coalition of 40 partners — is the context in which OpenMythos exists.

How the OpenMythos Reconstruction of Claude Mythos Works

Gomez's central architectural guess is that Mythos is a Recurrent-Depth Transformer, also called a looped transformer. Standard large language models stack hundreds of unique layers. Looped transformers take a smaller stack and run it through itself multiple times per forward pass — the same weights, more iterations, deeper reasoning in continuous latent space before any token gets generated.
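The weight-reuse idea can be made concrete with a toy sketch. This is illustrative code, not Anthropic's model or the OpenMythos repo's actual implementation: a single shared block (here, one weight matrix with a residual update) stands in for the reused transformer layers, and depth comes from iterating it rather than stacking distinct layers.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 16  # hidden dimension of the shared block

# One shared "block": a single weight matrix with a residual connection
# stands in for the reused transformer layers.
W = rng.normal(scale=0.1, size=(D, D))

def shared_block(h):
    # Residual update with a tanh nonlinearity, reusing the same weights W.
    return h + np.tanh(h @ W)

def looped_forward(x, num_loops):
    # A fixed-depth model would apply num_loops *distinct* blocks;
    # a looped transformer applies the *same* block num_loops times,
    # deepening computation without adding parameters.
    h = x
    for _ in range(num_loops):
        h = shared_block(h)
    return h

x = rng.normal(size=(D,))
shallow = looped_forward(x, num_loops=2)
deep = looped_forward(x, num_loops=8)
# Parameter count is identical in both cases: just W (D * D values).
```

The point of the sketch is the trade: more loop iterations buy more latent-space computation per forward pass, while the parameter budget stays fixed.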

The repo argues this architecture explains Mythos's two most unusual characteristics: exceptional performance on novel problems no other model can solve, alongside uneven raw memorization. That's the fingerprint of looping — the model composes rather than stores.

OpenMythos draws on several recent public research advances to make this plausible. The Parcae paper, published in April 2026 by researchers at UC San Diego and Together AI, solved a long-standing instability problem in looped models — a 770 million parameter Parcae model matches a 1.3 billion parameter fixed-depth transformer in quality, with predictable scaling laws for loop depth. The repo also incorporates DeepSeek's Multi-Latent Attention for memory compression and a Mixture-of-Experts architecture for domain breadth.
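The Mixture-of-Experts piece is a standard, public technique. A minimal top-k routing sketch, again illustrative rather than the repo's actual code (the dimensions, expert count, and router here are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)

D, NUM_EXPERTS, TOP_K = 8, 4, 2

# Each expert is a small feed-forward weight matrix.
experts = [rng.normal(scale=0.1, size=(D, D)) for _ in range(NUM_EXPERTS)]
# The router scores every expert for a given token.
router_w = rng.normal(scale=0.1, size=(D, NUM_EXPERTS))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_layer(x):
    # Route the token to its top-k experts and mix their outputs,
    # weighted by the renormalized router probabilities. Only the
    # selected experts run, which is what buys domain breadth at
    # a fraction of the dense compute cost.
    scores = softmax(x @ router_w)
    top = np.argsort(scores)[-TOP_K:]
    weights = scores[top] / scores[top].sum()
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.normal(size=(D,))
y = moe_layer(x)
```

Because only `TOP_K` of the `NUM_EXPERTS` experts execute per token, total parameters can grow far faster than per-token compute.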

What it does not include is trained weights. The code defines model variants from 1 billion to 1 trillion parameters. Training them requires compute that runs into hundreds of thousands of dollars on H100 clusters. Nobody has done it yet.

Why the Vidoc Security Replication Makes OpenMythos More Significant

OpenMythos is the second attempt in a month to chip away at the wall around Mythos, and the two efforts are doing different things.

Vidoc Security reproduced several of Mythos's most alarming vulnerability findings using GPT-5.4 and Claude Opus 4.6 inside an open-source agent — no Glasswing access required, at under $30 per scan. Vidoc is saying: you don't need Mythos to find the bugs Mythos found.

OpenMythos is saying something different: given enough compute and the right architecture, you might eventually be able to build something like Mythos yourself.

One replicates the outputs. The other attempts to reconstruct the mechanism. Together they suggest the moat around Mythos is narrower than Anthropic's containment strategy implied.

What the Open-Source AI Safety Debate Reveals

The repo is MIT-licensed. The training script is publicly available. The README hedges carefully — "likely," "suspected," and "almost certainly" appear throughout, and Gomez is explicit that Anthropic has not confirmed any of his architectural guesses. The real Mythos may not be a looped transformer at all.

But that caveat coexists with a more uncomfortable observation: none of the underlying techniques in OpenMythos are proprietary. Looped transformers, Mixture of Experts, Multi-Latent Attention, Adaptive Computation Time, the Parcae stability fix — all of it is in public research literature. OpenMythos is, more than anything, an inventory of what is already known about how to build a Mythos-class model.

Anthropic made a considered decision that an AI capable of autonomously compromising enterprise networks at scale should not be publicly accessible. That reasoning is sound. The question OpenMythos raises is whether that decision is durable — whether restricting access to a model is meaningful when the architectural components required to reconstruct it are available to anyone with a GPU cluster and a research library.

This is not a hypothetical tension unique to Mythos. It is the central unresolved question in AI safety policy: containment strategies built on access restriction assume the knowledge required to build the contained thing remains scarce. The research literature suggests it does not stay scarce for long.

For anyone building AI strategy inside organizations that depend on cybersecurity, the timeline on this matters. Our team at Winsome Marketing helps growth leaders track the AI developments that carry real operational implications. Let's talk.