Blog · April 1, 2026 · 7 min read

What the Claude Code Leak Reveals About AI Agent Architecture

Octave Olivetti
Tech stack
Claude Code · TypeScript · Agent Architecture · LLM · Prompt Caching
On March 31st, 2026, the entire source code of Claude Code ended up on the internet. The cause: a .map file included by mistake in the npm package. JavaScript source is technically public, but production builds ship it minified into an unreadable blob. Source maps exist for debugging: they link each line of the minified code back to the original, line by line. The catch is that a source map embeds all the original source code: every file, every comment, every internal constant. It is generally not meant to be published.
```json
{
  "version": 3,
  "sources": ["../src/main.tsx", "../src/tools/BashTool.ts"],
  "sourcesContent": ["// The entire original source code of each file"],
  "mappings": "AAAA,SAAS,OAAO..."
}
```
Anthropic uses Bun to bundle Claude Code into a single distributable file. Bun generates source maps by default unless you explicitly turn them off, and someone forgot to disable or delete them. The result: npm served 390,000 lines of TypeScript to anyone who downloaded the package, letting anyone look under the hood of what is probably the most advanced AI coding agent to date and find a few secrets. One irony is hard to ignore: Claude Code ships an entire subsystem called "Undercover Mode", designed to prevent the AI from leaking Anthropic's internal information in its commits. You build an anti-leak system for the AI, then ship all the secrets in a .map file. But beyond the irony, there is plenty to take away from what was laid bare. Here are a few highlights.
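For reference, Bun's build CLI exposes a sourcemap flag; a hardened build script would pin it off explicitly rather than rely on defaults (flag values per Bun's documentation at the time of writing, so double-check against your Bun version):

```shell
# Explicitly disable source maps instead of trusting the default.
# --sourcemap accepts: none | linked | inline | external
bun build ./src/main.tsx --outdir ./dist --minify --sourcemap=none

# Belt and suspenders: make sure no .map file sneaks into the package anyway
rm -f ./dist/*.map
```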
Claude Code looks like a polished CLI. Under the surface, it is a massive codebase: terminal rendering engine, 40+ tools, multi-agent orchestration system, background memory engine. The starting point is simple. Every AI agent runs the same loop:
```
User → messages[] → Claude API → response
If tool_use → execute → append result → loop back
```
This is the minimal agent loop. You send a prompt, the model responds, and if it wants to use a tool, you execute it, feed the result back, and loop. 50 lines of code will do it. The other 389,950 lines make this loop reliable in production.
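Those 50 lines look roughly like this. A minimal sketch in TypeScript, with the model call stubbed out so it is self-contained; the message shapes and `callModel` are illustrative stand-ins, not Anthropic's actual SDK types:

```typescript
// Minimal agent loop sketch. `callModel`, `Message`, and the tool registry
// are hypothetical stand-ins for a real LLM client.
type ToolCall = { name: string; input: string };
type ModelResponse = { text: string; toolCall?: ToolCall };
type Message = { role: "user" | "assistant" | "tool"; content: string };

// The tools the agent may invoke, keyed by name.
const tools: Record<string, (input: string) => string> = {
  echo: (input) => `echoed: ${input}`,
};

// One step of "model inference", stubbed so the sketch runs without an API key.
function callModel(messages: Message[]): ModelResponse {
  const last = messages[messages.length - 1];
  if (last.role === "user") {
    return { text: "", toolCall: { name: "echo", input: last.content } };
  }
  return { text: `done after ${messages.length} messages` };
}

// The loop itself: send history, execute any requested tool,
// feed the result back, repeat until the model answers without a tool call.
export function runAgent(prompt: string): string {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (;;) {
    const response = callModel(messages);
    if (!response.toolCall) return response.text;
    const result = tools[response.toolCall.name](response.toolCall.input);
    messages.push({ role: "assistant", content: `tool:${response.toolCall.name}` });
    messages.push({ role: "tool", content: result });
  }
}
```

Swap the stub for a real API client and this is the trunk every agent grows from; everything else in the codebase hardens this loop.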
Every API call to Claude includes a system prompt, the conversation history, and tool definitions. All that text is billed on every call, and the bill adds up fast. Claude Code splits its system prompt into static sections (cacheable across sessions, billed once) and dynamic sections (user-specific content, recomputed on every change). Sub-agents share the same cache: instead of each one spinning up with its own full prompt and paying its own token bill, they branch off from a shared trunk, so everything before the point where their instructions diverge is computed and billed once. Without this, multi-agent would be too expensive to run.

Anyone who has used Claude Code for a long session knows the moment: the context window fills up, the model summarizes the history to make room, and it loses track of what it was doing. Imagine a 3-hour conversation with a developer. At some point, you need to summarize the first two hours to keep going, but the summary forgets that you agreed not to touch file X. The source reveals five compaction strategies to handle this. One of them, for example, preserves explicit decisions ("do not modify this file") intact even when the rest of the conversation gets summarized. Five strategies, and Anthropic is still iterating: the problem is far from solved.

Permissions get the same layered treatment. Five levels: policy, flags, local, project, user. Every tool action is classified as low, medium, or high risk, and an ML classifier decides automatically whether to approve it.
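The static/dynamic prompt split described above can be sketched with cache breakpoints in the style of Anthropic's public prompt-caching API (`cache_control` content blocks). The split itself is my reconstruction, not Claude Code's actual code:

```typescript
// Sketch: build a system prompt as content blocks, with a cache breakpoint
// after the last static section. Everything up to the breakpoint is billed
// once and served from cache on subsequent calls.
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

export function buildSystemPrompt(
  staticSections: string[],  // identical across sessions -> cacheable
  dynamicSections: string[], // user/project-specific -> recomputed each call
): SystemBlock[] {
  const blocks: SystemBlock[] = staticSections.map((text) => ({ type: "text", text }));
  if (blocks.length > 0) {
    // Cache breakpoint: marks the end of the shared, cacheable trunk.
    blocks[blocks.length - 1].cache_control = { type: "ephemeral" };
  }
  for (const text of dynamicSections) blocks.push({ type: "text", text });
  return blocks;
}
```

Sub-agents reuse the same static trunk, so the cached prefix is shared; only their diverging instructions land in the dynamic tail.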
Naive approach:
- Allow everything or prompt for everything
- No distinction between reading a file and deleting one
- Users either get annoyed or get exposed

Production approach:
- Risk classification per tool action
- Cascading rules from org policy down to user preferences
- Protected files (.gitconfig, .bashrc) guarded by default
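A minimal sketch of how such cascading rules might resolve, assuming the five levels and three risk tiers described above (the rule shapes and names are illustrative, not the leaked code's):

```typescript
// Sketch: first matching rule wins, checked from highest-priority source
// (org policy) down to lowest (user preferences). Unmatched actions fall
// back on their risk tier.
type Risk = "low" | "medium" | "high";
type Verdict = "allow" | "ask" | "deny";
type Rule = { match: RegExp; verdict: Verdict };

const levels: { name: string; rules: Rule[] }[] = [
  { name: "policy", rules: [{ match: /\.bashrc$|\.gitconfig$/, verdict: "deny" }] },
  { name: "flags", rules: [] },
  { name: "local", rules: [] },
  { name: "project", rules: [{ match: /^src\//, verdict: "allow" }] },
  { name: "user", rules: [] },
];

export function resolve(path: string, risk: Risk): Verdict {
  for (const level of levels) {
    for (const rule of level.rules) {
      if (rule.match.test(path)) return rule.verdict; // first match wins
    }
  }
  // No explicit rule: low-risk actions pass, anything else asks the user.
  return risk === "low" ? "allow" : "ask";
}
```

The key design property: a user preference can never loosen what org policy forbids, because policy is consulted first.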
The code also handles attacks. A malicious prompt can ask to access ../../etc/passwd to climb the directory tree and escape the project. Another can encode a file path in Unicode to bypass a protection rule. Each vector has its own countermeasure.
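Two of those countermeasures are easy to sketch: a containment check against `../` traversal, and Unicode normalization before rule matching. This is my reconstruction of the idea; the leaked code's actual checks may differ:

```typescript
import * as path from "node:path";

// Countermeasure 1: resolve ".." segments, then verify the result
// still lives under the project root.
export function isInsideProject(projectRoot: string, requested: string): boolean {
  const resolved = path.resolve(projectRoot, requested);
  const root = path.resolve(projectRoot);
  return resolved === root || resolved.startsWith(root + path.sep);
}

// Countermeasure 2: NFC normalization collapses lookalike encodings
// (e.g. "e" + combining accent vs the precomposed "é"), so a protection
// rule written against one form also catches the other.
export function normalizeForMatching(p: string): string {
  return p.normalize("NFC");
}
```

Note that `isInsideProject("/proj", "../../etc/passwd")` resolves to `/etc/passwd`, fails the prefix check, and is rejected before any tool runs.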
The codebase is well ahead of the public release. Most of what follows is behind internal feature flags.

autoDream is Claude Code's memory engine. It runs in the background as a dedicated sub-agent. It is Claude, dreaming. It triggers when three conditions are met: 24 hours since the last dream, at least 5 sessions elapsed, and no other dream currently running. Four phases:
  1. Orient: scan the memory directory, read existing topic files
  2. Gather: identify new information worth persisting
  3. Consolidate: write or update memory files, convert relative dates to absolute, delete contradicted facts
  4. Prune: keep the memory index under 200 lines and 25KB
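The Prune phase's limits are concrete enough to sketch: cap the memory index at 200 lines and 25KB. The eviction order (oldest entries first) is my assumption, not something the article states:

```typescript
// Sketch of the Prune phase: enforce the 200-line / 25KB budget on the
// memory index, dropping lines from the front (assumed oldest-first).
const MAX_LINES = 200;
const MAX_BYTES = 25 * 1024;

export function pruneIndex(index: string): string {
  let lines = index.split("\n");
  if (lines.length > MAX_LINES) lines = lines.slice(lines.length - MAX_LINES);
  let out = lines.join("\n");
  // Trim whole lines until the byte budget is met (UTF-8 aware).
  while (Buffer.byteLength(out, "utf8") > MAX_BYTES && lines.length > 1) {
    lines = lines.slice(1);
    out = lines.join("\n");
  }
  return out;
}
```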
The dream sub-agent gets read-only access: it can look at your project but cannot modify anything. If you have read my previous article on context engineering, the parallel is direct. The dream system applies context engineering to the agent itself: instead of building one pipeline per piece of knowledge, it maintains a living context layer, queryable from one session to the next.

KAIROS is an always-on mode. Claude no longer waits for you to type: it watches, logs, and acts on its own. At regular intervals, it receives <tick> prompts and decides whether to act or stay silent. Guardrail: any proactive action that would block the workflow for more than 15 seconds gets deferred. It gets exclusive tools like push notifications and PR subscriptions. In short, a shift from "ask then answer" to "observe then act."

Coordinator mode turns Claude Code into an orchestrator. It creates, directs, and manages multiple workers in parallel, each with its own tool access and instructions. The prompt is explicit: "Workers are async. Launch independent workers concurrently whenever possible." Workers communicate via XML messages and share a common workspace.
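The KAIROS guardrail described above is a simple scheduling rule. A sketch, where the 15-second threshold comes from the article and everything else (names, shapes) is illustrative:

```typescript
// Sketch: on each <tick>, proactive actions whose estimated blocking time
// exceeds 15 seconds are deferred instead of run immediately.
type ProactiveAction = { name: string; estimatedBlockingMs: number };

const MAX_BLOCKING_MS = 15_000;

export function onTick(
  actions: ProactiveAction[],
): { run: ProactiveAction[]; deferred: ProactiveAction[] } {
  const run: ProactiveAction[] = [];
  const deferred: ProactiveAction[] = [];
  for (const action of actions) {
    (action.estimatedBlockingMs > MAX_BLOCKING_MS ? deferred : run).push(action);
  }
  return { run, deferred };
}
```

The point of the rule: an always-on agent is only tolerable if it can never hold your terminal hostage; anything expensive waits for a quiet moment.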
What strikes me in the end is less the hidden features than how much of the code has nothing to do with AI itself. Caching, memory, permissions, orchestration: these are classic infrastructure problems, solved with classic engineering patterns. The LLM is at the center, but most of the work happens around it. And these layers are not specific to coding agents. Any system where an LLM needs to act on its own and interact with the real world safely will end up building them.

One last detail. The source confirms that Anthropic employees use Claude Code to contribute to open-source projects. Undercover Mode instructs the AI: "Do not blow your cover. Your commit messages and PR descriptions MUST NOT contain any Anthropic-internal information." It strips all AI attribution and writes commits "as a human developer would." How much of the open-source world is already being quietly shaped by these systems? This time, we only know because someone forgot a file.

These agents are entering enterprise workflows. The code they produce, the decisions they make, the way they interact with existing systems, all of this calls for governance practices that don't exist yet. Knowing how to build with these tools won't be enough. We'll also need to know how to govern them.