Reverse-engineering the architecture of a production agentic coding system: five values, thirteen principles, and the design space every AI agent must navigate.
You start typing a function name and your editor suggests the next line. That was 2021 — autocomplete. By 2024, you could ask a chat assistant to rewrite an entire file. But neither actually does anything. They suggest. You copy-paste. You run the tests. You fix what broke.
Now imagine something different: you describe a bug, and the tool reads the code, runs the failing test, edits three files, re-runs the test, sees it pass, and tells you what it did. The tool acts in your codebase, autonomously, in a loop.
This is the shift from suggestion to agency. And it introduces architectural requirements that have no counterpart in autocomplete tools.
Those requirements include safety (what if the model tries to run rm -rf /?), context management (what if the conversation exceeds the model's memory?), extensibility (what if the user needs custom tools?), and persistence (what if the user closes the terminal mid-task?).
This paper by Liu et al. reverse-engineers Claude Code, Anthropic's agentic coding tool, from its publicly available TypeScript source code. The goal is not to document one product, but to map the design space that every production AI agent must navigate: recurring questions about safety posture, context management, extensibility, delegation, and persistence.
The authors identify a remarkable ratio: only about 1.6% of Claude Code's codebase is AI decision logic. The remaining 98.4% is deterministic infrastructure — the operational harness that makes agency safe, reliable, and useful. The core agent loop is trivially simple. Everything interesting lives around it.
Before looking at code, the paper asks a deeper question: what does the system believe matters? Every architectural decision in Claude Code traces back to five human values that its creators prioritize. These are not abstract philosophy — they produce concrete implementation choices.
The human retains ultimate control. Not "the human can technically override," but "the architecture is designed so that humans can observe, approve, reject, interrupt, and audit." When Anthropic found that users approve 93% of permission prompts (approval fatigue), the response was not more warnings. It was restructuring the problem: defined sandboxed boundaries within which the agent works freely, reducing the number of decisions humans must make rather than adding more.
Safety is distinct from authority. Authority is the human's power to choose; safety is the system's obligation to protect even when that power lapses. The auto-mode threat model targets four risk categories: overeager behavior, honest mistakes, prompt injection, and model misalignment.
The agent does what the human actually meant, stays coherent over time, and supports verification. This spans single-turn correctness and long-horizon dependability across context boundaries, session resumption, and multi-agent delegation.
Approximately 27% of Claude Code-assisted tasks (per Anthropic's internal survey of 132 engineers) were work that would not have been attempted without the tool. The system enables qualitatively new workflows, not just faster existing ones. The architecture invests in deterministic infrastructure rather than decision scaffolding.
The system fits the user's specific project, tools, conventions, and skill level — and the relationship improves over time. Auto-approve rates increase from ~20% at fewer than 50 sessions to over 40% by 750 sessions. Trust is co-constructed, not fixed.
The core of Claude Code is a while loop. Seriously. The queryLoop() function in query.ts is an async generator that repeats: call the model, check if the response contains tool calls, execute the tools, feed results back, repeat. When the model produces only text (no tool calls), the turn is complete.
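The loop described above can be sketched in a few lines. This is a minimal illustration, not the actual queryLoop() source: the real function is an async generator, while this version uses a synchronous generator and a stubbed model so it runs standalone. The names Message, ModelFn, ToolFn, agentLoop, and the maxTurns cap are all hypothetical.

```typescript
// Hypothetical message shapes; the real types in query.ts are not shown in the source.
type ToolCall = { id: string; name: string; input: string };
type Message =
  | { role: "user"; text: string }
  | { role: "assistant"; text: string; toolCalls: ToolCall[] }
  | { role: "tool_result"; toolCallId: string; output: string };

type ModelFn = (history: Message[]) => { text: string; toolCalls: ToolCall[] };
type ToolFn = (call: ToolCall) => string;

// Synchronous stand-in for the async generator: yields every message it appends.
function* agentLoop(
  prompt: string,
  model: ModelFn,
  runTool: ToolFn,
  maxTurns = 10, // assumed safety cap for this sketch only
): Generator<Message> {
  const history: Message[] = [{ role: "user", text: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = model(history);
    const assistant: Message = {
      role: "assistant",
      text: reply.text,
      toolCalls: reply.toolCalls,
    };
    history.push(assistant);
    yield assistant;
    // Text only, no tool calls: the turn is complete.
    if (reply.toolCalls.length === 0) return;
    // Otherwise execute each tool and feed the result back to the model.
    for (const call of reply.toolCalls) {
      const result: Message = {
        role: "tool_result",
        toolCallId: call.id,
        output: runTool(call),
      };
      history.push(result);
      yield result;
    }
  }
}

// Stub model: requests one test run, then answers in plain text.
const stubModel: ModelFn = (history) =>
  history.some((m) => m.role === "tool_result")
    ? { text: "The test passes now.", toolCalls: [] }
    : { text: "Running the test.", toolCalls: [{ id: "t1", name: "Bash", input: "npm test" }] };

const transcript = Array.from(agentLoop("Fix failing test", stubModel, () => "1 passed"));
```

With the stub above, the transcript contains an assistant message carrying a tool call, the tool result, and a final text-only assistant message that ends the turn.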
Each iteration of the loop follows a fixed sequence:
- The State object stores messages, tool context, compaction tracking, and recovery counters. It is updated via whole-object assignment at seven "continue sites."
- If the response contains tool_use blocks, they are routed to the tool orchestration layer.
- Tool outputs are appended as tool_result messages, and the loop continues.
- No tool_use blocks in the response means the turn is complete.
When the model emits multiple tool calls, the StreamingToolExecutor begins executing them as they stream in, reducing latency. Read-only operations (file reads, searches) run in parallel. State-modifying operations (shell commands, file edits) are serialized. A sibling abort controller fires when any Bash tool errors, killing other in-flight subprocesses.
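The parallel-versus-serial rule for tool calls can be illustrated with a small planner. This is a sketch under assumptions, not the StreamingToolExecutor itself: it assumes that consecutive read-only calls are grouped into one parallel batch and that each state-modifying call runs alone. The planBatches function, the readOnly flag, and the tool names other than BashTool and FileReadTool are hypothetical.

```typescript
type PlannedCall = { name: string; readOnly: boolean };
type Batch = { parallel: boolean; calls: PlannedCall[] };

// Group calls in arrival order: adjacent read-only calls share a parallel batch,
// while every state-modifying call gets its own serial batch.
function planBatches(calls: PlannedCall[]): Batch[] {
  const batches: Batch[] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    if (call.readOnly && last !== undefined && last.parallel) {
      last.calls.push(call);
    } else {
      batches.push({ parallel: call.readOnly, calls: [call] });
    }
  }
  return batches;
}

const plan = planBatches([
  { name: "FileReadTool", readOnly: true },
  { name: "SearchTool", readOnly: true }, // hypothetical tool name
  { name: "BashTool", readOnly: false },
  { name: "FileEditTool", readOnly: false }, // hypothetical tool name
  { name: "FileReadTool", readOnly: true },
]);
```

Here the two leading reads share one parallel batch, the shell command and the edit each run alone, and the trailing read starts a new parallel batch, which preserves ordering around state changes.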
Five conditions can terminate the loop, among them:
- prompt_too_long after recovery attempts fail.
- hook_stopped_continuation.
The loop also includes several self-healing behaviors, such as the recovery attempts that precede a prompt_too_long termination.
The running example from here on is the task "Fix failing test in auth.test.ts" as it flows through the agent loop.
When the model decides to run npm test to reproduce the auth test failure, the request enters a multi-layered permission pipeline. The default posture: deny or ask, never allow silently.
The system offers a graduated autonomy spectrum — from fully supervised to nearly autonomous:
| Mode | Behavior | Autonomy |
|---|---|---|
| plan | Model creates a plan; execution only after user approval | Lowest |
| default | Standard interactive use; most operations need user approval | Low |
| acceptEdits | File edits + certain shell commands auto-approved; others need approval | Medium |
| auto | ML classifier evaluates safety; auto-approves or escalates | High |
| dontAsk | No prompting, but deny rules still enforced | Higher |
| bypassPermissions | Skips most prompts; safety-critical checks and bypass-immune rules remain | Highest |
| bubble | Internal-only: subagent permissions escalate to parent terminal | N/A |
A request must pass through all applicable layers, and any single layer can block it.
When enabled, the auto-mode classifier loads a base system prompt, an external permissions template, and (for internal users) a separate internal template. It evaluates the proposed tool invocation against the conversation transcript and produces one of three outcomes: allow, deny, or request manual approval.
Crucially, when a deny occurs, the system treats it as a routing signal, not a hard stop. The model receives the denial reason, revises its approach, and attempts a safer alternative in the next loop iteration.
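The layered, deny-first evaluation and the "deny as routing signal" idea can be sketched as follows. This is an illustration under assumptions, not the real pipeline: the Layer signature, the two example layers, and the wording of the feedback message are hypothetical. The only behaviors taken from the text are that any layer can block, that the default is to ask rather than allow silently, and that a denial reason is returned to the model.

```typescript
type Decision =
  | { kind: "allow" }
  | { kind: "ask" }
  | { kind: "deny"; reason: string };

type ToolRequest = { tool: string; command: string };
// A layer returns a decision, or null when it has no opinion on the request.
type Layer = (req: ToolRequest) => Decision | null;

function evaluate(req: ToolRequest, layers: Layer[]): Decision {
  let sawAllow = false;
  let sawAsk = false;
  for (const layer of layers) {
    const decision = layer(req);
    if (decision === null) continue;
    if (decision.kind === "deny") return decision; // any single layer can block
    if (decision.kind === "ask") sawAsk = true;
    else sawAllow = true;
  }
  // Never allow silently: without an explicit allow, fall back to asking.
  return sawAllow && !sawAsk ? { kind: "allow" } : { kind: "ask" };
}

// A denial becomes feedback for the next loop iteration instead of a hard stop.
function denialFeedback(decision: Decision): string | null {
  return decision.kind === "deny"
    ? `Permission denied: ${decision.reason}. Try a safer alternative.`
    : null;
}

// Hypothetical example layers.
const denyRules: Layer = (req) =>
  req.command.includes("rm -rf") ? { kind: "deny", reason: "matches a deny rule" } : null;
const allowRules: Layer = (req) => (req.command === "npm test" ? { kind: "allow" } : null);

const layers: Layer[] = [denyRules, allowRules];
const testRun = evaluate({ tool: "Bash", command: "npm test" }, layers);
const wipe = evaluate({ tool: "Bash", command: "rm -rf /" }, layers);
const unknownCmd = evaluate({ tool: "Bash", command: "curl example.com" }, layers);
```

In this sketch the test run is allowed, the destructive command is denied with a reason the model can act on, and the unrecognized command falls through to a user prompt.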
By the time our "fix auth.test.ts" task has run a few iterations, the context window is filling up: the original request, npm test output, file reads, error messages, edit attempts, re-test outputs. The context window (200K–1M tokens) is the binding resource constraint — the one resource that, when exhausted, halts everything.
Claude Code does not use simple truncation. It uses a five-layer compaction pipeline that applies progressively more aggressive compression, escalating only when cheaper strategies prove insufficient.
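The escalation idea, applying the cheapest layer first and stopping as soon as the context fits, can be sketched like this. The layer names and reduction ratios below are placeholders rather than the real pipeline stages, and compactToBudget is a hypothetical helper.

```typescript
type CompactionLayer = { name: string; apply: (tokens: number) => number };

// Apply layers from cheapest to most aggressive, stopping once under budget.
function compactToBudget(
  tokens: number,
  budget: number,
  layers: CompactionLayer[],
): { tokens: number; applied: string[] } {
  const applied: string[] = [];
  let current = tokens;
  for (const layer of layers) {
    if (current <= budget) break; // cheaper layers were sufficient
    current = layer.apply(current);
    applied.push(layer.name);
  }
  return { tokens: current, applied };
}

// Placeholder layers: each keeps a smaller fraction of the context than the last.
const pipeline: CompactionLayer[] = [
  { name: "layer-1", apply: (t) => Math.floor((t * 9) / 10) },
  { name: "layer-2", apply: (t) => Math.floor((t * 8) / 10) },
  { name: "layer-3", apply: (t) => Math.floor((t * 7) / 10) },
  { name: "layer-4", apply: (t) => Math.floor((t * 5) / 10) },
  { name: "layer-5", apply: (t) => Math.floor((t * 2) / 10) },
];

const lowPressure = compactToBudget(90_000, 100_000, pipeline);
const mediumPressure = compactToBudget(120_000, 100_000, pipeline);
```

At low pressure no layer fires. At 120,000 tokens against a 100,000-token budget, the first two placeholder layers run and the remaining three are never touched.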
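Context pressure shapes decisions across the entire system, not just the compaction pipeline: extension mechanisms are graded by their context cost, and subagents return only a summary to the parent.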
Six layers are assembled into the context window, each loaded at a different time.
Once Claude is trying to repair auth.test.ts and the npm test command has passed through the permission system, the next question is: what tools are available for the repair? The model sees not just built-in tools like BashTool and FileReadTool, but also database queries from an MCP server, a custom lint skill, and tools from an installed plugin. These arrive through four distinct mechanisms.
Why four mechanisms instead of one? The answer lies in context cost. Different kinds of extensibility consume different amounts of the bounded context window, and a single mechanism cannot span the full range without forcing unnecessary trade-offs.
| Mechanism | Unique Capability | Context Cost | Insertion Point |
|---|---|---|---|
| MCP Servers | External service integration (multi-transport: stdio, SSE, HTTP, WebSocket) | High (tool schemas) | Tool pool |
| Plugins | Multi-component packaging + distribution (10 component types) | Medium (varies) | All three points |
| Skills | Domain-specific instructions + meta-tool invocation | Low (descriptions only) | Context injection |
| Hooks | Lifecycle interception + event-driven automation (27 event types) | Zero by default | Pre/post tool execution |
Every agent loop iteration has three points where extensions can plug in: the tool pool, context injection, and pre/post tool execution.
The assembleToolPool() function is the single source of truth for combining built-in and external tools. It follows a five-step pipeline: base tool enumeration (up to 54 tools), mode filtering, deny rule pre-filtering, MCP tool integration, and deduplication (built-ins take precedence).
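The five steps can be sketched as a chain of filters. This is not the actual assembleToolPool() implementation: the Tool shape, the modes field, the decision to apply deny rules to MCP tools as well, and the example tool names other than BashTool and FileReadTool are assumptions made for illustration.

```typescript
type Tool = { name: string; source: "builtin" | "mcp"; modes?: string[] };

function assembleToolPoolSketch(
  builtins: Tool[],
  mcpTools: Tool[],
  mode: string,
  denied: Set<string>,
): Tool[] {
  // 1. Base tool enumeration.
  const base = [...builtins];
  // 2. Mode filtering (assumption: a tool without a modes list is available in every mode).
  const forMode = base.filter((t) => t.modes === undefined || t.modes.includes(mode));
  // 3. Deny rule pre-filtering.
  const permitted = forMode.filter((t) => !denied.has(t.name));
  // 4. MCP tool integration (assumption: deny rules apply to MCP tools too).
  const combined = [...permitted, ...mcpTools.filter((t) => !denied.has(t.name))];
  // 5. Deduplication by name; built-ins come first, so they take precedence.
  const seen = new Set<string>();
  return combined.filter((t) => {
    if (seen.has(t.name)) return false;
    seen.add(t.name);
    return true;
  });
}

const pool = assembleToolPoolSketch(
  [
    { name: "BashTool", source: "builtin" },
    { name: "FileReadTool", source: "builtin" },
    { name: "PlanOnlyTool", source: "builtin", modes: ["plan"] }, // hypothetical
  ],
  [
    { name: "db_query", source: "mcp" }, // hypothetical MCP tool
    { name: "FileReadTool", source: "mcp" }, // name collision with a built-in
  ],
  "default",
  new Set(["BashTool"]),
);
```

In this example the plan-only tool is dropped by mode filtering, BashTool by the deny rule, and the colliding MCP FileReadTool by deduplication, leaving the built-in FileReadTool and db_query.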
When Claude determines that fixing the auth test requires first understanding the authentication module's structure, it can delegate this exploration to a subagent. The delegation mechanism is the Agent tool — a meta-tool that spawns an isolated child agent running its own instance of the same queryLoop().
Up to six built-in subagent types are available, depending on feature flags.
Beyond built-ins, users define custom subagents via .claude/agents/*.md files. Each file's markdown body is the agent's system prompt, and YAML frontmatter specifies tools, model, permissions, hooks, memory scope, and isolation mode. A custom agent is a fully configured, isolated sub-system.
| Mode | How | Trade-off |
|---|---|---|
| Worktree | Creates a temporary git worktree — the subagent gets its own copy of the repository | Filesystem-level separation with zero external dependencies |
| Remote | Launches in a remote Claude Code environment (internal-only), always background | Full environment isolation but requires infrastructure |
| In-process | Shares the filesystem with parent but has its own isolated conversation context | Lightest weight, but file conflicts possible |
When a subagent defines a permissionMode, the override applies unless the parent is already in bypassPermissions, acceptEdits, or auto mode — those always take precedence because they represent explicit user decisions about autonomy.
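The override rule reduces to a short function. This is a sketch of the rule as stated, with one added assumption: a subagent that defines no permissionMode inherits the parent's mode. The function name resolveSubagentMode is hypothetical.

```typescript
type PermissionMode =
  | "plan"
  | "default"
  | "acceptEdits"
  | "auto"
  | "dontAsk"
  | "bypassPermissions"
  | "bubble";

// Parent modes that represent explicit user decisions about autonomy.
const PARENT_TAKES_PRECEDENCE: ReadonlySet<PermissionMode> = new Set<PermissionMode>([
  "bypassPermissions",
  "acceptEdits",
  "auto",
]);

function resolveSubagentMode(
  parent: PermissionMode,
  requested?: PermissionMode,
): PermissionMode {
  if (PARENT_TAKES_PRECEDENCE.has(parent)) return parent;
  // Assumption: with no override defined, the subagent inherits the parent's mode.
  return requested ?? parent;
}

const overridden = resolveSubagentMode("default", "plan");
const parentWins = resolveSubagentMode("auto", "plan");
const inherited = resolveSubagentMode("default");
```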
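Whether a subagent may show permission prompts is resolved in a fixed order: an explicit canShowPermissionPrompts setting is checked first, then bubble mode (always show, since prompts escalate to the parent), then the default (sync agents show prompts, async agents do not).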
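That resolution order fits in three lines of logic. The AgentContext shape and the shouldShowPrompts name are hypothetical; only canShowPermissionPrompts and the bubble mode come from the text.

```typescript
type AgentContext = {
  canShowPermissionPrompts?: boolean; // explicit setting, if any
  permissionMode: string;
  isAsync: boolean;
};

function shouldShowPrompts(agent: AgentContext): boolean {
  // 1. An explicit setting always wins.
  if (agent.canShowPermissionPrompts !== undefined) return agent.canShowPermissionPrompts;
  // 2. Bubble mode always shows, because prompts escalate to the parent.
  if (agent.permissionMode === "bubble") return true;
  // 3. Default: sync agents show prompts, async agents do not.
  return !agent.isAsync;
}

const explicitOff = shouldShowPrompts({
  canShowPermissionPrompts: false,
  permissionMode: "bubble",
  isAsync: false,
});
const bubbleAsync = shouldShowPrompts({ permissionMode: "bubble", isAsync: true });
const plainAsync = shouldShowPrompts({ permissionMode: "default", isAsync: true });
const plainSync = shouldShowPrompts({ permissionMode: "default", isAsync: false });
```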
Each subagent writes its own transcript as a separate .jsonl file with a .meta.json metadata file. This sidechain design means subagent histories are preserved for debugging and auditing but do not inflate the parent's session file.
By now our auth-test task has accumulated a full transcript: the original prompt, tool invocations and results, compaction boundaries, and the subagent summary. The question: which artifacts are durably recorded, and what can be recovered later?
Session transcripts are stored as mostly append-only JSONL files (with explicit cleanup rewrites as a rare exception). Every event is human-readable, version-controllable, and reconstructable without specialized tooling. Three channels operate independently:
- Session transcripts: the mostly append-only JSONL files described above.
- Prompt history: history.jsonl. Supports Up-arrow and Ctrl+R navigation.
- Subagent sidechains: a .jsonl transcript plus a .meta.json file per subagent.
The --resume flag rebuilds the conversation by replaying the transcript. Fork creates a new session from an existing one. But neither restores session-scoped permissions. Users must grant them again.
This is a deliberate safety-conservative choice: sessions are treated as isolated trust domains. Restoring previously granted permissions on resume would risk carrying stale trust decisions into a changed context. The architecture accepts user friction as the cost of the safety invariant that trust is always established in the current session.
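The resume behavior can be sketched with an in-memory transcript. This is an illustration under assumptions, not the real session loader: the event shape, the Session type, and the permission string are hypothetical, and the sketch assumes session-scoped grants live only in memory and are never written to the transcript.

```typescript
type TranscriptEvent = { type: "user" | "assistant"; text: string };
type Session = { messages: TranscriptEvent[]; sessionPermissions: Set<string> };

// In-memory stand-in for the append-only JSONL file: one JSON document per line.
const transcriptLines: string[] = [];

function record(session: Session, event: TranscriptEvent): void {
  session.messages.push(event);
  transcriptLines.push(JSON.stringify(event)); // append only, never rewrite
}

function parseJsonl(jsonl: string): TranscriptEvent[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TranscriptEvent);
}

// Resume replays the conversation but starts with an empty set of grants.
function resume(jsonl: string): Session {
  return { messages: parseJsonl(jsonl), sessionPermissions: new Set<string>() };
}

const originalSession: Session = {
  messages: [],
  sessionPermissions: new Set<string>(["Bash(npm test)"]), // illustrative grant
};
record(originalSession, { type: "user", text: "Fix failing test in auth.test.ts" });
record(originalSession, { type: "assistant", text: "Running npm test." });

const resumedSession = resume(transcriptLines.join("\n"));
```

The resumed session sees both messages but holds no permissions, so the first npm test after resume would prompt again.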
The compact_boundary marker records headUuid, anchorUuid, and tailUuid. These UUIDs enable the session loader to patch the message chain at read time. The mostly-append design means compaction never modifies or deletes previously written transcript lines; it only appends new boundary and summary events.
File checkpoints are stored under ~/.claude/filehistory/<sessionId>/. They support --rewind-files for reverting filesystem changes; these are file snapshots, not a generic checkpoint store.
The paper does not just analyze Claude Code. It compares it with OpenClaw, an independent open-source AI agent system that answers the same design questions from a completely different deployment context. OpenClaw is a local-first WebSocket gateway connecting ~24 messaging surfaces (including WhatsApp, Telegram, Slack, Discord, and Signal) to an embedded agent runtime.
The comparison reveals that the design questions are stable — every agent must answer them. But the answers vary with context.
| Dimension | Claude Code | OpenClaw |
|---|---|---|
| System scope | Ephemeral CLI process, single repository | Persistent WebSocket gateway daemon, multi-channel control plane |
| Trust model | Deny-first per-action evaluation + ML classifier; 7 modes | Single trusted operator per gateway; DM pairing + allowlists; opt-in sandboxing |
| Agent runtime | queryLoop() async generator IS the system center | Agent runner is embedded INSIDE a larger gateway dispatch |
| Extension arch | 4 mechanisms at graduated context costs | Manifest-first plugin system with 12 capability types + central registry |
| Memory/context | CLAUDE.md 4-level hierarchy; 5-layer compaction | Workspace bootstrap files (AGENTS.md, SOUL.md, etc.); dreaming for long-term memory; hybrid vector+keyword search |
| Multi-agent | Task-delegating subagents; worktree isolation; summary-only return | Multi-agent routing with isolated agents + sub-agent delegation with depth limits |
This paper maps a design space. Let's connect it to the broader landscape and surface the open questions that remain.
| Aspect | Claude Code |
|---|---|
| Core loop | while-true: model call → tool dispatch → result append → repeat |
| Design philosophy | 1.6% decision logic, 98.4% deterministic infrastructure |
| Permission system | 7 modes, deny-first, ML classifier, 7 independent safety layers |
| Context management | 5-layer graduated compaction pipeline |
| Extensibility | 4 mechanisms at graduated context costs (hooks → skills → plugins → MCP) |
| Delegation | Subagents with isolated context, summary-only return, worktree isolation |
| Persistence | Mostly append-only JSONL; no permission restoration on resume |
| Safety posture | Deny-first with human escalation; defense in depth |
| Tool pool | Up to 54 built-in + MCP tools via assembleToolPool() |
| Key insight | The design questions are universal; the answers vary with deployment context |