On March 31, 2026, Anthropic accidentally published source maps alongside the Claude Code npm package (v2.1.88). A single missing line in .npmignore.[1] Those 59.8 MB of source maps exposed roughly 512,000 unminified lines of TypeScript across nearly 1,900 files. Anthropic described it as “not a security breach but a deployment packaging mistake, a human error,” and advised users to pin to v2.1.87.
In the exposed codebase, a single function in one file (print.ts) ran to 3,167 lines.[2]
The world’s best-selling coding agent — the one marketed as handing us the reins over our code — had put no reins on itself.
I’m not raising this to mock. Quite the opposite. This incident is the clearest answer I’ve found to a question I’ve long struggled to answer cleanly.
“Reins Engineering — isn’t that just harness engineering?”
It’s a good question. A sharp one. The two clearly resemble each other. Both live outside the model. Both are non-model structures built in code. Both prevent the agent from going sideways. So the suspicion that “reins are just one part of the harness” is entirely fair.
I struggled to answer this cleanly for a long time. Once I found the answer, I realized that answer itself is the most precise description of what Reins is. And the leak above proves that answer in the concrete.
First: they don’t oppose each other
Think of horse tack. The saddle and bridle fitted to the horse, and the two straps running from the bridle to the rider’s hands — the reins.
The reins are attached to the bridle. They’re not outside the tack; they’re a component mounted within it. So if you ask “are harness and reins clearly distinct things, are they in opposition?”, the answer is no. They are two parts of the same equipment.
That’s where we have to start. The common copy — “harness is the fence, reins are the steering” — sets them against each other. The moment you do that, you lose. One sentence — “but a verification gate is part of the harness too, isn’t it?” — and the whole frame collapses. Because that’s true. CI, type checks, test suites are all scaffolding outside the model, and that’s exactly what a harness contains.
So the question needs to change. Not whether they oppose, but where they diverge.
Where reins become possible — three nested loops
Before finding where they diverge, we need to see where reins first become possible at all. Viewed as nested loops, there are three.
① Chat loop: LLM → human → LLM. Fully stochastic. Reins impossible.
② Agent loop: LLM → execute → observe → LLM. Execution touches deterministic ground → Reins possible.
③ Reins loop: ② + designed verifiers + ratchet. Reins complete.
The chat loop has nowhere to attach a bridle. We haven’t even mounted the horse. While the LLM responds, a human reads, and it goes back to the LLM — not a single step is deterministic. There’s no iron to take the bit.
The agent loop saddles the horse. The moment execution enters — a compiler runs, a test fails, a file is written — the loop first touches deterministic ground. Something to grip finally exists.
This is precisely why Claude Code succeeded so overwhelmingly. By (half-accidentally) inserting Bash, file I/O, and test execution as deterministic gates inside the loop, it already had “partial reins” that the chat era never had.
This isn’t just my claim. Princeton’s HAL (Holistic Agent Leaderboard) showed, across 21,000+ agent runs, that swapping only the scaffold around the same model shifts accuracy by tens of percentage points.[3] The model is fixed; only the structure wrapping it changes. Addy Osmani summarizes it in one line: “A mediocre model with a great harness beats a great model with a bad harness.” He also noted that the same Opus scores higher inside a custom harness than inside Claude Code itself.[4]
This is the territory the industry calls “harness engineering.” A rightful discovery. And precisely where it’s easy to confuse with reins — both produce results from outside the model.
But ② is only the accidental half of reins. Reins Engineering is the work of intentionally completing that half. Taking the accidentally inserted gate and replacing it with designed verifiers, a ratchet, and decision/implementation separation — lifting ② into ③. The harness discourse proves reins, but it cannot replace them.
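To make the step from ② to ③ concrete, here is a minimal sketch of a reins loop. It is an illustration under stated assumptions, not a description of how Claude Code or any real agent is built: `propose` stands in for the stochastic model call, the verifiers are plain deterministic predicates, and "ratchet" is read here as "an accepted state never loses a check it already passed." All names are hypothetical.

```python
from typing import Callable, Dict, Set, Tuple

Verifier = Callable[[str], bool]  # deterministic check on a candidate artifact


def passing(candidate: str, verifiers: Dict[str, Verifier]) -> Set[str]:
    """Names of the verifiers this candidate satisfies."""
    return {name for name, check in verifiers.items() if check(candidate)}


def reins_loop(
    propose: Callable[[str, int], str],  # stand-in for the stochastic model
    verifiers: Dict[str, Verifier],      # designed, task-specific checks
    start: str,
    max_steps: int = 10,
) -> Tuple[str, bool]:
    """Propose, verify, then accept or reject, until every verifier passes.

    Ratchet (assumed meaning): a proposal is accepted only if it keeps every
    check that already passed, so the accepted state never regresses.
    """
    accepted = start
    held = passing(accepted, verifiers)
    for step in range(max_steps):
        if held == set(verifiers):
            return accepted, True  # defined completion reached, so stop
        candidate = propose(accepted, step)
        now = passing(candidate, verifiers)
        if held <= now:  # ratchet: nothing that was green went red
            accepted, held = candidate, now
    return accepted, held == set(verifiers)
```

The loop stops for one of two reasons only: every designed verifier passes, or the step budget runs out. "This is probably good enough" is not an exit condition.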
Three axes
Two people work with one horse.
One makes the tack. The saddle’s dimensions, the bridle’s strength, the shape of the bit. This applies to any horse, any journey. Make it well once and it’s the same tack from Seoul to Busan. The one who makes it is the tack maker — not the one who rides.
The other holds the reins. They know this journey. Which fork to take left, where the destination is, when we can say “we’ve arrived.” The straps they hold are attached to the same tack, but the signals they send differ journey by journey. The one sending signals is not the tack maker; they are the rider.
Three axes emerge.
- Function. The harness constrains — a boundary preventing action. The reins direct — where to go and when it’s done.
- Lifespan. The harness is made once and reused across all tasks. The reins are designed fresh for each task.
- Ownership. The harness is shipped by the vendor. The reins are authored by the architect.
Claude Code’s loop is identical regardless of what my project is. That’s the harness. Anthropic built and shipped it; it’s the same for every user. But “tenant eviction = photographs of five specified locations” — that definition of completion — was written by me, for this domain, directly. No harness anywhere contains it. That’s the reins.
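The three axes can be sketched in a few lines. This is a toy model under assumptions, not anyone's real API: the `Harness` and `Reins` classes, the tool names, and the five location names are all hypothetical (the essay only says "five specified locations"). The point is where each piece comes from: the harness object is built once and knows nothing about the task, while the completion predicate is written for this one job.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set, Tuple

Step = Tuple[str, str]  # (tool used, fact the step establishes)


@dataclass(frozen=True)
class Harness:
    """Vendor-shipped and task-agnostic: only constrains what may run."""
    allowed_tools: FrozenSet[str]

    def permits(self, tool: str) -> bool:
        return tool in self.allowed_tools


@dataclass(frozen=True)
class Reins:
    """Operator-authored per task: names the goal and decides 'done'."""
    goal: str
    is_done: Callable[[Set[str]], bool]


def run(harness: Harness, reins: Reins, plan: List[Step]) -> Tuple[Set[str], bool]:
    """Apply permitted steps in order; stop as soon as the reins say done."""
    facts: Set[str] = set()
    for tool, fact in plan:
        if not harness.permits(tool):
            continue  # constraint: the harness blocks, it does not direct
        facts.add(fact)
        if reins.is_done(facts):
            return facts, True
    return facts, False


# One harness, reused unchanged. Tool and location names are hypothetical.
HARNESS = Harness(allowed_tools=frozenset({"camera", "file_write"}))

FIVE_LOCATIONS = {"entrance", "living_room", "kitchen", "bathroom", "balcony"}
EVICTION = Reins(
    goal="tenant eviction",
    is_done=lambda facts: {f"photo:{loc}" for loc in FIVE_LOCATIONS} <= facts,
)
```

A second task would reuse `HARNESS` as-is and bring its own `Reins`. Nothing inside `Harness` would need to change, and nothing inside it could have supplied the eviction predicate.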
Dependency flows in one direction
If the two were the same thing, you couldn’t detach one without the other breaking. Let’s try.
Harness without reins. The agent runs. It runs without stopping. But it wanders. It roams a field with no goal marker and no completion judgment, then halts with “this is probably good enough.” We already know this. We call it vibe coding.
Reins without harness. This can’t even exist. You’re holding the reins but there’s no bridle to attach them to. Nowhere to send the signal. You’re gripping a strap in mid-air.
Dependency is one-directional. Reins require harness, but harness does not require reins. Harness runs without reins — it just runs badly. This asymmetry cleanly refutes “they’re the same thing.” If they were the same, detaching either side should bring both down — but harness runs on its own.
The overlap is exactly one cell
So is there genuinely no overlap? There is. Exactly one cell.
The executing verification gate. CI runs inside the agent loop. That enforcement surface belongs to both sides — it’s part of the harness, and it’s what the reins attach to. This is where the question “isn’t that the harness?” is born. Yes — in that one cell, both point at the same thing.
But outside that cell, they diverge.
Harness only             Intersection             Reins only
─────────────────        ─────────────────        ─────────────────
Sandbox / permissions    Verification gate        Definition of completion
Tool wiring              (executing checks)       Cheese-proofing design
Context management                                Proxy ↔ purpose analysis
(directionless           (enforcement             (intent without
containment)             surface)                 substrate)
The harness has directionless containment of its own — the sandbox doesn’t say where to go, it only prevents escape. The reins have intent without an enforcement substrate of their own — the definition of “what counts as done” exists before any gate to enforce it. Neither side fully contains the other. They intersect. They don’t include.
Why “is it a subset?” is the wrong question
“So are reins a subset of harness?”
For one to be a subset of the other, both must be measured on the same axis. But harness is defined by who ships it and how often it’s reused (the substrate axis), and reins are defined by what they do to the trajectory (the function axis). The axes are different.
It’s like asking “is red a subset of heaviness?” There are things that are both red and heavy (= executing gates), but color cannot be contained within weight. The measuring rods differ. The subset relationship is a category error here.
The precise relationship is this: Reins presuppose harness but are not contained within it. A layer placed on top is not a part enclosed inside. What is placed on top extends beyond the substrate.
Where they truly diverge — cheese
Everything so far is structural. But the place where Reins decisively parts from the harness discourse is elsewhere. It’s cheese.
Game designers know this. “Kill 10 rats” is an infamous quest. There’s a gap between what the gate verifies (10 rats dead) and what the designer actually wanted (the player experiencing content), and players exploit that gap. This is called cheese. It’s exactly the same phenomenon AI safety researchers call “specification gaming”: the boat-racing agent that circles score items instead of crossing the finish line, the Tetris agent that pauses the game indefinitely to avoid losing.[5]
My eviction gate gets cheesed too. Five photographs verify “photos exist,” not “the eviction concluded cleanly.” What if the inspector only photographed the clean walls? The gate passes. The moment measurement becomes the target, measurement breaks: Goodhart’s Law.[6]
Here is the core. The harness can only answer “did it pass?” Whether the test is green, whether types match, whether the schema is intact — that’s as far as it goes. But “does that pass capture the purpose?” is something the harness can never answer. What counts as cheese can only be defined by someone who knows the domain’s purpose. Why photographing only the clean walls is fraudulent, why all the rules passed but the purpose is only 90% closed — only the person who knows what this work is for knows that.
That person is the rider. The one holding the reins.
Designing cheese-proof gates — anticipating the points where the proxy can’t follow the purpose — is inherently different for each task, requires domain knowledge, and is authored by the operator. A task-agnostic harness cannot provide this. It’s not that it won’t; since it doesn’t know the domain, it cannot address it in principle.
This is Reins Engineering’s exclusive territory — absent from harness engineering and agentic engineering discourse. They talk about building better tack. Reins specify which door this journey goes through, intact.
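The eviction example can be put in code to show where the proxy and the purpose come apart. This is a sketch under assumptions: the location names and the `full_room` flag are hypothetical, invented here to stand for "someone who knows the job decided what a valid photo is." It does not claim the hardened gate is cheese-proof, only that it closes one gap the naive gate leaves open.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Hypothetical location names; the essay only says "five specified locations".
REQUIRED_LOCATIONS: FrozenSet[str] = frozenset(
    {"entrance", "living_room", "kitchen", "bathroom", "balcony"}
)


@dataclass(frozen=True)
class Photo:
    location: str
    full_room: bool  # hypothetical flag: frame shows the whole room, not one wall


def naive_gate(photos: List[Photo]) -> bool:
    """Proxy only: 'five photos exist'. Five shots of one clean wall pass."""
    return len(photos) >= 5


def hardened_gate(photos: List[Photo]) -> bool:
    """Closes one anticipated gap: each required location needs a full-room shot."""
    covered = {p.location for p in photos if p.full_room}
    return REQUIRED_LOCATIONS <= covered


def cheese(photos: List[Photo]) -> bool:
    """True when a submission satisfies the proxy but not the stricter gate."""
    return naive_gate(photos) and not hardened_gate(photos)
```

Five close-ups of one living-room wall pass `naive_gate` and fail `hardened_gate`. Both gates are equally easy for a harness to execute. What the harness cannot supply is the reason `full_room` matters, and the hardened gate is still a proxy: the next gap has to be anticipated by the same person.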
And so, back to the leak
Now let’s return to the opening irony. Why it’s evidence, not mockery, should be clear.
The horse was a genius. Opus is probabilistic power itself. The saddle worked — Claude Code is the world’s best tack, and HAL’s numbers prove it. Yet the codebase that tack produced drifted into exactly the predicted failure mode. One function, 3,167 lines. The 200-endpoint wall made concrete in code. The leak itself — a missing line in .npmignore — means there was no gate on the deployment artifact.
The company that built the world’s best tack never put that tack on its own stable.
This isn’t antithesis. It’s decisive evidence for the thesis. Reins are not a property that a model or agent has; they are a discipline that is applied. The agent’s intelligence and whether the code that agent produces sits under reins are entirely separate things. A larger model is not the answer. Better tack is not the answer. The discipline of holding the reins, defining completion for this journey directly, and designing gates that block cheese — that is the answer.
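The missing gate on the deployment artifact is small enough to sketch. This is an illustration under assumptions, not a reconstruction of Anthropic's release process: the policy (no `.map` files in a published package), the function names, and the example file names are hypothetical. The input is simply the list of paths that would be packaged, for example the file list that `npm pack --dry-run` prints.

```python
from typing import Iterable, List

# Assumed policy: files with these suffixes must never ship in a published package.
FORBIDDEN_SUFFIXES = (".map",)


def forbidden_files(paths: Iterable[str]) -> List[str]:
    """Return the packaged paths that match a forbidden suffix, sorted."""
    return sorted(p for p in paths if p.lower().endswith(FORBIDDEN_SUFFIXES))


def artifact_gate(paths: Iterable[str]) -> None:
    """Raise ValueError if the artifact contains forbidden files; else return None."""
    leaked = forbidden_files(paths)
    if leaked:
        raise ValueError(f"publish blocked, forbidden files in artifact: {leaked}")
```

Called with `["cli.js", "cli.js.map"]`, `artifact_gate` raises; called with `["cli.js"]`, it returns quietly. The check is trivial. What is not trivial is that someone has to decide, for this product, that source maps in the published artifact count as "not done."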
Therefore
The harness makes the horse run. The reins determine which door it goes through. The harness is fitted once; the reins are held every moment. The harness is shipped by the craftsperson; the reins are held in the rider’s hands.
They don’t oppose each other. They’re different parts of the same tack. But they are different parts. Reins cannot exist without harness, and harness without reins just wanders. And knowing whether this work is properly done — that is always the knowledge of the hands holding the reins.
The next time someone asks “isn’t that just harness engineering?” — answer this:
“The harness is what the vendor ships. The reins are what I design for this quest. There are no reins without harness, but harness without reins just wanders. Even the tool that gave us the reins had no reins on its own code — because reins aren’t something you have; they’re something you apply.”
Related Posts
- AI with Reins: Reins Engineering — The origin of the gate-as-reins reframe
- Who Defines “Done”? — Cheese-proof gates, tasking as quest design
- Why Coding Agents Work and Why They Break — “Generation may be stochastic; verification must be deterministic”
- Why Your Agent Never Stops — A system that can stop = a system with defined completion
- yongol: The 200-Endpoint Wall — Operator-authored specification = reins made real
Further Reading
External pieces that dig deeper — or from different angles — into the boundary this essay covers: harness and reins, how to stop, and gameable specifications.
- How Coding Agents Work — Simon Willison — “A coding agent is a harness wrapping an LLM.” The definitive harness definition to read before this essay.
- Agent Harness Engineering — Addy Osmani — Formalizes harness as its own engineering discipline. Handles termination conditions via ‘sprint contracts’ — the essay that overlaps most directly with this one.
- Reward Hacking in Reinforcement Learning — Lilian Weng — The theoretical backbone of Goodhart’s Law and specification gaming. Underpins why cheese happens, from an RL and RLHF perspective.
- Water Finds a Crack — Soren Johnson — “Given the opportunity, players will optimize the fun out of a game.” A game-design classic showing the human archetype of reward hacking.
- Effective Harnesses for Long-Running Agents — Anthropic — A first-party account of “declaring done when it isn’t.” Termination conditions and deterministic verification in one essay.
[1] The facts of the incident (2026-03-31, v2.1.88, missing .npmignore/Bun source maps, ~512K lines, ~1,900 files, “human error / not a breach” stance, pin to v2.1.87 advisory) cross-verified via The Register (“Anthropic accidentally exposes Claude Code source code”, 2026-03-31), InfoQ, and VentureBeat.
[2] The single-function 3,167-line figure in print.ts confirmed via claudefa.st, “Claude Code Source Leak: Everything Found.”
[3] Kapoor, Narayanan et al., “Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation” (Princeton), arXiv:2510.11977. 9 models × 9 benchmarks, 21,730 agent runs. Isolates the effect of the scaffold by separating model, scaffold, and benchmark. Live leaderboard: hal.cs.princeton.edu.
[4] Addy Osmani, “Agent Harness Engineering”: “A mediocre model with a great harness beats a great model with a bad harness.” Observation that the same Opus scores higher inside a custom harness than inside Claude Code.
[5] Krakovna et al., Google DeepMind, “Specification gaming: the flip side of AI ingenuity”; case collection: V. Krakovna, “Specification gaming examples in AI” (CoastRunners score loop, Tetris infinite pause, etc.). “Behavior that satisfies the literal specification of an objective without achieving the intended outcome.”
[6] Marilyn Strathern (1997), “‘Improving ratings’: audit in the British University system,” European Review 5(3):305–321. Source for “when a measure becomes a target, it ceases to be a good measure” (Strathern’s restatement of Goodhart’s 1975 proposition via Hoskin).