CLAUDE.md Is Not the Problem

A recent paper from ETH Zürich found that AGENTS.md and CLAUDE.md files tend to reduce task success rates while increasing inference costs by over 20%. Theo made a video about it, arguing you should probably delete yours. The Hacker News thread predictably erupted.

The question is not whether to use CLAUDE.md. It is how you use it.

This is the exact same problem code comments have. Every codebase has two kinds of comments: ones that explain why something is the way it is, and ones that restate what the code already says. The first kind is invaluable. The second kind is noise that rots over time and actively misleads.
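
Here is a minimal sketch of that split in Python. The function, the 50% cap, and the finance rule behind it are all hypothetical, invented only to show the two kinds of comments side by side.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, capped at half the price."""
    # Noise: restates the code. "Multiply the price by the percent and divide by 100."
    discount = price_cents * percent // 100

    # Signal: explains why. Hypothetical rule: finance requires manual approval
    # for anything deeper than 50% off, so the automated path caps the discount.
    capped = min(discount, price_cents // 2)

    return price_cents - capped
```

Delete the first comment and nothing is lost. Delete the second and the cap looks like an arbitrary choice that someone will eventually "fix".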

CLAUDE.md has the same split.

Most context files I see are packed with information the agent can figure out on its own. Directory structures. Tech stacks. Framework versions. Build commands that are already in package.json. File naming conventions that are obvious from looking at the files. All redundant. All noise.

The study confirms this. When agents received context files, they faithfully followed every instruction, including unnecessary ones. Style guides, test patterns, and boilerplate conventions made tasks more complex, not less. The agents did more work, explored more files, and spent more tokens, all to follow rules that did not help them solve the actual problem.

Meanwhile, the same study found that developer-written context files outperformed LLM-generated ones for every agent tested. This should not be surprising. Asking an LLM to write your CLAUDE.md is asking it to introspect on what it needs, and LLMs cannot reliably introspect. The model will generate something that looks thorough: a nice directory tree, a style guide, a list of conventions. All noise. It can only draw on its training data, not on any self-awareness of its own failure modes.

And when a repo had no other documentation at all, even basic context files improved performance by ~3%. The signal was there. It was just buried under noise.

Another study on agent “Skills” found the same pattern from a different angle: focused packages with 2-3 modules outperformed comprehensive documentation. Smaller models with the right focused context matched larger models without it. More is not better. Relevant is better.

So what belongs in a context file?

Two things:

1. Things the agent cannot understand from the code alone.

The scale of the app. The business context. Which parts of the codebase are legacy and should not be touched. Why a weird architectural decision was made. Who the users are. What the deployment constraints look like. External dependencies and their quirks. The political reality of why something is the way it is.

These are things no amount of code reading will reveal because they live outside the code.

2. Things that take too long to figure out by researching the code.

The non-obvious build step that requires a specific environment variable. The test that always fails locally but passes in CI. The module that looks unused but is loaded dynamically. The functions that must never be called in a certain order because of a race condition nobody bothered to fix.

These are things an agent could eventually discover, but only after burning through tokens and time, and with a real risk of reaching the wrong conclusion.

Everything else should stay out.

If the information is in your code, let the agent read the code. If the information is in your package.json, let the agent read package.json. Duplicating it into a context file does not help. It just creates another place where things can go out of sync.
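
One way to catch this kind of duplication is a small check that flags context-file lines repeating commands already defined in package.json. The sketch below is an illustration, not a standard tool: the function name `find_duplicated_commands` and the matching rule (a line contains either the raw script command or `npm run <name>`) are assumptions.

```python
import json


def find_duplicated_commands(context_text: str, package_json_text: str) -> list[tuple[int, str]]:
    """Return (line_number, script_name) pairs for context-file lines that
    repeat a command already defined under "scripts" in package.json."""
    scripts = json.loads(package_json_text).get("scripts", {})
    hits = []
    for line_number, line in enumerate(context_text.splitlines(), start=1):
        for name, command in scripts.items():
            # Assumed matching rule: the line spells out the raw command
            # or the usual `npm run <name>` invocation.
            if (command and command in line) or f"npm run {name}" in line:
                hits.append((line_number, name))
    return hits


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    context = (
        "Run tests with `npm run test`.\n"
        "The billing module is legacy; do not refactor it.\n"
    )
    package = '{"scripts": {"test": "vitest run"}}'
    for line_number, name in find_duplicated_commands(context, package):
        print(f"line {line_number} duplicates the '{name}' script")
```

Simple substring matching will miss paraphrased commands and can over-flag very short ones, so treat the output as a prompt for review rather than a verdict. The line about the legacy billing module is never flagged, which is the point: it is the kind of information package.json cannot hold.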

The people deleting their CLAUDE.md files are overcorrecting. The file is not the problem. The problem is treating it like a README for robots.

Think of it this way: a good CLAUDE.md is like a good onboarding conversation with a senior engineer. You would not sit someone down and read them the folder structure. You would tell them the things that coding expertise alone would never reveal.