Your CLAUDE.md Is Probably Too Long

Most CLAUDE.md files are bloated with instructions the model already knows, documentation meant for humans, and duplicate rules that compete for limited attention. Here's how to fix yours.

🤖 Co-authored with AI

There is a number that most Claude Code users don’t know about: 150 to 200. That’s roughly how many instructions a frontier LLM can follow with reasonable consistency. Claude Code’s own system prompt already uses about 50 of those slots. Your CLAUDE.md, your rules files, your skills metadata, your project-level instructions: they all compete for what’s left.

If your CLAUDE.md is 300 lines long and Claude keeps ignoring half your rules, this is why. The model isn’t broken. Your instruction budget is overdrawn.

The instruction budget

Think of it like a context tax. Every line in your CLAUDE.md is loaded into every session, every time, whether or not it’s relevant. A 50-line file about Google Workspace conventions is there when you’re writing a Python script. Your Jira query templates are loaded when you’re debugging CSS. Your six-paragraph explanation of the DRY principle sits in context while Claude does something it already knows how to do.

The cost isn’t just tokens. It’s attention. When important rules (“never commit without asking”) compete with obvious ones (“use meaningful variable names”), the important ones sometimes lose. Not because the model can’t follow them, but because they’re buried in noise.

This is the same phenomenon that killed long system prompts in the early GPT-4 days. More instructions didn’t mean better behavior. It meant less reliable behavior. The difference with CLAUDE.md is that the file grows organically. A lesson learned here, a gotcha there, and nobody prunes it.

What typically goes wrong

After auditing dozens of CLAUDE.md files (including my own), the same patterns keep showing up.

The lecture

Blocks of text explaining principles Claude already knows. DRY, SOLID, fail-fast, separate concerns in PRs, write meaningful commit messages. These aren’t instructions that change behavior. They’re knowledge the model was trained on. Every line spent restating what the model already knows is a line stolen from something it doesn’t know, like your project’s specific build commands or your team’s deployment conventions.

The test is simple: would removing this line cause Claude to make a mistake? If you delete “always follow the DRY principle” and Claude still avoids duplication (it will), the line was wasting your budget.

The personal wiki

tmux shortcuts. IDE keybindings. Homebrew installation notes. Architecture decision records written for human teammates. These are documentation for you, not instructions for Claude. The model doesn’t control your terminal multiplexer. It doesn’t need to know your keyboard shortcuts. Every line of human-facing documentation in CLAUDE.md is a line that costs context but produces nothing.

The kitchen sink

Everything in one file. Build commands next to security policies next to Jira queries next to learned gotchas. Some of these need to be top-of-mind every session (security policies). Others are reference material that’s only relevant occasionally (Jira queries). Treating them the same means either the important stuff gets lost in noise, or you keep the file short and lose the reference material.

The duplicate web

The same instruction appears in ~/.claude/CLAUDE.md, in the project’s CLAUDE.md, and maybe in a rules file too. This is worse than wasting context. It can create conflicts. If two files give slightly different guidance for the same behavior, Claude picks one arbitrarily. You might not even notice which one it chose.
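Exact duplicates, at least, are easy to surface mechanically: concatenate the files and look for repeated lines. A minimal sketch (the file contents below are invented stand-ins for your global and project files; paraphrased near-duplicates still need a human read):

```shell
# Demo: find instruction lines that appear in more than one file.
# These temp files stand in for ~/.claude/CLAUDE.md and a project CLAUDE.md.
dir=$(mktemp -d)
printf '%s\n' "Never commit without asking." "Use 2-space indentation." > "$dir/global.md"
printf '%s\n' "Never commit without asking." "Run tests before pushing." > "$dir/project.md"

# Any line printed here exists in more than one file.
sort "$dir/global.md" "$dir/project.md" | uniq -d
# → Never commit without asking.
rm -rf "$dir"
```

When a line shows up twice, keep it in exactly one place, ideally the most global file where it applies, and delete the rest.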

The pruning test

One heuristic keeps coming up, in Anthropic’s own docs and in every discussion I’ve found on the topic:

For every line in your CLAUDE.md, ask: “Would removing this cause Claude to make mistakes?” If not, delete it.

That’s it. Not “is this good advice?” Not “would a junior developer benefit from reading this?” The question is whether Claude specifically needs this instruction to behave correctly. If Claude already does the right thing without the instruction, the instruction is wasting your budget.

Applied ruthlessly, this test typically eliminates 40-60% of a CLAUDE.md file.

The layered architecture

The fix isn’t to throw everything away. It’s to put each instruction where it belongs. Claude Code already supports a layered system that most people underuse:

CLAUDE.md is for hard behavioral constraints. Things where a single miss causes real damage. “Never commit without asking.” “Never share documents publicly.” “Always run tests before claiming work is done.” These are your non-negotiable policies. Target: under 100 lines.
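A sketch of what a file pruned to this standard might look like (the project commands and paths here are invented, not a template to copy verbatim):

```markdown
# CLAUDE.md

## Commands
- Build: `npm run build`
- Test: `npm test` (always run before claiming work is done)

## Policies
- Never commit without asking.
- Never share documents publicly.

## Environment
- Deploy config lives in `deploy/`, not the repo root.
```

Everything in it either changes Claude’s behavior or tells it something it can’t discover on its own.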

Rules files (~/.claude/rules/ or .claude/rules/) hold reference knowledge that Claude needs when relevant but doesn’t need in working memory during every task. Tool selection heuristics, Jira query templates, accumulated gotchas. These are always loaded but organizationally separate, making them easier to maintain and prune.

Rules files can also be scoped to specific file types using glob patterns:

    ---
    globs: ["netlify/functions/**/*.mjs"]
    ---
    Use ES module syntax. Always return { statusCode, body } objects.

This rule only loads when Claude touches your Netlify functions. Zero cost to every other session.

Skills are deep workflows loaded on demand. A skill for creating presentations. A skill for debugging. A skill for optimizing CLAUDE.md files. These aren’t loaded until triggered, so they have no ongoing context cost. If you have a 200-line workflow guide sitting in your CLAUDE.md, it should almost certainly be a skill instead.
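In Claude Code, a skill is a folder containing a SKILL.md file whose frontmatter tells the model when to load it; only the name and description sit in context until the skill triggers. A minimal sketch (the skill name and steps are invented for illustration):

```markdown
---
name: optimize-claude-md
description: Audit a CLAUDE.md file for bloat. Use when asked to review, prune, or restructure instruction files.
---

# Optimizing CLAUDE.md

1. Apply the pruning test to every line.
2. Move reference material to rules files and workflows to skills.
3. Report the before/after line counts.
```

The 200-line body costs nothing until a session actually asks for it.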

Hooks are deterministic guardrails. Unlike CLAUDE.md instructions (which are advisory), hooks run scripts at specific points in Claude’s workflow and guarantee the action happens. “Never commit without asking” is a good CLAUDE.md rule. But if it absolutely must happen every time with zero exceptions, a hook that intercepts the commit is more reliable than a line in a markdown file.
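For example, a script wired to Claude Code’s PreToolUse event can veto a tool call before it runs: the pending tool input arrives as JSON on stdin, and an exit code of 2 blocks the action and feeds stderr back to the model. A minimal sketch (the substring match is deliberately crude; a real guard would parse the JSON):

```shell
#!/bin/sh
# Hypothetical guard registered for the PreToolUse event on the Bash tool
# (via the "hooks" section of settings.json). Blocks commits so Claude
# has to ask first.
input=$(cat)                  # the pending tool call, as JSON, on stdin
case "$input" in
  *"git commit"*)
    echo "Blocked: ask before committing." >&2
    exit 2                    # exit code 2 blocks the tool call
    ;;
esac
exit 0                        # anything else proceeds normally
```

Unlike the markdown rule, this fires on every matching call, whether or not the instruction survived the attention budget.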

The mental model:

| Layer | Purpose | When loaded | Reliability |
| --- | --- | --- | --- |
| CLAUDE.md | Policy | Every session | Advisory |
| Rules | Reference | Every session (or by glob) | Advisory |
| Skills | Workflows | On demand | Advisory |
| Hooks | Enforcement | On trigger | Deterministic |

What belongs in CLAUDE.md

After the pruning test and the placement optimization, what’s left is surprisingly short:

  • Build, test, and lint commands Claude can’t infer from reading code
  • Code style rules that deviate from language defaults
  • Behavioral constraints that prevent damage (commit policies, sharing restrictions)
  • Environment context Claude can’t discover on its own (project paths, email addresses, tool aliases)
  • Skill triggers that override default behavior (“always invoke X skill before coding”)

Everything else is either reference material (move to rules), a workflow (move to a skill), a deterministic requirement (move to a hook), or something Claude already knows (delete).

A concrete example

A typical before/after:

Before: 265 lines, single file

  • 58 lines of coding principles Claude already knows
  • 46 lines of tmux documentation for the human
  • 16 lines of detailed CLI reference material
  • 5 lines of duplicate entries
  • 5 lines of common system knowledge

After: 58 lines in CLAUDE.md + 39 lines across 4 rules files = 97 total lines

That’s a 63% reduction. The important rules, the ones that actually change Claude’s behavior, are more likely to be followed because they’re not buried under a lecture about the DRY principle.

The maintenance habit

CLAUDE.md isn’t a write-once file. It’s more like code. It needs review, it accumulates debt, and it benefits from regular pruning.

Run the pruning test monthly. Reread every line. Ask the question. Delete what doesn’t pass.

Watch for ignored instructions. If Claude keeps doing something despite a rule against it, the file might be too long. The rule isn’t failing because it’s badly written. It’s failing because it’s competing with too many other rules for attention.

Treat additions critically. When you learn a gotcha and want to save it, ask: does this belong in CLAUDE.md (behavioral), in a rules file (reference), or is it something Claude will learn from the codebase anyway? The default should be to not add it. Let the file stay short.

Use /init as a starting point, not the final product. The /init command generates a reasonable CLAUDE.md by analyzing your codebase. But it’s a starting point. It doesn’t know your behavioral preferences, your deployment conventions, or which rules actually need to be top-of-mind.

Beyond one file

“Context engineering,” a term Martin Fowler’s site, among others, has helped spread, describes the discipline of crafting the right context for AI tools. CLAUDE.md is one surface of that, maybe the most important one, because it’s persistent, loaded every session, and shapes every interaction.

But the lesson is broader. Every AI coding tool has some equivalent: Cursor has .cursorrules, Windsurf has rules, GitHub Copilot has instruction files. The same principles apply. Keep it short. Keep it behavioral. Put reference material somewhere it doesn’t compete with policy. Prune relentlessly.

The models will get better at following more instructions. The context windows will get longer. But the principle won’t change: a focused set of instructions that Claude actually follows is worth more than an exhaustive manual it mostly ignores.