agents-md
Curated AGENTS.md preset that kills sycophancy, blocks drive-by refactors, and forces verification loops. Synthesizes Karpathy's principles with Cherny's Claude Code workflow.
Most AGENTS.md files in the wild are scratch pads - a list of build commands, a couple of “don’t break this” notes, and whatever the human thought was important last Tuesday. This one is a deliberate template, kept to roughly 200 lines on purpose so the rules actually stay resident in the agent’s context window across a long session.
The author cites two influences explicitly: Andrej Karpathy’s four principles on LLM coding failure modes, and Boris Cherny’s public Claude Code workflow that emphasises reactive pruning and tight files. Anthropic’s official Claude Code best practices and the author’s own “IJFW” principle (“it just f*cking works”) round out the credits. None of that matters if the rules don’t change agent behaviour, but they do.
What it actually changes
Nine behavioural shifts in total; these are the five you’ll feel most often, in order:
- Pushback over compliance. Replaces the reflexive “You’re absolutely right!” pattern with the agent flagging when the request itself is wrong.
- Minimal diffs. Every changed line traces back to the request. No drive-by reformatting, no “while I was here” cleanups.
- Verification first. The agent writes the test, runs it, and reports the actual result - not “this should work.”
- Surface ambiguity. Stop the work and ask, instead of silently guessing the spec.
- Tight file size. The template is intentionally short so the rules don’t get evicted as the conversation grows.
The drive-by-refactor rule is the one most teams fight over and the one most worth keeping. It’s why agents that follow this template produce reviewable PRs instead of 4,000-line “improvements.”
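The tight-file-size rule from the list above is easy to check mechanically. A minimal sketch, assuming a 200-line budget taken from the template's stated target; the helper name check_agents_size is hypothetical and not part of the template:

```shell
# Hypothetical helper: succeed only if the file is at or under a line budget.
# The 200-line default mirrors the template's stated target; adjust to taste.
check_agents_size() {
  file=${1:-AGENTS.md}
  limit=${2:-200}
  if [ ! -f "$file" ]; then
    echo "$file not found" >&2
    return 2
  fi
  # wc pads its output on some platforms, so strip spaces before comparing.
  lines=$(wc -l < "$file" | tr -d ' ')
  if [ "$lines" -gt "$limit" ]; then
    echo "$file has $lines lines (budget: $limit)" >&2
    return 1
  fi
  echo "$file: $lines/$limit lines"
}
```

Called as check_agents_size AGENTS.md in a pre-commit hook or CI step, it would flag the file once it starts drifting back towards scratch-pad size.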
Quick start
curl -o AGENTS.md https://raw.githubusercontent.com/TheRealSeanDonahoe/agents-md/main/AGENTS.md
Drop it at the project root. If you also use Claude Code or Gemini CLI, symlink so every agent reads the same source of truth:
ln -s AGENTS.md CLAUDE.md
ln -s AGENTS.md GEMINI.md
The one bit of manual work: Section 10 (Project Context) wants your stack, build commands, and directory layout filled in. Budget five minutes. Skipping this section is the most common reason the template “doesn’t work” - without project context the rules are advisory at best.
Which agents read it natively
- Native AGENTS.md readers - Codex, Cursor, Windsurf, Copilot, Aider, Devin, Amp, opencode, RooCode
- Reads via symlink - Claude Code (via CLAUDE.md), Gemini CLI (via GEMINI.md)
For Claude Code specifically, the template also expects topic-scoped rules in .claude/rules/*.md with path frontmatter. Cursor’s equivalent is .cursor/rules/*.mdc. Both are optional but recommended for repo-specific overrides that shouldn’t bloat the root file.
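A topic-scoped rule file might look like the sketch below. The frontmatter key (paths), the glob, the file name api.md, and the rule text are all assumptions for illustration; check the current Claude Code documentation for the exact frontmatter format before relying on it.

```shell
# Hypothetical example: write a rule that should apply to API source files only.
# The `paths` key, the glob, and the rule text are placeholders, not taken
# from the template.
write_scoped_rule() {
  rules_dir=${1:-.claude/rules}
  mkdir -p "$rules_dir"
  cat > "$rules_dir/api.md" <<'EOF'
---
paths:
  - "src/api/**/*.ts"
---
# API rules
- Validate request bodies at the handler boundary.
EOF
}
```

Keeping overrides like this in their own files is what lets the root AGENTS.md stay short.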
The Section 11 trick
Section 11 (“Project Learnings”) starts empty. The agent is instructed to append to it whenever the human corrects something - a one-line entry per correction. Over a few weeks this becomes the highest-signal part of the file: it’s the actual list of things this agent gets wrong on this codebase.
Most teams that try this for the first time are surprised by the same thing: the corrections cluster. You learn that the model keeps making the same three mistakes, which is a cue to either generalise the rule or accept the limitation.
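The append-and-review loop can be sketched in a few lines of shell. This assumes Section 11 is the last section of the file, so appending to the end of the file lands inside it; the helper names and the dated entry format are hypothetical, not something the template prescribes.

```shell
# Hypothetical helper: append one dated, one-line correction to the file.
# Assumes the learnings section is the final section of the file.
log_learning() {
  file=$1
  note=$2
  printf '%s\n' "- $(date +%Y-%m-%d): $note" >> "$file"
}

# Rough clustering check: count dated entries that mention a keyword
# (case-insensitive, fixed-string match).
count_learnings() {
  file=$1
  keyword=$2
  grep -E '^- [0-9]{4}-[0-9]{2}-[0-9]{2}: ' "$file" | grep -c -i -F -- "$keyword"
}
```

For example, count_learnings AGENTS.md mock would show how many logged corrections involve mocking, a cheap way to spot a cluster that deserves a general rule.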
When to use it
- New project, no existing AGENTS.md or CLAUDE.md. This is the highest-value moment to install it.
- Existing project with a CLAUDE.md that’s grown to 800 lines and stopped working. Replace it.
- Multi-agent team (Codex + Claude + Gemini) that needs one source of truth.
When not to
- Heavily regulated codebases where every rule needs to be project-specific. The template is a starting point - you’ll spend more time stripping out generic rules than adding yours.
- Throwaway prototypes. The setup cost isn’t worth it if the project lives a week.
The real test
Run the same task on the same codebase with and without the template. The win shows up in three places: PR diff size goes down, tests-before-claim becomes the default, and the agent stops agreeing with you when you’re wrong. That last one is the hardest to install with anything other than explicit rules - which is the whole reason this file exists.
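The diff-size half of that comparison can be measured rather than eyeballed. A minimal sketch, assuming each run was committed to its own branch off main; the branch names and the changed_lines helper are hypothetical:

```shell
# Sum added and deleted lines from `git diff --numstat` output on stdin.
# Binary files report "-" in both columns; awk treats those as 0.
changed_lines() {
  awk '{ total += $1 + $2 } END { print total + 0 }'
}

# Usage (branch names are hypothetical):
#   git diff --numstat main...task-with-template    | changed_lines
#   git diff --numstat main...task-without-template | changed_lines
```

A single number per run won't capture whether the agent pushed back or verified its work, so treat it as one signal alongside reading the transcripts.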
Similar tools
- Garden Skills
Three carefully-scoped skills: web-design-engineer (with an anti-cliche blocklist that breaks the generic-AI-landing-page loop), gpt-image-2 (80+ templates, three runtime modes including advisor-only fallback), and kb-retriever (layered data_structure.md navigation for bounded local-KB retrieval). Tested across Claude Code, Claude.ai, Cursor, Codex, Gemini, OpenCode.
- PostTrainBench
Benchmark measuring whether Claude Code, Codex CLI, Gemini CLI, and OpenCode can autonomously improve 4 small base models (Qwen3-1.7B/4B, SmolLM3-3B, Gemma-3-4B) on 7 evals (AIME, BFCL, GPQA, GSM8K, HealthBench, HumanEval, Arena Hard) within a single H100 GPU and 10 hours. Includes agent-as-judge anti-reward-hacking and baseline-replacement penalties for tampering.
- Claude Code Analysis
82 docs and 15 diagrams mapping every major subsystem of Claude Code's accidentally exposed 512K-line TypeScript source - YOLO classifier, 93% context compaction, prompt-cache layout, 88+ feature flags, the custom React-Fiber terminal renderer.
- wanman
Multi-agent runtime that spawns each Claude Code or Codex agent in its own git worktree and home directory. JSON-RPC subprocess control, task pooling, artifact storage. Solves the share-a-directory failure mode that breaks most multi-agent harnesses.