Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
275 changes: 100 additions & 175 deletions AGENTS.md
Original file line number Diff line number Diff line change
@@ -1,206 +1,131 @@
# AGENTS.md

Drop-in operating instructions for coding agents. Read this file before every task.
Operating instructions for coding agents working on `kardscm`.

**Working code only. Finish the job. Plausibility is not correctness.**
## Non-Negotiables

This file follows the [AGENTS.md](https://agents.md) open standard (Linux Foundation / Agentic AI Foundation). Claude Code, Codex, Cursor, Windsurf, Copilot, Aider, Devin, Amp read it natively. For tools that look elsewhere, symlink:
- Start by reading this file, `README.md`, and `CONTRIBUTING.md`.
- Do not fabricate paths, commands, API names, versions, test results, or commit
hashes. Read the file or run the command.
- Touch only files required by the task.
- Do not revert user changes. If the working tree is dirty, identify the dirty
files and work around unrelated changes.
- Keep all project files in English. Chat with the user may be Russian.
- Do not publish to PyPI or add PyPI release instructions.
- Do not add generated-agent attribution such as `Co-Authored-By: Claude`.
- Do not commit local data, databases, sync reports, Playwright logs, caches, or
personal fixtures.

```bash
ln -s AGENTS.md CLAUDE.md
ln -s AGENTS.md GEMINI.md
```

---

## 0. Non-negotiables

These rules override everything else in this file when in conflict:

1. **No flattery, no filler.** Skip openers like "Great question", "You're absolutely right", "Excellent idea", "I'd be happy to". Start with the answer or the action.
2. **Disagree when you disagree.** If the user's premise is wrong, say so before doing the work. Agreeing with false premises to be polite is the single worst failure mode in coding agents.
3. **Never fabricate.** Not file paths, not commit hashes, not API names, not test results, not library functions. If you don't know, read the file, run the command, or say "I don't know, let me check."
4. **Stop when confused.** If the task has two plausible interpretations, ask. Do not pick silently and proceed.
5. **Touch only what you must.** Every changed line must trace directly to the user's request. No drive-by refactors, reformatting, or "while I was in there" cleanups.

---

## 1. Before writing code

**Goal: understand the problem and the codebase before producing a diff.**

- State your plan in one or two sentences before editing. For anything non-trivial, produce a numbered list of steps with a verification check for each.
- Read the files you will touch. Read the files that call the files you will touch. Claude Code: use subagents for exploration so the main context stays clean.
- Match existing patterns in the codebase. If the project uses pattern X, use pattern X, even if you'd do it differently in a greenfield repo.
- Surface assumptions out loud: "I'm assuming you want X, Y, Z. If that's wrong, say so." Do not bury assumptions inside the implementation.
- If two approaches exist, present both with tradeoffs. Do not pick one silently. Exception: trivial tasks (typo, rename, log line) where the diff fits in one sentence.

---

## 2. Writing code: simplicity first

**Goal: the minimum code that solves the stated problem. Nothing speculative.**

- No features beyond what was asked.
- No abstractions for single-use code. No configurability, flexibility, or hooks that were not requested.
- No error handling for impossible scenarios. Handle the failures that can actually happen.
- If the solution runs 200 lines and could be 50, rewrite it before showing it.
- If you find yourself adding "for future extensibility", stop. Future extensibility is a future decision.
- Bias toward deleting code over adding code. Shipping less is almost always better.

The test: would a senior engineer reading the diff call this overcomplicated? If yes, simplify.

---

## 3. Surgical changes

**Goal: clean, reviewable diffs. Change only what the request requires.**

- Do not "improve" adjacent code, comments, formatting, or imports that are not part of the task.
- Do not refactor code that works just because you are in the file.
- Do not delete pre-existing dead code unless asked. If you notice it, mention it in the summary.
- Do clean up orphans created by your own changes (unused imports, variables, functions your edit made obsolete).
- Match the project's existing style exactly: indentation, quotes, naming, file layout.

The test: every changed line traces directly to the user's request. If a line fails that test, revert it.

---

## 4. Goal-driven execution
## Project Identity

**Goal: define success as something you can verify, then loop until verified.**
`kardscm` is a local-first KARDS collection and deck manager.

Rewrite vague asks into verifiable goals before starting:
The product is not a generic scraper and not an LLM assistant. Scraping,
baseline drift checks, and manually curated extra abilities exist to support
the local collection/deck workflow.

- "Add validation" becomes "Write tests for invalid inputs (empty, malformed, oversized), then make them pass."
- "Fix the bug" becomes "Write a failing test that reproduces the reported symptom, then make it pass."
- "Refactor X" becomes "Ensure the existing test suite passes before and after, and no public API changes."
- "Make it faster" becomes "Benchmark the current hot path, identify the bottleneck with profiling, change it, show the benchmark is faster."
## Stack

For every task:
- Python 3.12
- Package manager: `uv`
- CLI: Typer
- Web UI: FastAPI, Jinja2, HTMX
- Storage: SQLite
- API access: httpx + Playwright-backed probe/static GraphQL shape
- Export: openpyxl, CSV, JSON

1. State the success criteria before writing code.
2. Write the verification (test, script, benchmark, screenshot diff) where practical.
3. Run the verification. Read the output. Do not claim success without checking.
4. If the verification fails, fix the cause, not the test.
## Main Commands

---

## 5. Tool use and verification

- Prefer running the code to guessing about the code. If a test suite exists, run it. If a linter exists, run it. If a type checker exists, run it.
- Never report "done" based on a plausible-looking diff alone. Plausibility is not correctness.
- When debugging, address root causes, not symptoms. Suppressing the error is not fixing the error.
- For UI changes, verify visually: screenshot before, screenshot after, describe the diff.
- Use CLI tools (gh, aws, gcloud, kubectl) when they exist. They are more context-efficient than reading docs or hitting APIs unauthenticated.
- When reading logs, errors, or stack traces, read the whole thing. Half-read traces produce wrong fixes.

---

## 6. Session hygiene

- Context is the constraint. Long sessions with accumulated failed attempts perform worse than fresh sessions with a better prompt.
- After two failed corrections on the same issue, stop. Summarize what you learned and ask the user to reset the session with a sharper prompt.
- Use subagents (Claude Code: "use subagents to investigate X") for exploration tasks that would otherwise pollute the main context with dozens of file reads.
- When committing, write descriptive commit messages (subject under 72 chars, body explains the why). No "update file" or "fix bug" commits. No "Co-Authored-By: Claude" attribution unless the project explicitly wants it.

---

## 7. Communication style

- Direct, not diplomatic. "This won't scale because X" beats "That's an interesting approach, but have you considered...".
- Concise by default. Two or three short paragraphs unless the user asks for depth. No padding, no restating the question, no ceremonial closings.
- When a question has a clear answer, give it. When it does not, say so and give your best read on the tradeoffs.
- Celebrate only what matters: shipping, solving genuinely hard problems, metrics that moved. Not feature ideas, not scope creep, not "wouldn't it be cool if".
- No excessive bullet points, no unprompted headers, no emoji. Prose is usually clearer than structure for short answers.

---

## 8. When to ask, when to proceed

**Ask before proceeding when:**
- The request has two plausible interpretations and the choice materially affects the output.
- The change touches something you've been told is load-bearing, versioned, or has a migration path.
- You need a credential, a secret, or a production resource you don't have access to.
- The user's stated goal and the literal request appear to conflict.

**Proceed without asking when:**
- The task is trivial and reversible (typo, rename a local variable, add a log line).
- The ambiguity can be resolved by reading the code or running a command.
- The user has already answered the question once in this session.

---

## 9. Self-improvement loop

**This file is living. Keep it short by keeping it honest.**

After every session where the agent did something wrong:
```bash
make sync-dev
make test
make lint
make typecheck
make check
uv run kardscm --help
uv run kardscm web --no-browser
```

1. Ask: was the mistake because this file lacks a rule, or because the agent ignored a rule?
2. If lacking: add the rule under "Project Learnings" below, written as concretely as possible ("Always use X for Y" not "be careful with Y").
3. If ignored: the rule may be too long, too vague, or buried. Tighten it or move it up.
4. Every few weeks, prune. For each line, ask: "Would removing this cause the agent to make a mistake?" If no, delete. Bloated AGENTS.md files get ignored wholesale.
Prefer narrow tests during iteration. Run the relevant final verification before
claiming completion.

## Source Map

```text
kardscm/cli.py Typer declarations only
kardscm/commands.py user workflow orchestration
kardscm/scraping/ GraphQL fetch, normalize, API baseline drift
kardscm/storage/ SQLite schema, migrations, persistence, backups
kardscm/export/ XLSX/CSV/JSON collection and deck export
kardscm/importing/ KARDS TXT deck parser
kardscm/locales/ TOML locale loader and locale data
kardscm/web/ local FastAPI/Jinja/HTMX web UI
kardscm/data/ committed API baseline and extra-ability seed
scripts/ maintainer helpers
tests/ pytest suite
```

Boris Cherny (creator of Claude Code) keeps his team's file around 100 lines. Under 300 is a good ceiling. Over 500 and you are fighting your own config.
## Documentation Rules

---
Documentation must stay current with behavior.

## 10. Project context
When a change affects user-visible behavior, update `README.md` in the same
branch. Examples: commands, flags, install/run flow, output files, sync behavior,
web UI behavior, deck workflows, language behavior.

**Fill this in per project. Keep it specific. Delete sections that don't apply.**
When a change affects development or maintenance, update `CONTRIBUTING.md` in
the same branch. Examples: architecture, module ownership, release process, API
baseline workflow, extra-ability workflow, locale workflow, Make targets.

### Stack
- Language and version: Python 3.12 (`pyproject.toml` requires `>=3.12`; `.python-version` pins `3.12`)
- Framework(s): Typer (CLI), Playwright (scraping), httpx (HTTP/GraphQL), openpyxl (XLSX export)
- Package manager: uv
- Runtime / deployment target: local CLI tool — console script `kardscm` declared in `pyproject.toml [project.scripts]`
When a change affects agent behavior, update `AGENTS.md` or `CLAUDE.md`.

### Commands
- Install: `make sync-dev` (or `uv sync --all-extras && uv run python -m playwright install chromium`)
- Build: `TODO` (no build target — pure-Python package, no compile step)
- Test (all): `make test` (or `uv run pytest`)
- Test (single file): `uv run pytest tests/test_<name>.py`
- Lint: `make lint` (or `uv run ruff check .`)
- Typecheck: `make typecheck` (or `uv run mypy kardscm/`)
- Run locally: `uv run kardscm` (or `uv run python -m kardscm`)
When a change is released or release-worthy, update `CHANGELOG.md`.

Prefer single-file or single-test runs during iteration. Full suites are for the final verification pass.
Do not create a new `docs/` tree for ordinary project documentation. The active
project docs are `README.md` and `CONTRIBUTING.md`; agent-specific docs are
`AGENTS.md` and `CLAUDE.md`.

### Layout
- Source lives in: `kardscm/`
- Tests live in: `tests/`
- Do not modify: `TODO` (generated code, vendored deps, legacy areas)
## Coding Rules

### Conventions specific to this repo
- Naming: `TODO`
- Import style: `TODO`
- Error handling pattern: `TODO`
- Testing pattern and framework: `TODO`
- Match existing style and module boundaries.
- Keep `cli.py` thin; put workflow logic in `commands.py`.
- Keep DB behavior in `storage/`.
- Keep web query/filter logic in `web/queries.py`; view conversion in
`web/translate.py`.
- Keep locale strings in `kardscm/locales/*.toml`, not in Python code, unless
the string is not user-facing.
- Keep ability key changes synchronized across constants, locale files, seed
data, web filters, storage tests, and docs.
- Preserve user-managed `quantity` across syncs.
- Admin mode must stay localhost-only and must keep the pre-start DB backup.

### Forbidden
- `TODO`: things that look reasonable but will break this project.
## Verification

---
Use the smallest meaningful verification first, then broader checks when the
change is ready.

## 11. Project Learnings
Typical commands:

**Accumulated corrections. This section is for the agent to maintain, not just the human.**
```bash
uv run pytest tests/test_<area>.py -v
uv run ruff check .
uv run mypy kardscm/
make check
```

When the user corrects your approach, append a one-line rule here before ending the session. Write it concretely ("Always use X for Y"), never abstractly ("be careful with Y"). If an existing line already covers the correction, tighten it instead of adding a new one. Remove lines when the underlying issue goes away (model upgrades, refactors, process changes).
For documentation-only changes, at minimum inspect the changed Markdown for
broken project facts and run a status/diff review. Full tests are optional
unless code, commands, or config changed.

- Before tagging a release: bump `version` in both `pyproject.toml` and `kardscm/__init__.py` (they must match the tag), add a dated CHANGELOG entry, run `make check`. Missing either step leaves the version stale — this happened through v0.5.0→v0.8.0.
For UI changes, verify the rendered page in a browser.

---
## Release Rules

## 12. How this file was built
Before tagging a release:

This boilerplate synthesizes:
- Sean Donahoe's IJFW ("It Just F\*cking Works") principles: one install, working code, no ceremony.
- Andrej Karpathy's observations on LLM coding pitfalls (the four principles: think-first, simplicity, surgical changes, goal-driven execution).
- Boris Cherny's public Claude Code workflow (reactive pruning, keep it ~100 lines, only rules that fix real mistakes).
- Anthropic's official Claude Code best practices (explore-plan-code-commit, verification loops, context as the scarce resource).
- Community anti-sycophancy patterns (explicit banned phrases, direct-not-diplomatic).
- The AGENTS.md open standard (cross-tool portability via symlinks).
- bump `version` in `pyproject.toml`
- bump `__version__` in `kardscm/__init__.py`
- add a dated `CHANGELOG.md` entry
- run `make check`

Read once. Edit sections 10 and 11 for your project. Prune the rest over time. This file gets better the more you use it.
The two version files must match the tag.
Loading
Loading