You cloned the repo. You want to add an agent, write a skill, build a hook, or just understand how the pieces fit. This guide gets you there.
Everything flows through a four-layer dispatch:
User request
|
v
Router (/do) -- classifies intent, picks agent + skill
|
v
Agent (*.md) -- domain expert (Go, Python, K8s, review...)
|
v
Skill (SKILL.md) -- workflow methodology (TDD, debugging, PR workflow...)
|
v
Script (*.py) -- deterministic operations (no LLM judgment)
The router parses your request, matches triggers to find the right agent, pairs it with a skill if the task type calls for one, and the agent executes using that skill's methodology. Hooks fire at lifecycle events (session start, before/after tool use, compaction, stop) to inject context, learn from errors, and enforce quality gates. Scripts do the mechanical work -- index generation, voice validation, learning database queries -- where you want deterministic behavior, not LLM judgment calls.
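To make the trigger-matching step concrete, here's a minimal sketch. It is not the router's actual code: the `Agent` class, the agent names, the trigger lists, and the "most trigger hits wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    """Hypothetical stand-in for an agent entry in the routing index."""
    name: str
    triggers: list[str] = field(default_factory=list)


def pick_agent(request: str, agents: list[Agent]) -> Optional[Agent]:
    """Return the agent with the most trigger hits in the request, or None."""
    text = request.lower()
    best, best_hits = None, 0
    for agent in agents:
        # Count how many of this agent's triggers appear in the request text.
        hits = sum(1 for trigger in agent.triggers if trigger.lower() in text)
        if hits > best_hits:
            best, best_hits = agent, hits
    return best


# Made-up agents and triggers, not entries from agents/INDEX.json.
agents = [
    Agent("example-go-agent", ["golang", "goroutine"]),
    Agent("example-python-agent", ["python", "pytest"]),
]

chosen = pick_agent("fix the failing pytest run in my Python service", agents)
print(chosen.name if chosen else "no match")
```

The real router also pairs the agent with a skill. This sketch only covers the agent half.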
agents/          Domain experts. One .md per agent, optional references/ subdirectory
  INDEX.json     Generated routing index (don't hand-edit -- run the script)
skills/          Workflow methodologies. One directory per skill, each with SKILL.md
  INDEX.json     Generated skill index
hooks/           Event-driven Python scripts. Fire on lifecycle events
  lib/           Shared utilities (hook_utils.py, learning_db_v2.py, feedback_tracker.py)
  tests/         Hook-specific tests
scripts/         Deterministic CLI tools. Python scripts for mechanical operations
  tests/         Script-specific tests
commands/        Slash command definitions (markdown files that wire up user-facing commands)
templates/       Template directories for scaffolding (e.g., reddit data templates)
adr/             Architecture Decision Records. Numbered markdown files tracking why decisions were made
A few things that aren't obvious from the listing: agents/ and skills/ both have INDEX.json files that are generated by scripts (scripts/generate-agent-index.py and scripts/generate-skill-index.py). The hooks/lib/ directory is where shared code lives -- hooks import from there, not from each other. The eval system lives in skills/skill-eval/ (methodology) and scripts/skill_eval/ (runner scripts). And the services/ directory exists for optional service integrations.
The toolkit has specialized creator agents for each component type. Tell /do what you want, describe the domain and purpose in your prompt, and the creator agent handles structure, registration, and routing integration.
/do create an agent for [your domain]
Describe what the agent should specialize in, what triggers should route to it, and what patterns it should enforce. The agent creator handles the rest: file creation, frontmatter, operator context, anti-patterns, index registration, and routing table integration.
Example prompts:
/do create an agent for Terraform infrastructure management that knows HCL, state management, and module patterns
/do create an agent for Redis caching that specializes in data structures, eviction policies, and cluster management
The agent creator uses the AGENT_TEMPLATE_V2.md template and produces a complete agent with operator context, hardcoded/default/optional behaviors, anti-patterns, and reference files.
/do create a skill for [your workflow]
Describe the methodology, phases, and quality gates. The skill-creator builds the skill directory, SKILL.md with frontmatter, phase definitions, and updates the index.
Example prompts:
/do create a skill for database migration safety with pre-migration checks, rollback validation, and post-migration verification
/do create a skill for API contract testing that validates OpenAPI specs against implementation
/do create a hook for [your purpose]
Describe the event type, what it should detect or inject, and any performance constraints. The hook-development-engineer creates the hook with proper event handling, hooks/lib/ integration, and settings.json registration.
Example prompts:
/do create a hook that warns when test files are modified without updating test fixtures
/do create a PostToolUse hook that detects when a git merge conflict marker is written to a file
Hooks must meet a 50ms performance target since they fire on every tool call or prompt.
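For a feel of the "JSON in, JSON out" shape, here's a rough sketch of the merge-conflict example from the prompts above. The payload fields (`tool_input`, `content`) and the `warning` output key are assumptions for illustration. Check the hook input schema and hooks/lib/ helpers before relying on any of them.

```python
import json
import sys
from typing import Optional

# Opening and closing git conflict markers. A bare "=======" is skipped
# because it also shows up in ordinary text such as underlined headings.
MARKER_PREFIXES = ("<<<<<<< ", ">>>>>>> ")


def check_payload(payload: dict) -> Optional[dict]:
    """Return a warning dict if the written content has conflict markers."""
    # "tool_input" and "content" are assumed field names, not a confirmed schema.
    content = (payload.get("tool_input") or {}).get("content", "")
    for line in content.splitlines():
        if line.startswith(MARKER_PREFIXES):
            return {"warning": "merge conflict marker written to file"}
    return None


def run(stdin_text: str) -> str:
    """JSON text in, JSON text out. Empty string means the hook stays silent."""
    try:
        payload = json.loads(stdin_text)
    except json.JSONDecodeError:
        return ""  # malformed input: stay silent rather than break the tool call
    if not isinstance(payload, dict):
        return ""
    result = check_payload(payload)
    return json.dumps(result) if result else ""


def main() -> int:
    """Entry point a real hook script would call with sys.exit(main())."""
    output = run(sys.stdin.read())
    if output:
        print(output)
    return 0
```

Keeping the logic in `run()` and the I/O in `main()` means tests can feed strings straight in without spawning a process.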
- Agents (agents/*.md) know what to do -- domain expertise, patterns, anti-patterns
- Skills (skills/*/SKILL.md) know how to structure work -- phases, gates, methodology
- Hooks (hooks/*.py) fire on lifecycle events -- JSON in, JSON out, 50ms budget
- Scripts (scripts/*.py) do deterministic work -- linting, indexing, validation
Multi-phase workflows that were previously standalone pipelines (in pipelines/) are now reference files within the workflow skill (skills/workflow/references/). If you need a structured multi-phase workflow, add a reference to skills/workflow/references/ rather than creating a standalone pipeline.
The /do router connects everything. After creating any component, test it:
/do [request that should trigger your new component]
The routing banner shows which agent and skill were selected. If routing doesn't match, the creator agents handle trigger registration automatically.
This repo uses a structured workflow for changes. It's more than "branch, commit, push" -- there's a review loop built in.
1. Branch -- create a feature branch off main. Convention: feature/description, fix/description, refactor/description.
2. Implement -- make your changes. Agents, skills, hooks, scripts, whatever.
3. Wave review -- run /pr-review, which dispatches parallel reviewer agents against your changes. They check code quality, naming, security, dead code, error handling, and more.
4. Fix -- address reviewer findings. The PR workflow does up to 3 review-fix iterations automatically.
5. Retro -- after significant work, the system captures learnings into the learning database. These get injected into future sessions.
6. Graduate -- mature retro entries (high confidence, validated multiple times) get promoted into agent and skill files permanently.
7. Commit -- conventional commit format. No AI attribution lines. Focus on what and why.
8. Push -- push to remote with tracking.
9. PR -- create the pull request via gh pr create.
10. CI -- wait for CI checks to pass.
11. Merge -- after CI and any human review.
The pr-workflow skill automates steps 3-10. You can invoke it with /pr-workflow or let /do route to it when you say "create a PR" or "submit changes."
For repos under protected organizations (configured in scripts/classify-repo.py), every git action requires user confirmation. The workflow won't auto-commit, auto-push, or auto-create PRs for those repos.
Tests use pytest. Two main test directories:
hooks/tests/ Hook-specific tests
scripts/tests/ Script-specific tests
# All tests
pytest -v
# Hook tests only
pytest hooks/tests/ -v
# Script tests only
pytest scripts/tests/ -v
# Single test file
pytest hooks/tests/test_learning_system.py -v
# With coverage
pytest --cov=hooks --cov=scripts -v
- Hooks: Feed JSON input, assert JSON output. Test the happy path and the silent path (when the hook has nothing to say). Mock external dependencies like the learning database.
- Scripts: Test CLI interfaces. Scripts are deterministic -- given input X, they should always produce output Y. No LLM judgment to worry about.
- Agents/Skills: The skills/skill-eval/ skill and scripts/skill_eval/ runner provide an evaluation harness for testing agent and skill quality. This is more about quality assessment than unit testing.
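The happy-path and silent-path pattern for hooks might look something like this. The `error_hint_hook` function is a toy written for this example, not a hook from the repo, and the `lookup` callable stands in for the learning database so it can be mocked.

```python
import json
from unittest.mock import Mock


def error_hint_hook(raw_input: str, lookup) -> str:
    """Toy hook: return a JSON hint if lookup knows the error, else ''."""
    payload = json.loads(raw_input)
    hint = lookup(payload.get("error", ""))
    return json.dumps({"hint": hint}) if hint else ""


def test_happy_path():
    # The mocked lookup plays the part of a learning database with a match.
    lookup = Mock(return_value="run go mod tidy")
    out = error_hint_hook(json.dumps({"error": "missing go.sum entry"}), lookup)
    assert json.loads(out) == {"hint": "run go mod tidy"}
    lookup.assert_called_once_with("missing go.sum entry")


def test_silent_path():
    # Nothing known about this error, so the hook should emit nothing at all.
    lookup = Mock(return_value=None)
    assert error_hint_hook(json.dumps({"error": "unknown"}), lookup) == ""
```

pytest picks up both `test_` functions on its own. No database, no network, no LLM.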
scripts/tests/fixtures/ contains test data. hooks/tests/ has its own fixtures inlined in test files.
Conventional commits. Format: type(scope): description. Types: feat, fix, refactor, docs, test, chore. Scope is optional but helpful. Examples: feat(reddit): add ban subcommand, fix(hooks): handle missing session ID.
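If you want to sanity-check a subject line against that format, a small regex does it. This is a sketch built from the types listed above, not a validator that ships with the repo. `is_conventional` is a made-up name.

```python
import re

COMMIT_TYPES = ("feat", "fix", "refactor", "docs", "test", "chore")

# type, optional (scope), then ": " and a non-empty description.
COMMIT_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject matches type(scope): description."""
    return COMMIT_RE.match(subject) is not None


for subject in (
    "feat(reddit): add ban subcommand",
    "fix(hooks): handle missing session ID",
    "Update stuff",
):
    print(subject, "->", is_conventional(subject))
```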
No AI attribution. Don't add "Generated with Claude Code" or "Co-Authored-By: Claude" to commits. The CLAUDE.md is explicit about this.
Branch safety. Never commit directly to main. Always work on a feature branch. The hooks and skills enforce this -- pretool-branch-safety.py will block commits to protected branches.
Wabi-sabi in docs. Documentation should read like a human wrote it. Contractions are fine. Sentence fragments where they're clear. Varied sentence length -- short punchy ones mixed with longer explanatory ones. Never use "delve", "leverage", "comprehensive", "robust", "streamline", or "empower." The scripts/scan-ai-patterns.py script catches these.
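The idea behind that kind of scan is simple enough to sketch. This is not the code in scripts/scan-ai-patterns.py, which may check far more than this. It only uses the six words named above, and `find_banned` is a hypothetical helper.

```python
import re

BANNED = ("delve", "leverage", "comprehensive", "robust", "streamline", "empower")

# Match each word at a word boundary plus simple suffixes (delves, robustness).
BANNED_RE = re.compile(r"\b(?:" + "|".join(BANNED) + r")\w*", re.IGNORECASE)


def find_banned(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_word) pairs, with 1-indexed line numbers."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in BANNED_RE.finditer(line):
            hits.append((lineno, match.group(0).lower()))
    return hits


sample = "We delve in.\nAll fine here.\nA Robust, streamlined tool."
for lineno, word in find_banned(sample):
    print(f"line {lineno}: {word}")
```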
INDEX.json is generated. Don't hand-edit agents/INDEX.json or skills/INDEX.json. Run the generation scripts. They parse frontmatter and build the index.
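Roughly, "parse frontmatter and build the index" means something like the sketch below. The real generators may use a YAML parser and a different index schema. The `name` and `description` fields, the output layout, and both function names are assumptions for illustration only.

```python
import json


def parse_frontmatter(markdown: str) -> dict[str, str]:
    """Parse flat 'key: value' pairs between the leading '---' lines."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def build_index(files: dict[str, str]) -> str:
    """Build an index JSON string from a {path: file_contents} mapping."""
    entries = {}
    for path in sorted(files):
        meta = parse_frontmatter(files[path])
        if "name" in meta:  # files without a name are left out of the index
            entries[meta["name"]] = {
                "file": path,
                "description": meta.get("description", ""),
            }
    return json.dumps(entries, indent=2, sort_keys=True)


doc = "---\nname: example-agent\ndescription: Demo agent\n---\n# Body\n"
print(build_index({"agents/example-agent.md": doc}))
```

Since the index is derived entirely from frontmatter, any hand edit gets overwritten on the next run.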
Hooks go in hooks/, sync deploys them. Write your hook in the repo's hooks/ directory. The sync hook copies it to ~/.claude/hooks/ on session start. Register it in settings.json using the $HOME/.claude/hooks/ path.
Scripts are deterministic. If it involves LLM judgment, it's an agent or skill. If it's mechanical (parse files, query a database, validate patterns), it's a script. Don't blur the line.
50ms hook budget. Hooks fire frequently. Keep them fast. Profile if you're not sure. The scripts/benchmark-hooks.py script can help.
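For a quick in-process check before reaching for the benchmark script, you can time the hook's core function directly. This sketch isn't how scripts/benchmark-hooks.py works (its interface isn't shown here). `measure_ms` and `within_budget` are hypothetical helpers, and they leave out Python interpreter startup, which a real hook invocation also pays for.

```python
import statistics
import time
from typing import Callable

BUDGET_MS = 50.0


def measure_ms(func: Callable[[], object], runs: int = 20) -> float:
    """Return the median wall-clock time of func() in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def within_budget(
    func: Callable[[], object], runs: int = 20, budget_ms: float = BUDGET_MS
) -> bool:
    """Return True if the median run time fits inside the budget."""
    return measure_ms(func, runs) <= budget_ms


print(within_budget(lambda: sum(range(1000))))
```

Median rather than mean keeps one slow outlier (a cold cache, a GC pause) from failing the check.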