Workflow
How to use agent-notes in a session. The lead orchestrator plans, decomposes, delegates to specialized agents, reviews, and verifies — it never writes code or greps files itself.
Every session follows five phases, numbered 0 through 4:
Phase 0: Plan & Approval → Phase 1: Analysis → Phase 2: Execution → Phase 3: Review → Phase 4: Verification
Before any side-effecting action, produce a plan and get user approval.
- Restate the request in your own words. State acceptance criteria.
- Decompose into discrete, independently verifiable subtasks. Identify dependencies.
- Discover: if context is thin (unknown files, conventions, test coverage), dispatch `explorer` first. Don't guess.
- Clarify: if a real ambiguity remains that only the user can resolve, ask one focused question and stop.
- Present the plan: acceptance criteria, subtasks with assigned agents, files to touch, verification strategy, risks, out-of-scope items.
- Wait for explicit user approval ("go", "yes", "ok"). Silence does not count.
Trivial-request exemption: No plan needed for read-only responses (questions, recalls, git status) or purely mechanical one-line changes (typos, version bumps) with no behavioral impact.
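
The approval rule can be sketched as a small gate. This is a minimal illustration, not part of agent-notes: the function name `is_explicit_approval` is hypothetical, and the accepted words are only the three examples listed above.

```python
# Hypothetical helper; the accepted words are the examples given above.
APPROVAL_WORDS = {"go", "yes", "ok"}


def is_explicit_approval(reply):
    """Return True only for an explicit approval.

    None or a blank reply (silence) never counts as approval.
    """
    if reply is None:
        return False
    # Ignore surrounding whitespace, trailing punctuation, and letter case.
    normalized = reply.strip().strip(".!").lower()
    return normalized in APPROVAL_WORDS
```

Anything that is not a clear approval, including an empty reply, keeps the session in Phase 0.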
Before touching any tool, analyze internally:
| Step | Question |
|---|---|
| Intent | What is the user actually asking? Bug fix, feature, refactor, audit, question? |
| Scope | Trivial (do it yourself), Small (1-2 agents), Medium (plan + 2-4 agents), Large (full plan + parallel agents)? |
| Decompose | Break into discrete subtasks. What files? What output? Hidden subtasks (tests, migrations)? |
| Dependencies | Which subtasks are independent (parallel)? Which are sequential? |
| Assign | Cheapest agent that can do the job (see Cost-Efficient Delegation) |
Delegate to agents following the dependency graph. Run independent subtasks in parallel.
- Broad tasks (whole codebase, audits): skip self-exploration, delegate immediately to specialists in parallel.
- Narrow tasks (known files, specific questions): one `explorer` call for discovery, then targeted agents.
- Batch related edits: one `coder` call with 5 file edits beats 5 `coder` calls with 1 edit each.
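
Following the dependency graph amounts to a topological ordering in which each "wave" of ready subtasks runs in parallel. The sketch below uses Python's standard `graphlib`; the function `parallel_waves` and the subtask names in `plan` are hypothetical and only illustrate the idea.

```python
from graphlib import TopologicalSorter


def parallel_waves(dependencies):
    """Group subtasks into waves; all subtasks in one wave can run in parallel.

    `dependencies` maps each subtask to the set of subtasks it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical plan for illustration only.
plan = {
    "discover": set(),
    "implement": {"discover"},
    "review": {"implement"},
    "security-audit": {"implement"},
    "write-tests": {"implement"},
}

for number, wave in enumerate(parallel_waves(plan), start=1):
    print(number, wave)
```

In this plan the three post-implementation subtasks land in the same wave, so they would be dispatched together rather than one after another.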
After implementation, before verification. Skip for mechanical changes (typos, config values, formatting).
- Send changed files to `reviewer`. If security-sensitive, also `security-auditor` in parallel. If DB changes, also `database-specialist`.
- Read the reviewer's output. Make your own judgment: not every suggestion is worth implementing.
- Approve or return with prioritized feedback to the same `coder` session.
- Maximum 2 review rounds. Perfection is the enemy of done.
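
The two-round cap can be read as a bounded loop. The following is a rough sketch under assumptions: `review` and `apply_feedback` are hypothetical callables standing in for the reviewer dispatch and the return trip to the same `coder` session, and nothing here reflects how agent-notes implements it.

```python
MAX_REVIEW_ROUNDS = 2  # the cap stated above


def review_loop(review, apply_feedback, max_rounds=MAX_REVIEW_ROUNDS):
    """Run review rounds until the work is approved or the cap is reached.

    review() returns the feedback items the lead judged worth acting on;
    an empty list means approved.
    apply_feedback(items) returns them to the same coder session.
    Returns (approved, rounds_used).
    """
    for round_number in range(1, max_rounds + 1):
        feedback = review()
        if not feedback:
            return True, round_number
        apply_feedback(feedback)
    # Cap reached: stop iterating and move on to verification.
    return False, max_rounds
```

Returning `False` after the cap does not block the session; it only records that the last round still produced feedback.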
Never declare done without verification.
- Review each agent's output — approve or reject. Re-delegate rejections to the same agent session.
- Run tests if code was changed. Escalate to `test-runner` only if tests fail and need diagnosis.
- Check cross-agent consistency: if multiple agents touched related code, verify no conflicts.
- Verify against the original request — re-read the user's prompt. Does the result satisfy every acceptance criterion?
Common patterns for delegating work. Each pipeline defines who does what and in what order.
explorer (discovery) → coder (implementation) → [reviewer, test-writer, security-auditor] → tech-writer (if user-facing)
explorer (reproduce + locate) → coder (minimal fix + regression test) → reviewer (verify)
[system-auditor, performance-profiler, security-auditor, database-specialist, api-reviewer] → lead synthesizes
No coder — audit pipelines produce findings, not fixes.
devops (implementation) → [reviewer, security-auditor]
explorer → lead answers
debugger (investigate root cause, Opus) → coder (apply fix, Sonnet) → reviewer (verify)
The agent system uses tiered models to minimize cost. Always pick the cheapest agent that can do the job.
| Tier | Agents | Cost | Use for |
|---|---|---|---|
| Scout (Haiku) | explorer, tech-writer, analyst | ~$0.25/M tokens | File discovery, pattern search, docs |
| Worker (Sonnet) | coder, reviewer, test-writer, security-auditor, devops, etc. | ~$3/M tokens | Implementation, review, tests |
| Reasoner (Opus) | architect, debugger | ~$15/M tokens | Deep design, complex root-cause analysis |
Rules of thumb:
- One `explorer` call reading 20 files costs less than the lead reading one file
- Never use `coder` for read-only analysis; use `reviewer` or `explorer`
- Never use Opus when Sonnet suffices; reserve `architect` and `debugger` for problems requiring multi-step reasoning
- Combine related subtasks into one agent call; don't spawn one agent per bullet point
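
To make the tier gap concrete, here is a small worked calculation. It treats the approximate figures in the tier table as flat per-token rates (an assumption; the table does not separate input from output pricing), and the 200,000-token task size is hypothetical.

```python
# Approximate prices per million tokens, copied from the tier table above.
PRICE_PER_MILLION_USD = {"haiku": 0.25, "sonnet": 3.00, "opus": 15.00}


def estimated_cost_usd(model, tokens):
    """Estimate cost as a flat rate: price per million tokens times token count."""
    return PRICE_PER_MILLION_USD[model] * tokens / 1_000_000


tokens = 200_000  # hypothetical size of one discovery task
for model in PRICE_PER_MILLION_USD:
    print(f"{model}: ${estimated_cost_usd(model, tokens):.2f}")
```

Under these assumptions the same task costs about $0.05 on Haiku, $0.60 on Sonnet, and $3.00 on Opus, a 60x spread between the cheapest and most expensive tier.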
| Task | Agent | Why |
|---|---|---|
| Find files, search patterns | `explorer` | Cheap (Haiku), fast, read-only |
| Implement features, fix bugs | `coder` | Writes files, full tool access |
| Code quality review | `reviewer` | Read-only analysis, Sonnet |
| Security concerns | `security-auditor` | Specialized for auth/input/data |
| Investigate bugs | `debugger` then `coder` | Debugger investigates (Opus), coder fixes (Sonnet) |
| Write tests | `test-writer` | Reads source, detects framework |
| Fix failing tests | `test-runner` | Runs tests, parses errors, applies fix |
| Documentation | `tech-writer` | READMEs, API docs, changelogs |
| Infrastructure | `devops` | Docker, CI/CD, deployment |
| Schema/queries | `database-specialist` | Migrations, indexes, N+1 queries |
| System design | `architect` | Module boundaries, domain models |
| Challenge a plan | `devil` | Surfaces risks, hidden assumptions |
| Requirements | `analyst` | User stories, acceptance criteria |
Memory persists discoveries across sessions. The lead updates the session note at every state change.
| Event | Action |
|---|---|
| Session start | Create session note: `agent-notes memory add "<description>" "<scope>" session lead` |
| After each phase completes | Append update: `agent-notes memory add "<description>" "Phase N — what shipped, files touched" session lead` |
| Non-obvious discovery | Write separate note: `agent-notes memory add "<title>" "<body>" pattern\|decision\|mistake\|context lead` |
| Scope change or redirect | Append update capturing new direction and rationale |
- Pattern — reusable solution or technique discovered in the codebase
- Decision — architectural choice with rationale (non-obvious tradeoffs)
- Mistake — recurring error to avoid
- Context — project background, constraints, stakeholder notes

Do not save:
- Things derivable by reading code (`git log`, `grep`)
- Standard framework behavior
- In-progress task state (use tasks for that)
- Information already in docs
See Memory-Storage for storage mode details.
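
As a sketch of how a wrapper script might assemble the `agent-notes memory add` call, the helper below builds the argument list in the order shown in the event table (title, body, note type, agent). The helper name `memory_add_command` and the example title and body are hypothetical; no flags beyond those in the table are assumed, and the command is only printed, not executed.

```python
import shlex

# Note types that appear in the memory commands above.
NOTE_TYPES = {"session", "pattern", "decision", "mistake", "context"}


def memory_add_command(title, body, note_type, agent="lead"):
    """Build the argv list for `agent-notes memory add` without running it."""
    if note_type not in NOTE_TYPES:
        raise ValueError(f"unknown note type: {note_type}")
    return ["agent-notes", "memory", "add", title, body, note_type, agent]


# Hypothetical note content for illustration only.
argv = memory_add_command(
    "Rate limiter config",
    "Limits live in one config module",
    "decision",
)
print(shlex.join(argv))  # shell-quoted form, safe to paste into a terminal
```

Building a list rather than a single string keeps titles and bodies containing spaces or quotes intact if the list is later passed to `subprocess.run`.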
Skills are on-demand workflows invoked with /skill-name. They load structured instructions for specific tasks.
| Situation | Skill |
|---|---|
| Writing a feature test-first | /tdd |
| Something is broken | /debugging-protocol |
| Cleaning up code | /refactoring-protocol |
| Reviewing code quality | /code-review |
| Stress-testing a plan | /grill-me |
| Comparing approaches | /brainstorming |
| Committing changes | /git |
| Saving a discovery | /obsidian-memory |
| Reducing token usage | /caveman |
See Skills for the full list.
A typical feature implementation session:
User: "Add rate limiting to the API endpoints"
Lead (Phase 0 — Plan):
1. explorer → discover existing middleware, API routes, test patterns
2. coder → implement rate limiting middleware + config
3. [reviewer, security-auditor, test-writer] → parallel review and tests
4. Verify: run tests, check cross-agent consistency
User: "go"
Lead (Phase 2 — Execute):
→ Dispatches explorer (Haiku) — finds middleware dir, route files, test setup
→ Dispatches coder (Sonnet) — implements middleware, wires into routes, adds config
→ Dispatches reviewer + security-auditor + test-writer in parallel (Sonnet)
Lead (Phase 3 — Review):
→ Reads reviewer findings, approves 2 of 3 suggestions
→ Sends feedback to coder for one adjustment
→ Coder applies fix
Lead (Phase 4 — Verify):
→ Runs test suite — all pass
→ Checks cross-agent consistency — no conflicts
→ Updates session memory
→ Reports done with summary
| Anti-pattern | Do this instead |
|---|---|
| Lead reads source files | Dispatch `explorer` |
| Lead runs `grep` / `find` | Dispatch `explorer` |
| Lead writes or edits files | Dispatch `coder` / `tech-writer` |
| One agent per bullet point | Combine related subtasks into one agent call |
| Using Sonnet for pure discovery | Use explorer (Haiku) |
| Re-exploring after agent returned the answer | Trust the report |
| Skipping tests before reporting done | Always run tests if code changed |
| Reporting "done" without updating session memory | Update memory at every state change |