Version: 1.0 Last Updated: 2026-01-01
This guide explains how to operate the agentic software development system from a user/operator perspective. It covers when to use different workflows, how to interact with the system, and what to expect at each phase.
- System Overview
- Choosing the Right Workflow
- Workflow Reference
- Essential Commands
- Memory Bank Structure
- Task Specifications
- Approval Gates & Sign-offs
- Quality Gates
- Common Patterns
- Troubleshooting
- Nuances & Gotchas
The agentic software development system is a spec-first orchestration framework built on three pillars:
- Memory Bank (agents/memory-bank/)
  - Durable, structured markdown knowledge base
  - Single source of truth for patterns, decisions, and project context
  - Version-controlled alongside code
- Workflows (agents/workflows/)
  - Executable markdown guides that drive agent behavior
  - Four-phase structure: Requirements → Design → Implementation Planning → Execution
  - Each phase has explicit inputs, outputs, and gates
- Task Specs (agents/specs/task-specs/)
  - Per-task specifications capturing the full lifecycle
  - Requirements (EARS format), Design (diagrams + flows), Implementation Planning (tasks + tests), Execution (evidence + reflections)
  - Durable artifacts that enable resumability and context preservation
Plan-first discipline: Agents create specs and obtain approval before implementing. This ensures:
- Durable context that survives across sessions
- Visible approval gates
- Testable acceptance criteria
- Clear traceability from requirements → design → code → tests
When starting work, choose the workflow that matches your task's scope and complexity:
┌─────────────────────────────────────────────────────┐
│ Does this involve multiple workstreams, repos,      │
│ or cross-cutting contracts?                         │
└─────────────────┬───────────────────────────────────┘
                  │
        ┌─────────┴─────────┐
        │                   │
       YES                  NO
        │                   │
        ▼                   ▼
┌───────────────┐  ┌──────────────────┐
│ ORCHESTRATOR  │  │ Is this clearly  │
│ WORKFLOW      │  │ bounded and      │
└───────────────┘  │ small?           │
                   └────────┬─────────┘
                            │
                  ┌─────────┴─────────┐
                  │                   │
                 YES                  NO
                  │                   │
                  ▼                   ▼
           ┌─────────────┐     ┌──────────────┐
           │ ONE-OFF     │     │ ONE-OFF      │
           │ VIBE        │     │ SPEC         │
           │ (no spec)   │     │ (with spec)  │
           └─────────────┘     └──────────────┘
| Scenario | Workflow | Why |
|---|---|---|
| Adding a feature across frontend + backend + DB | Orchestrator | Multiple components, shared contracts |
| Implementing a spec someone else created | Implementer | Executing from approved spec |
| Writing a spec (no code yet) for a workstream | Spec Author | Spec-only deliverable |
| Adding a new API endpoint with tests | One-off Spec | Single bounded change, needs design approval |
| Fixing a typo or small refactor | One-off Vibe | Trivial, no spec overhead needed |
Golden Rule: When in doubt, ask the operator: "Should this be orchestrator mode or one-off?" If one-off: "Spec or vibe?"
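The decision tree above can be sketched as a small function. This is only an illustration: the type and function names (`WorkflowChoice`, `TaskShape`, `chooseWorkflow`) are hypothetical and not part of the agents/ tooling, and the Spec Author and Implementer workflows are left out because they are assigned roles rather than outcomes of the tree.

```typescript
// Hypothetical names for illustration only; nothing here ships with the agents/ scripts.
type WorkflowChoice = "orchestrator" | "oneoff-spec" | "oneoff-vibe";

interface TaskShape {
  // True when the work spans multiple workstreams, repos, or cross-cutting contracts.
  crossCutting: boolean;
  // True when the change is clearly bounded and small.
  boundedAndSmall: boolean;
}

// Mirrors the two questions of the decision tree, in the same order.
function chooseWorkflow(task: TaskShape): WorkflowChoice {
  if (task.crossCutting) {
    return "orchestrator";
  }
  return task.boundedAndSmall ? "oneoff-vibe" : "oneoff-spec";
}
```

Note that the first question wins: a small task that still touches several workstreams resolves to the orchestrator workflow.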
File: agents/workflows/orchestrator.workflow.md
Use when:
- Multiple workstreams (e.g., frontend + backend + infrastructure)
- Cross-cutting contracts or interfaces
- Changes spanning multiple repositories or services
Phases:
- Requirements: Normalize request into ProblemBrief
- Design: Decompose into workstreams, identify contracts
- Implementation Planning: Assign spec authors, merge workstream specs into MasterSpec
- Execution: Gate implementers, track progress, integrate deliverables
Key Outputs:
- ProblemBrief.md: Normalized goals, constraints, success criteria
- Workstream specs (one per workstream)
- MasterSpec.md: Merged spec with gate report
- Contract registry entries (agents/contracts/registry.yaml)
Commands:
# Create problem brief
node agents/scripts/reset-active-context.mjs --slug "feature-name"
# Provision per-workstream worktrees
node agents/scripts/manage-worktrees.mjs ensure --workstreams ws1,ws2,ws3
# Merge workstream specs
node agents/scripts/spec-merge.mjs --specs "ws1.md,ws2.md,ws3.md" --output MasterSpec.md
# Validate specs + Memory Bank references
npm run spec:finalize
Approval Gates:
- After Requirements: Approve ProblemBrief
- After Design: Approve workstream decomposition
- After Implementation Planning: Approve MasterSpec + gate report
- After Execution: Approve final integration
File: agents/workflows/spec-author.workflow.md
Use when:
- Assigned to write a workstream spec (no code implementation)
- Deliverable is a schema-compliant spec only
Phases:
- Requirements: Clarify scope, dependencies, contracts
- Design: Document flows, diagrams, interfaces
- Implementation Planning: Break into tasks, map tests to acceptance criteria
- Execution: Finalize and validate spec (no code)
Key Outputs:
- Workstream spec file (agents/specs/workstream-specs/<name>.md)
- Contract registry entries (if owning shared interfaces)
Commands:
# Validate spec compliance
node agents/scripts/spec-validate.mjs --specs agents/specs/workstream-specs/my-workstream.md
# Check schema compliance (front matter, required sections)
npm run spec:validate
Approval Gates:
- After Implementation Planning: Spec approval before handing off to implementer
File: agents/workflows/implementer.workflow.md
Use when:
- Executing code from an approved MasterSpec or workstream spec
- Translating spec into working implementation
Prerequisites:
- Approved MasterSpec or workstream spec
- Gate report (if from orchestrator mode)
Phases:
- Requirements: Understand assigned workstream scope
- Design: Review flows, contracts, test strategy
- Implementation Planning: Sequence tasks, identify blockers
- Execution: Deliver code, map tests to acceptance criteria, validate
Key Outputs:
- Working code in per-workstream git branch
- Tests with evidence mapped to acceptance criteria
- Updated Memory Bank (if patterns/decisions emerged)
Commands:
# Create per-workstream git worktree
node agents/scripts/create-worktree.mjs --name ws-api --branch feature/api-impl
# Run quality checks (lint + code quality)
npm run phase:check
# Finalize before shipping (format + validate + quality)
npm run agent:finalize
Approval Gates:
- After Implementation Planning: Task sequencing approval
- After Execution: Code review + tests validation
File: agents/workflows/oneoff-spec.workflow.md
Use when:
- Single, bounded change with clear scope
- Needs design approval before implementation
- Not trivial enough for "vibe mode"
Phases:
- Requirements: EARS user stories + acceptance criteria
- Design: Flows, diagrams, edge cases
- Implementation Planning: Task breakdown, test mapping
- Execution: Implement, validate, reflect
Key Outputs:
- Task spec file (agents/specs/task-specs/<YYYY-MM-DD>-<slug>.md)
- Code implementation with tests
- Memory Bank updates (if applicable)
Commands:
# Create task spec
node agents/scripts/reset-active-context.mjs --slug "add-caching" --title "Add response caching"
# Load context for task
node agents/scripts/load-context.mjs --task agents/specs/task-specs/2025-01-01-add-caching.md
# Finalize before shipping
npm run agent:finalize
Approval Pattern:
- After Implementation Planning: Record human approval in "Decision & Work Log" section
- Format: Approval: [Name] approved spec on [date]
Example Flow:
# 1. Create spec
node agents/scripts/reset-active-context.mjs --slug "auth-middleware"
# 2. Fill Requirements phase (EARS + acceptance criteria)
# 3. Fill Design phase (at least one Mermaid diagram)
# 4. Fill Implementation Planning (tasks + test mapping)
# 5. Get approval (record in Decision & Work Log)
# 6. Execute (implement code)
# 7. Map tests to acceptance criteria in Execution log
# 8. Run quality gates
npm run agent:finalize
# 9. Commit
git commit -m "feat: implement auth middleware
- Satisfies AC1, AC2, AC3
- See: agents/specs/task-specs/2025-01-01-auth-middleware.md"
File: agents/workflows/oneoff-vibe.workflow.md
Use when:
- Small, clearly bounded change
- No spec overhead needed
- Examples: typo fixes, small refactors, trivial additions
Phases:
- Intake: Confirm scope is genuinely small
- Execution: Implement + validate
Scope Guardrail: If scope grows during execution, immediately switch to one-off-spec workflow.
Key Outputs:
- Code changes
- Quality gate pass
- Conventional commit message
Commands:
# Just implement directly, then finalize
npm run agent:finalize
# Commit with conventional format
git commit -m "fix: correct typo in README"
No Formal Spec: Approvals and decisions can be recorded in the final response or commit message.
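Because a vibe task has no spec, the commit message carries most of the record. A minimal sketch of a header check follows. It assumes the header shape `type(scope): subject`, and it only accepts the four types this guide shows (feat, fix, docs, refactor), which is an assumption rather than a rule enforced by the repository. The function name is hypothetical.

```typescript
// Assumed type list: only the types that appear in this guide's examples.
const COMMIT_TYPES = ["feat", "fix", "docs", "refactor"];

// Checks only the first line of the message; the body is free-form.
function isConventionalHeader(message: string): boolean {
  const header = message.split("\n")[0] ?? "";
  const pattern = new RegExp(
    `^(${COMMIT_TYPES.join("|")})(\\([a-z0-9-]+\\))?: \\S.*$`
  );
  return pattern.test(header);
}
```

For example, `isConventionalHeader("fix: correct typo in README")` passes, while a header such as "Fixed typo" does not.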
# Load required Memory Bank + workflow files for current task
node agents/scripts/load-context.mjs [--include-optional] [--list] [--task <path>]
# Example: Load context for a specific task spec
node agents/scripts/load-context.mjs --task agents/specs/task-specs/2025-01-01-my-feature.md
# List what would be loaded (dry run)
node agents/scripts/load-context.mjs --list
Always Loaded:
- agents/workflows/oneoff.workflow.md
- agents/workflows/oneoff-spec.workflow.md
- agents/memory-bank/project.brief.md
- agents/memory-bank/operating-model.md
- agents/memory-bank/task-spec.guide.md
- Current task spec (if --task flag provided)
Conditionally Loaded (with --include-optional):
- agents/memory-bank/tech.context.md (only if substantive content)
- agents/memory-bank/best-practices/*.md (matched by domain/tags)
# Create a new per-task spec
node agents/scripts/reset-active-context.mjs --slug "<task-slug>" [--title "..."] [--date YYYY-MM-DD]
# Example:
node agents/scripts/reset-active-context.mjs --slug "add-caching" --title "Add response caching"
# Creates: agents/specs/task-specs/2025-01-01-add-caching.md
# List files recursively with metadata
node agents/scripts/list-files-recursively.mjs --root <path> --pattern <pattern> [--types ts|md|all] [--regex] [--case-sensitive]
# Smart regex search with context lines and numbered output
node agents/scripts/smart-file-query.mjs --regex "<pattern>" [--glob "*.ts"] [--contextLines 3] [--json]
# Read multiple files with line numbers (enables single-pass note taking)
node agents/scripts/read-files.mjs --files "path1.md,path2.md" [--json]
Single-Pass Discipline: These scripts emit line numbers so you can cite path:line without re-reading files. This conserves context and follows workflow discipline.
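To make the path:line idea concrete, here is a minimal sketch of numbering a file's lines once and then citing matches from the numbered copy. It does not reproduce the output format of the scripts above; `numberLines` and `cite` are hypothetical helper names.

```typescript
interface NumberedLine {
  line: number; // 1-based line number
  text: string;
}

// Number the content once so later notes can refer to lines without re-reading the file.
function numberLines(content: string): NumberedLine[] {
  return content.split("\n").map((text, index) => ({ line: index + 1, text }));
}

// Return a "path:line" citation for every line containing the search text.
function cite(path: string, lines: NumberedLine[], needle: string): string[] {
  return lines
    .filter((entry) => entry.text.includes(needle))
    .map((entry) => `${path}:${entry.line}`);
}
```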
# Validate spec compliance (front matter, required sections, registry references)
node agents/scripts/spec-validate.mjs --specs "<path[,path...]>" [--registry agents/contracts/registry.yaml]
# Merge workstream specs into MasterSpec and generate gate report
node agents/scripts/spec-merge.mjs --specs "<path[,path...]>" --output <path> [--registry agents/contracts/registry.yaml]
# Create/manage per-workstream git worktrees
node agents/scripts/manage-worktrees.mjs ensure [--workstreams <ids>]
node agents/scripts/manage-worktrees.mjs list|status|remove|prune
# Create a single git worktree for implementer
node agents/scripts/create-worktree.mjs --name "<workstream-id>" [--branch "<branch-name>"] [--base "<git-ref>"]
# Run all quality checks: format markdown + validate Memory Bank + lint + code quality
npm run agent:finalize
# Format markdown files under agents/
npm run format:markdown
# Validate Memory Bank: ensure referenced paths exist
npm run memory:validate
# Run linting fix + code quality check
npm run phase:check
# Spec-specific validation
npm run spec:finalize # Validate specs + Memory Bank references
npm run spec:validate # Validate spec compliance only
npm run spec:merge # Merge specs and generate gate report
# Capture diff with line numbers for verification reports
node agents/scripts/git-diff-with-lines.mjs [--cached]
The Memory Bank (agents/memory-bank/) is the single source of truth for durable knowledge.
| File | Purpose |
|---|---|
| memory-bank.md | Overview, retrieval policy (canonical for discovery rules) |
| operating-model.md | Four-phase loop expectations, artifact locations, tool references |
| task-spec.guide.md | Template and guidance for per-task specs |
| project.brief.md | High-level project context (filled in over time) |
| tech.context.md | Stack, tooling, entrypoints (include only if substantive) |
| Path | Purpose |
|---|---|
| spec-orchestration.design.md | Detailed spec-first pipeline, workstream decomposition |
| testing.guidelines.md | Testing boundaries, dependency injection, evidence mapping |
| best-practices/software-principles.md | General design principles (SoC, DRY, composition) |
| best-practices/typescript.md | TypeScript-specific patterns |
| best-practices/<domain>.md | Domain-specific reusable guidance |
Always include when loading context:
- agents/workflows/oneoff.workflow.md
- agents/workflows/oneoff-spec.workflow.md
- agents/memory-bank/project.brief.md
- agents/memory-bank/operating-model.md
- agents/memory-bank/task-spec.guide.md
- Current task spec (if it exists) via --task flag
Optional (gate by substance):
- agents/memory-bank/tech.context.md (only if non-placeholder content)
- agents/memory-bank/best-practices/*.md (match by domain/tags in front matter)
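The retrieval policy above can be sketched as a selection function. This is an illustration of the rule, not the logic of load-context.mjs. The type names, the `substantive` flag, and the tag-matching rule (any shared tag) are assumptions made for the example.

```typescript
// Paths copied from the "always include" list above.
const ALWAYS_LOADED: string[] = [
  "agents/workflows/oneoff.workflow.md",
  "agents/workflows/oneoff-spec.workflow.md",
  "agents/memory-bank/project.brief.md",
  "agents/memory-bank/operating-model.md",
  "agents/memory-bank/task-spec.guide.md",
];

// Hypothetical model of the two kinds of optional files.
type OptionalFile =
  | { kind: "tech-context"; path: string; substantive: boolean }
  | { kind: "best-practice"; path: string; tags: string[] };

interface ContextRequest {
  taskSpec?: string; // path given via --task, if any
  includeOptional: boolean; // corresponds to --include-optional
  taskTags: string[]; // assumed: tags describing the current task
}

function selectContext(request: ContextRequest, optional: OptionalFile[]): string[] {
  const selected = [...ALWAYS_LOADED];
  if (request.taskSpec !== undefined) {
    selected.push(request.taskSpec);
  }
  if (!request.includeOptional) {
    return selected;
  }
  for (const file of optional) {
    // tech.context.md is gated on substance; best-practice files on a tag match.
    const include =
      file.kind === "tech-context"
        ? file.substantive
        : file.tags.some((tag) => request.taskTags.includes(tag));
    if (include) {
      selected.push(file.path);
    }
  }
  return selected;
}
```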
Every task gets a per-task spec file at agents/specs/task-specs/<YYYY-MM-DD>-<slug>.md.
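As a small illustration of the naming convention, the path can be derived from a slug and a date. The sketch assumes the date is taken in UTC; how reset-active-context.mjs picks the date is not specified here, and `taskSpecPath` is a hypothetical helper.

```typescript
// Builds agents/specs/task-specs/<YYYY-MM-DD>-<slug>.md from a slug and a date.
// Assumption: the date portion is the UTC calendar date.
function taskSpecPath(slug: string, date: Date): string {
  const day = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return `agents/specs/task-specs/${day}-${slug}.md`;
}
```

With the slug "add-caching" and 1 January 2025, this yields agents/specs/task-specs/2025-01-01-add-caching.md, which matches the example path used elsewhere in this guide.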
- EARS-formatted user stories + acceptance criteria (atomic, testable)
- Non-goals, constraints, risks, invariants
- Impacted components, interfaces, candidate files/tests to touch
- Retrieval sources consulted
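EARS (Easy Approach to Requirements Syntax) format: each requirement follows a keyword pattern such as "When <trigger>, the <system> shall <response>" and stays atomic and testable.
- ✅ "When the user clicks 'Save', the endpoint shall return 201 within 200 ms"
- ❌ "System saves data quickly"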
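A rough way to review requirements is to check which EARS (Easy Approach to Requirements Syntax) keyword pattern a sentence opens with and whether it states anything measurable. The sketch below is a heuristic under stated assumptions: it looks only at the leading keyword, and it treats "contains a number" as a stand-in for measurable. Neither check is part of the spec validation scripts, and the function names are hypothetical.

```typescript
type EarsPattern =
  | "event-driven" // When <trigger>, ...
  | "state-driven" // While <state>, ...
  | "optional-feature" // Where <feature is included>, ...
  | "unwanted-behaviour" // If <condition>, then ...
  | "ubiquitous"; // no leading keyword

// Classify by the leading EARS keyword only; this does not parse the full sentence.
function classifyEars(requirement: string): EarsPattern {
  const text = requirement.trim().toLowerCase();
  if (text.startsWith("when ")) return "event-driven";
  if (text.startsWith("while ")) return "state-driven";
  if (text.startsWith("where ")) return "optional-feature";
  if (text.startsWith("if ")) return "unwanted-behaviour";
  return "ubiquitous";
}

// Heuristic (assumption): a requirement with a number in it is more likely testable.
function hasMeasurableOutcome(requirement: string): boolean {
  return /\d/.test(requirement);
}
```

Applied to the two examples above, the first is event-driven with a measurable outcome, while "System saves data quickly" falls through to the ubiquitous pattern and has nothing measurable to test.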
- Architecture notes (logical, data, control flows)
- At least one Mermaid sequence diagram for the primary path
- Interfaces/contracts, data shapes, edge/failure behaviors
- Performance, security, migration considerations
- Discrete tasks with outcomes, dependencies, owners
- Non-primitive fields and storage format definitions (if applicable)
- Test-to-acceptance-criteria traceability (each AC has planned verification)
- Documentation updates needed (user/dev/runbook/README and target files)
- Memory Bank canonical updates needed (which files and why)
- Sequencing, blockers, checkpoints
- Progress log (updates as reality changes)
- Evidence/tests tied to acceptance criteria
- Follow-ups and adjustments to the spec
- Final reflections
After each phase, log a reflection in the task spec and record approvals in the Decision & Work Log section:
## Decision & Work Log
### [Phase Name] Phase
- **Decision**: [What was decided]
- **Approval**: [Who approved and when]
- **Work Log**: [Progress notes this phase]
| Workflow | Gate Point | What Needs Approval |
|---|---|---|
| Orchestrator | After Requirements | ProblemBrief |
| Orchestrator | After Design | Workstream decomposition |
| Orchestrator | After Implementation Planning | MasterSpec + gate report |
| Spec Author | After Implementation Planning | Workstream spec |
| Implementer | After Implementation Planning | Task sequencing |
| One-off Spec | After Implementation Planning | Task spec |
Approvals must be recorded in the Decision & Work Log section of the spec:
## Decision & Work Log
### Implementation Planning Phase
- **Approval**: John Smith approved spec on 2025-01-01 at 14:30
- **Rationale**: Reviewed task breakdown, test mapping, and Memory Bank update plan. All acceptance criteria have corresponding tests.
Critical: Do not proceed to Execution phase without recorded approval for spec-based workflows.
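The gate can be sketched as a check over the Decision & Work Log text. This assumes approvals are written as shown above, with an ISO date (YYYY-MM-DD) after "approved spec on". The names `findApproval` and `canEnterExecution` are hypothetical, and no script in this guide is documented as performing this check.

```typescript
type WorkflowKind =
  | "orchestrator"
  | "spec-author"
  | "implementer"
  | "oneoff-spec"
  | "oneoff-vibe";

interface Approval {
  approver: string;
  date: string; // YYYY-MM-DD
}

// Assumed line shape: "- **Approval**: <name> approved spec on <YYYY-MM-DD> ..."
const APPROVAL_LINE = /\*\*Approval\*\*: (.+?) approved spec on (\d{4}-\d{2}-\d{2})/;

function findApproval(workLog: string): Approval | undefined {
  for (const line of workLog.split("\n")) {
    const match = APPROVAL_LINE.exec(line);
    if (match !== null) {
      const [, approver = "", date = ""] = match;
      return { approver, date };
    }
  }
  return undefined;
}

// Vibe work has no formal gate; every spec-based workflow needs a recorded approval.
function canEnterExecution(workflow: WorkflowKind, workLog: string): boolean {
  if (workflow === "oneoff-vibe") {
    return true;
  }
  return findApproval(workLog) !== undefined;
}
```

An unfilled template line such as "[Name] approved spec on [date]" does not count as an approval here, because the date placeholder fails the date pattern.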
Before shipping, all work must pass quality gates:
Run before creating PR or final commit:
npm run agent:finalize
This runs:
- Markdown formatting (npm run format:markdown)
- Memory Bank validation (npm run memory:validate): ensures all inline code paths exist
- Linting (npm run phase:check): code quality checks
- Spec validation (if specs were modified): schema compliance
- Task spec has all four phases complete
- EARS user stories + acceptance criteria are specific and testable
- Design includes at least one Mermaid sequence diagram
- Implementation Planning maps tests/evidence to each acceptance criterion
- All code changes tested and evidence cited in Execution log
- Decision & Work Log includes all approvals
- Memory Bank canonicals updated if needed
- npm run memory:validate passes
- npm run agent:finalize passes
- Commit message follows conventional commits format
- Reflection captured in task spec
# 1. Create task spec
node agents/scripts/reset-active-context.mjs --slug "my-feature" --title "Add feature X"
# 2. Load context
node agents/scripts/load-context.mjs --task agents/specs/task-specs/2025-01-01-my-feature.md
# 3. Follow oneoff-spec workflow
# - Fill Requirements (EARS + acceptance criteria)
# - Add Design (diagrams, flows, edge cases)
# - Create Implementation Planning (task breakdown, test mapping)
# - Request approval
# 4. Execute against spec
# - Implement code
# - Map tests to acceptance criteria
# - Update Memory Bank if needed
# - Run quality gates
# 5. Finalize
npm run agent:finalize
# 6. Propose commit message
git commit -m "feat: implement X
- Added feature X with tests
- Satisfies acceptance criteria AC1, AC2, AC3
- See: agents/specs/task-specs/2025-01-01-my-feature.md"
# 1. Clarify request into ProblemBrief
node agents/scripts/reset-active-context.mjs --slug "multi-ws-feature"
# 2. Decompose into workstreams
# - Which teams/components own each part?
# - What contracts are shared?
# 3. Assign spec authors
# - Each gets: scope, dependencies, contract expectations
# - Reference: agents/workflows/spec-author.workflow.md
# 4. Collect workstream specs
# - Authors deliver validated specs
# 5. Merge and gate
node agents/scripts/spec-merge.mjs --specs "ws1.md,ws2.md,ws3.md" --output MasterSpec.md
# 6. Approve MasterSpec
# - Review gate report
# - Record approval in Decision & Work Log
# 7. Hand off to implementers
# - Each implementer uses: agents/workflows/implementer.workflow.md
# - Reference MasterSpec + their workstream spec
If scope grows during a one-off-vibe task:
# 1. Stop execution immediately
# 2. Create a proper task spec
node agents/scripts/reset-active-context.mjs --slug "original-task"
# 3. Backfill Requirements and Design from work done so far
# 4. Complete Implementation Planning
# 5. Get approval
# 6. Resume execution in spec-based mode
# Check which validation failed
npm run spec:validate
# Common causes:
# - Missing front matter (title, date, phase)
# - Missing required sections (Requirements, Design, etc.)
# - Invalid registry references
Fix: Review spec template and ensure all required sections are present.
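To show what the first two causes look like in practice, here is a minimal sketch of a front matter and section check. It is not the logic of spec-validate.mjs. It assumes front matter is delimited by `---` lines with `key: value` pairs, that the required keys are title, date, and phase as listed above, and that each of the four phases appears as a markdown heading starting with its name.

```typescript
const REQUIRED_KEYS = ["title", "date", "phase"];
// Assumption: each phase appears as a heading that starts with its name.
const REQUIRED_SECTIONS = ["Requirements", "Design", "Implementation Planning", "Execution"];

// Returns a list of human-readable problems; an empty list means the checks passed.
function validateSpecOutline(markdown: string): string[] {
  const problems: string[] = [];
  const lines = markdown.split("\n");

  // Collect keys from a leading front matter block delimited by "---".
  const keys = new Set<string>();
  if (lines[0]?.trim() === "---") {
    for (const line of lines.slice(1)) {
      if (line.trim() === "---") break;
      const key = line.split(":")[0]?.trim();
      if (key) keys.add(key);
    }
  }
  for (const key of REQUIRED_KEYS) {
    if (!keys.has(key)) problems.push(`missing front matter key: ${key}`);
  }

  // Collect heading text with the leading "#" markers removed.
  const headings = lines
    .filter((line) => line.startsWith("#"))
    .map((line) => line.replace(/^#+\s*/, "").trim());
  for (const section of REQUIRED_SECTIONS) {
    if (!headings.some((heading) => heading.startsWith(section))) {
      problems.push(`missing section: ${section}`);
    }
  }
  return problems;
}
```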
# See which paths are invalid
npm run memory:validate
# Common causes:
# - Inline code paths in backticks don't exist in repo
# - File was moved/renamed but markdown not updated
Fix: Update markdown references or restore missing files.
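The failure mode is easier to see with a small sketch of the idea: pull backticked paths out of a markdown file and report the ones that are not in the repository. The heuristic for what counts as a path (contains a slash, no spaces) and the in-memory set of known files are assumptions for the example, and the actual memory:validate implementation may differ.

```typescript
// Extract inline code spans that look like repository paths.
function extractInlinePaths(markdown: string): string[] {
  const paths: string[] = [];
  const pattern = /`([^`\n]+)`/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const candidate = match[1] ?? "";
    // Heuristic (assumption): a path contains "/" and no spaces, so commands are skipped.
    if (candidate.includes("/") && !candidate.includes(" ")) {
      paths.push(candidate);
    }
  }
  return paths;
}

// Report referenced paths that are missing from the set of files known to exist.
function findMissingPaths(markdown: string, existing: Set<string>): string[] {
  return extractInlinePaths(markdown).filter((path) => !existing.has(path));
}
```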
Symptoms: Agent repeatedly re-reading same files, losing context of earlier decisions.
Fix:
- Use single-pass discipline: node agents/scripts/load-context.mjs --task <path>
- Take numbered notes: scripts emit line numbers, cite path:line
- Delegate to sub-agents for isolated exploration
- Use --include-optional only when truly needed
Symptoms: Execution log shows tests but no clear traceability to ACs.
Fix:
- In Implementation Planning, create explicit test-to-AC mapping:
| AC | Test Name | Evidence Location |
|----|-----------|-------------------|
| AC1: Returns 201 within 200ms | test_save_returns_201_under_200ms | tests/api.test.ts:45 |
- In Execution log, cite this mapping when reporting test results
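A mapping like this is easy to check mechanically. The sketch below assumes each table row is held as a record with an AC id, a test name, and an evidence location. The names are hypothetical and no such check is documented for the quality gates.

```typescript
interface TestMapping {
  ac: string; // acceptance criterion id, e.g. "AC1"
  testName: string;
  evidence: string; // path:line of the test
}

// Returns the acceptance criteria that have no mapped test.
function uncoveredCriteria(acIds: string[], mappings: TestMapping[]): string[] {
  const covered = new Set(mappings.map((mapping) => mapping.ac));
  return acIds.filter((id) => !covered.has(id));
}
```

An empty result means every AC has at least one planned or executed test to cite in the Execution log.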
Question: "I've finished Implementation Planning. Can I start coding?"
Answer:
- Spec-based workflows (one-off-spec, spec-author, implementer, orchestrator): NO. You need recorded approval in Decision & Work Log first.
- Vibe workflow: YES. No formal approval gate.
How to get approval:
- Present completed Implementation Planning to operator
- Operator reviews and approves
- Record approval in Decision & Work Log:
- **Approval**: [Operator name] approved spec on [date]
Even small tasks can be orchestrator if multiple workstreams are needed. Do not guess; ask the operator.
You cannot move from Implementation Planning → Execution without recorded approval in the Decision & Work Log (for spec-based work).
If reality changes during Execution, update the spec. The spec is the source of truth, not a static plan. Capture deviations and rationale in Execution log.
Context is scarce. Load once, take numbered notes, cite line numbers. Repeated pulls waste cycles and violate workflow discipline.
Decompose by interface boundaries, not org structure. Each workstream should own or depend on a clear contract.
Changes to canonicals under agents/memory-bank/ should go through PR/commit review, not be auto-merged. These are durable knowledge.
If scope grows during a one-off-vibe task, stop and switch to one-off-spec immediately. Backfill the spec from the work done so far and get approval before resuming; do not finish in vibe mode and retrofit a spec afterwards.
"Acceptance criterion AC1: System returns 200 within 100ms" → "Test: test_save_returns_200_under_100ms" → Evidence in Execution log.
The Decision & Work Log is where approvals and key decisions live. It is separate from phase reflections. Make it readable and explicit.
Should not error before shipping. Run it locally before PR, run it in CI as a merge gate.
Every spec/workflow completion should propose a conventional commit:
# Features
git commit -m "feat(agents): implement orchestrator workflow"
# Fixes
git commit -m "fix(agents): correct task-spec.guide examples"
# Documentation
git commit -m "docs(agents): clarify retrieval policy"
# Refactoring
git commit -m "refactor(agents): simplify spec-merge logic"
Multi-line commits (preferred for spec-based work):
git commit -m "feat: implement auth middleware
- Added JWT validation middleware
- Satisfies acceptance criteria AC1, AC2, AC3
- Updated Memory Bank: best-practices/auth.md
- See: agents/specs/task-specs/2025-01-01-auth-middleware.md"
File: agents/contracts/registry.yaml
Tracks shared interfaces and contracts across workstreams:
- id: contract-api-v1
  type: api-interface
  path: agents/contracts/api-v1.contract.md
  owner: ws-api # Which workstream owns this contract
  version: 1
Each contract has a markdown file documenting the interface:
# API Contract v1
## Request Format
...
## Response Format
...
## Error Cases
...
Use npm run spec:finalize to validate registry references.
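What a registry reference check involves can be sketched with the entry shape from the YAML example above. The field names come from that example. The functions are hypothetical and assume the registry has already been parsed into objects, so no YAML library is needed here.

```typescript
// Shape of one registry entry, following the fields in the YAML example.
interface ContractEntry {
  id: string;
  type: string;
  path: string;
  owner: string; // workstream that owns the contract
  version: number;
}

// Contract ids referenced by a spec that do not exist in the registry.
function unknownContractRefs(referenced: string[], registry: ContractEntry[]): string[] {
  const known = new Set(registry.map((entry) => entry.id));
  return referenced.filter((id) => !known.has(id));
}

// Contracts a given workstream is responsible for.
function contractsOwnedBy(workstream: string, registry: ContractEntry[]): ContractEntry[] {
  return registry.filter((entry) => entry.owner === workstream);
}
```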
This system prioritizes:
- Durable context: Memory Bank keeps knowledge close to code
- Visible gates: Specs enforce clarity before implementation
- Spec-first discipline: Agents follow the four-phase loop with explicit approval gates
- Choose the right workflow upfront: ask the agent when in doubt
- Approve specs before execution: this is your control point
- Trust the quality gates: npm run agent:finalize must pass before shipping
- Treat Memory Bank as canonical: changes should be reviewed like code
- Test traceability is non-negotiable: every AC needs evidence
For questions or issues, consult:
- Workflow files: agents/workflows/*.md
- Memory Bank: agents/memory-bank/*.md
- Task spec guide: agents/memory-bank/task-spec.guide.md
End of Operator's Guide