diff --git a/docs/plans/2026-04-06-v09-formulas-polish.md b/docs/plans/2026-04-06-v09-formulas-polish.md new file mode 100644 index 0000000..4bbcb41 --- /dev/null +++ b/docs/plans/2026-04-06-v09-formulas-polish.md @@ -0,0 +1,172 @@ +# v0.9 — Formulas + Polish + +## Context + +All four roles are wired (concierge v0.5, architect v0.4, expert v0.2, researcher v0.8). The architect currently sends tasks one at a time via `send_task`. Recurring multi-step patterns (feature implementation, bug triage, code review) require manually re-creating the same dependency graph each time. Formulas codify these patterns as reusable TOML templates that the architect instantiates in one call, bulk-creating tasks with correct dependency edges. The daemon's existing taskboard + `EvaluateDeps` handles the rest. + +Additionally: config changes currently require daemon restart, and there's no hardening around partial writes for config files. + +## Phase 1: Formula Parsing — `internal/formula/` + +New package. Minimal structs, TOML parsing, DAG validation. + +### Files + +- **`internal/formula/formula.go`** — `Formula`, `Step` structs + `Load(path)` + `LoadAll(dir)` + `Validate()` +- **`internal/formula/formula_test.go`** — happy path, cycle detection, duplicate IDs, missing depends_on refs, empty steps + +### Design + +```go +type Step struct { + ID string `toml:"id"` + Role string `toml:"role"` // "concierge", "architect", or expert name + Title string `toml:"title"` + Description string `toml:"description"` + DependsOn []string `toml:"depends_on"` +} + +type Formula struct { + Description string `toml:"description"` + Steps []Step `toml:"steps"` +} +``` + +- `Load(path string) (*Formula, error)` — parse single TOML file, call `Validate()` +- `LoadAll(dir string) (map[string]*Formula, error)` — scan `*.toml`, key by filename sans extension +- `Validate(f *Formula) error` — checks: + - At least one step + - No duplicate step IDs + - All `depends_on` refs point to valid step IDs within the formula + - No cycles (Kahn's algorithm, same approach as `taskboard.DetectCycles`) + - Non-empty `id`, `role`, `title` on each step + +Uses `github.com/BurntSushi/toml` (already a dependency via `config`). + +## Phase 2: Formula Instantiation — architect MCP tool + +New `instantiate_formula` tool registered in `RegisterArchitectTools`. + +### Files + +- **`internal/mcp/architect_tools.go`** — add `instantiate_formula` tool registration + handler +- **`internal/mcp/architect_tools_test.go`** — test instantiation produces correct messages + +### Tool Schema + +```text +instantiate_formula + formula: string (required) — formula name (filename without .toml) + prefix: string (required) — ID prefix for generated tasks (e.g., "feat-auth") + overrides: string (optional) — JSON object: step ID → custom body text + experts: string (optional) — JSON object: step ID → specific expert name +``` + +### Handler Logic + +1. Load formula from `{poolDir}/formulas/{formula}.toml` via `formula.Load()` +2. Parse `overrides` and `experts` JSON if provided +3. For each step, build a `mail.Message`: + - `ID` = `{prefix}-{step.id}` + - `From` = `"architect"` + - `To` = step's role, overridden by `experts[step.id]` if present + - `Type` = `mail.TypeTask` + - `Body` = `overrides[step.id]` if present, else step's description (prepend step title as heading) + - `DependsOn` = step's `depends_on` with prefix applied (e.g., `["gather"]` → `["feat-auth-gather"]`) + - `Priority` = `mail.PriorityNormal` + - `Timestamp` = `time.Now().UTC()` +4. Post all messages via `postMessage(poolDir, msg)` +5. Return summary: formula name, N tasks created, task IDs + +### Validation + +- Formula must exist +- Steps with `role = "experts"` MUST have an entry in `experts` map (architect provides the name) +- Prefix must be filename-safe (same check as message ID: `filepath.Base(prefix) == prefix`) + +## Phase 3: Config Hot-Reload + +Watch `pool.toml` for changes, re-parse, validate, swap config under lock. + +### Files + +- **`internal/daemon/daemon.go`** — add `reloadConfig()` method, handle pool.toml events in `Run()` +- **`internal/daemon/watcher.go`** — extend `Run()` to also emit events for `Write` on `.toml` files (not just `Create` on `.md`) +- **`internal/daemon/daemon_test.go`** — test: modify pool.toml while daemon runs, verify new expert becomes spawnable + +### Watcher Changes + +Current watcher only handles `Create` events on `.md` files. For config reload: + +- Add a new event type flag or use `WatcherEvent.Dir` to distinguish config events +- In `Run()`, also match `fsnotify.Write` events on files ending in `.toml` +- Apply `waitForStable` before emitting +- Pool directory is already watched (for postoffice). But `pool.toml` lives at `{poolDir}/pool.toml` — the watcher watches `{poolDir}/postoffice/`, not `{poolDir}/` itself. **Must add `poolDir` itself to the watcher.** + +### Daemon Changes + +In `Run()`, detect pool.toml events (check `event.Path` ends with `pool.toml`): + +```go +func (d *Daemon) reloadConfig() error { + newCfg, err := config.LoadPool(d.poolDir) + if err != nil { + d.logger.Warn("Config reload failed, keeping current config", "error", err) + return err // keep old config + } + d.mu.Lock() + defer d.mu.Unlock() + oldCfg := d.cfg + d.cfg = newCfg + // Rebuild shared expert lookup set + d.sharedSet = make(map[string]bool, len(newCfg.Shared.Include)) + for _, name := range newCfg.Shared.Include { + d.sharedSet[name] = true + } + // Diff and log changes + d.logConfigDiff(oldCfg, newCfg) + return nil +} +``` + +For expert add/remove: after reload, call `ensureDirs()` and add new inbox dirs to the watcher. For removals, stop watching but preserve directories. + +New event type: `EventConfigReloaded`. + +## Phase 4: Hardening + +- **`internal/daemon/daemon.go`** — add `formulas/` to `ensureDirs()` +- **Audit atomic writes** — verify all daemon-consumed file paths use `atomicfile.WriteFile` (taskboard already does, postoffice already does, verify nothing was missed) +- **TOML validation in reload** — already handled by `config.LoadPool` which calls `Validate()`. Parse failure = log warning + keep old config (built into Phase 3) + +## Implementation Order + +1. Phase 1 (formula parsing) — standalone, no dependencies +2. Phase 2 (instantiation tool) — depends on Phase 1 +3. Phase 3 (config hot-reload) — independent of 1+2 +4. Phase 4 (hardening) — after all above, quick sweep + +Phases 1 and 3 can be developed in parallel, but I'll do them sequentially since 1 is a prerequisite for 2 and I want to test the full flow. + +## Key Patterns to Follow + +- **Test file structure**: comment block at top documenting coverage matrix (see existing `*_test.go` files) +- **Chicago-school tests**: real objects, fakes only at I/O boundaries +- **Error wrapping**: `fmt.Errorf("context: %w", err)` +- **Atomic writes**: `atomicfile.WriteFile` for all daemon-consumed files +- **Reuse `postMessage`** (`internal/mcp/postoffice.go`) for formula instantiation +- **Reuse `splitCSV`** (`internal/mcp/architect_tools.go`) for parsing parameters +- **MCP tool registration** follows `RegisterArchitectTools` pattern +- **Event bus** emits at slog log points — add events for new behaviors + +## Verification + +After each phase: +1. `make test` — all tests pass +2. `make check` — vet + lint + test + +End-to-end verification: +1. Create a test formula TOML, load and validate it +2. Instantiate via the MCP tool, verify N messages posted with correct dependency edges +3. Modify pool.toml in a daemon integration test, verify config swap + new expert watcher +4. `make test-cover` — check coverage for new packages diff --git a/internal/daemon/daemon.go b/internal/daemon/daemon.go index c04e2d9..27602d6 100644 --- a/internal/daemon/daemon.go +++ b/internal/daemon/daemon.go @@ -147,6 +147,11 @@ func (d *Daemon) Run(ctx context.Context) error { } defer watcher.Close() + // Watch pool dir itself for pool.toml changes (config hot-reload) + if err := watcher.Add(d.poolDir); err != nil { + return fmt.Errorf("watching pool dir: %w", err) + } + // Watch postoffice postofficeDir := filepath.Join(d.poolDir, "postoffice") if err := watcher.Add(postofficeDir); err != nil { @@ -254,6 +259,14 @@ func (d *Daemon) Run(ctx context.Context) error { return nil } + // Config reload events + if event.Kind == EventKindConfig { + if filepath.Base(event.Path) == "pool.toml" { + d.reloadConfig(watcher) + } + continue + } + if event.Dir == postofficeDir { d.handlePostoffice(childCtx, event.Path) } else if event.Dir == approvalsDir { @@ -1245,6 +1258,7 @@ func (d *Daemon) ensureDirs() error { filepath.Join(d.poolDir, "postoffice"), filepath.Join(d.poolDir, "contracts"), filepath.Join(d.poolDir, "approvals"), + filepath.Join(d.poolDir, "formulas"), // Built-in roles get top-level inbox + logs directories filepath.Join(d.poolDir, "architect", "inbox"), filepath.Join(d.poolDir, "architect", "logs"), @@ -1295,6 +1309,154 @@ func (d *Daemon) ensureDirs() error { return nil } +// reloadConfig re-reads pool.toml, validates, and swaps the config under lock. +// On parse/validation failure, the old config is kept and a warning is logged. +// The watcher parameter is used to add inbox watches for new experts. +// +// The lock is held only for the config swap and diff. Filesystem I/O (dir +// creation, watcher registration) runs outside the lock to avoid blocking +// concurrent goroutines that need d.mu (registerTask, handleInbox, etc.). +func (d *Daemon) reloadConfig(watcher *Watcher) { + newCfg, err := config.LoadPool(d.poolDir) + if err != nil { + d.logger.Warn("Config reload failed, keeping current config", + "error", err, + ) + return + } + + // Phase 1: Swap config and compute diff under lock (fast, no I/O) + d.mu.Lock() + oldCfg := d.cfg + d.cfg = newCfg + + // Rebuild shared expert lookup set + d.sharedSet = make(map[string]bool, len(newCfg.Shared.Include)) + for _, name := range newCfg.Shared.Include { + d.sharedSet[name] = true + } + + // Rebuild curation scheduler with new thresholds + d.curation = newCurationScheduler(&newCfg.Curation, d.poolDir, d.logger) + + // Diff pool-scoped experts + oldExperts := make(map[string]bool, len(oldCfg.Experts)) + for name := range oldCfg.Experts { + oldExperts[name] = true + } + var addedExperts, removedExperts []string + for name := range newCfg.Experts { + if !oldExperts[name] { + addedExperts = append(addedExperts, name) + } + } + for name := range oldExperts { + if _, ok := newCfg.Experts[name]; !ok { + removedExperts = append(removedExperts, name) + } + } + + // Diff shared experts + oldShared := make(map[string]bool, len(oldCfg.Shared.Include)) + for _, name := range oldCfg.Shared.Include { + oldShared[name] = true + } + var addedShared, removedShared []string + for _, name := range newCfg.Shared.Include { + if !oldShared[name] { + addedShared = append(addedShared, name) + } + } + for _, name := range oldCfg.Shared.Include { + if !d.sharedSet[name] { + removedShared = append(removedShared, name) + } + } + + d.mu.Unlock() + + added := append(addedExperts, addedShared...) + removed := append(removedExperts, removedShared...) + + // Phase 2: Filesystem setup outside the lock (slow I/O) + var setupFailed bool + + // Pool-scoped expert dirs + watchers + for _, name := range addedExperts { + expertBase := filepath.Join(d.poolDir, "experts", name) + for _, sub := range []string{"inbox", "logs"} { + if err := os.MkdirAll(filepath.Join(expertBase, sub), 0o755); err != nil { + d.logger.Error("Failed to create directory for new expert", + "expert", name, + "dir", sub, + "error", err, + ) + setupFailed = true + } + } + inboxDir := mail.ResolveInbox(d.poolDir, name) + if err := watcher.Add(inboxDir); err != nil { + d.logger.Error("Failed to watch new expert inbox", + "expert", name, + "error", err, + ) + setupFailed = true + } + } + + // Shared expert dirs + watchers + for _, name := range addedShared { + inboxDir := mail.ResolveSharedInbox(d.poolDir, name) + if err := os.MkdirAll(inboxDir, 0o755); err != nil { + d.logger.Error("Failed to create shared inbox directory", + "expert", name, + "error", err, + ) + setupFailed = true + } + logDir := mail.ResolveSharedLogDir(d.poolDir, name) + if err := os.MkdirAll(logDir, 0o755); err != nil { + d.logger.Error("Failed to create shared log directory", + "expert", name, + "error", err, + ) + setupFailed = true + } + if err := watcher.Add(inboxDir); err != nil { + d.logger.Error("Failed to watch shared expert inbox", + "expert", name, + "error", err, + ) + setupFailed = true + } + } + + for _, name := range removed { + d.logger.Info("Expert removed from config (inbox preserved)", + "expert", name, + ) + } + + if setupFailed { + d.logger.Warn("Config reloaded with partial setup failures", + "experts_added", added, + "experts_removed", removed, + ) + return + } + + d.logger.Info("Successfully reloaded config", + "experts_added", added, + "experts_removed", removed, + ) + + d.events.emit(Event{ + Type: EventConfigReloaded, + Timestamp: time.Now(), + Data: ConfigReloadedData{ExpertsAdded: added, ExpertsRemoved: removed}, + }) +} + // sharedNamesMap returns the cached shared expert lookup set. May be nil. func (d *Daemon) sharedNamesMap() map[string]bool { return d.sharedSet diff --git a/internal/daemon/daemon_test.go b/internal/daemon/daemon_test.go index bfbd9c4..a7ebb80 100644 --- a/internal/daemon/daemon_test.go +++ b/internal/daemon/daemon_test.go @@ -37,6 +37,10 @@ // - ResearcherSpawn: message to researcher routes and spawns with configured model // - ResearcherConfigResolution: researcher uses haiku, auth uses sonnet // - ResearcherInboxDrainOnStart: pre-existing inbox message processed on startup +// +// Config hot-reload: +// - ConfigReloadAddsExpert: modify pool.toml to add expert, verify new inbox watchable +// - ConfigReloadInvalidKeepsOld: write invalid pool.toml, verify old config preserved package daemon_test import ( @@ -2487,3 +2491,122 @@ model = "haiku" waitForTaskStatus(t, poolDir, "task-predrain-res", taskboard.StatusCompleted) shutdownDaemon(t, cancel, errCh) } + +// --- Config hot-reload tests --- + +func TestDaemon_ConfigReloadAddsExpert(t *testing.T) { + poolDir := t.TempDir() + + // Start with only "auth" expert + initialToml := `[pool] +name = "reload-test" +project_dir = "` + poolDir + `" + +[experts.auth] +` + cfg := writePoolConfig(t, poolDir, initialToml) + spawner := &fakeSpawner{} + cancel, errCh := startTestDaemon(t, cfg, poolDir, spawner, + daemon.WithDrainTimeout(2*time.Second)) + + // Modify pool.toml to add a second expert + newToml := `[pool] +name = "reload-test" +project_dir = "` + poolDir + `" + +[experts.auth] + +[experts.payments] +model = "haiku" +` + if err := os.WriteFile(filepath.Join(poolDir, "pool.toml"), []byte(newToml), 0o644); err != nil { + t.Fatalf("writing updated pool.toml: %v", err) + } + + // Wait for the config reload to create the new expert inbox + paymentsInbox := filepath.Join(poolDir, "experts", "payments", "inbox") + deadline := time.Now().Add(5 * time.Second) + for time.Now().Before(deadline) { + if _, err := os.Stat(paymentsInbox); err == nil { + break + } + time.Sleep(100 * time.Millisecond) + } + + if _, err := os.Stat(paymentsInbox); os.IsNotExist(err) { + t.Error("expected payments inbox to be created after config reload") + } + + // Send a task to the new expert and verify it gets spawned + writeMessage(t, filepath.Join(poolDir, "postoffice"), "task-reload-001", "architect", "payments") + waitForTaskStatus(t, poolDir, "task-reload-001", taskboard.StatusCompleted) + + spawner.mu.Lock() + found := false + for _, call := range spawner.calls { + if call.Name == "payments" { + found = true + break + } + } + spawner.mu.Unlock() + + if !found { + t.Error("expected payments expert to be spawned after config reload") + } + + shutdownDaemon(t, cancel, errCh) +} + +func TestDaemon_ConfigReloadInvalidKeepsOld(t *testing.T) { + poolDir := t.TempDir() + + initialToml := `[pool] +name = "reload-test" +project_dir = "` + poolDir + `" + +[experts.auth] +` + cfg := writePoolConfig(t, poolDir, initialToml) + spawner := &fakeSpawner{} + + // Capture log output to verify warning + var logBuf bytes.Buffer + logger := slog.New(slog.NewTextHandler(&logBuf, &slog.HandlerOptions{Level: slog.LevelDebug})) + d := daemon.New(cfg, poolDir, logger, + daemon.WithSpawner(spawner), + daemon.WithDrainTimeout(2*time.Second)) + + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + errCh := make(chan error, 1) + go func() { errCh <- d.Run(ctx) }() + time.Sleep(500 * time.Millisecond) + + // Write invalid TOML + if err := os.WriteFile(filepath.Join(poolDir, "pool.toml"), []byte(`this is [[[not valid toml`), 0o644); err != nil { + t.Fatalf("writing invalid pool.toml: %v", err) + } + + // Wait for the reload attempt + time.Sleep(1 * time.Second) + + // Verify daemon is still running and the old config is preserved — + // send a task to auth (original expert) and verify it completes + writeMessage(t, filepath.Join(poolDir, "postoffice"), "task-still-works", "architect", "auth") + waitForTaskStatus(t, poolDir, "task-still-works", taskboard.StatusCompleted) + + // Verify warning was logged + if !strings.Contains(logBuf.String(), "Config reload failed") { + t.Error("expected 'Config reload failed' warning in logs") + } + + cancel() + select { + case err := <-errCh: + if err != nil { + t.Errorf("daemon returned error: %v", err) + } + case <-time.After(3 * time.Second): + t.Error("daemon did not shut down in time") + } +} diff --git a/internal/daemon/events.go b/internal/daemon/events.go index b69380d..6b299fe 100644 --- a/internal/daemon/events.go +++ b/internal/daemon/events.go @@ -16,6 +16,7 @@ const ( EventTaskCancelled EventType = "task.cancelled" EventTaskUnblocked EventType = "task.unblocked" EventCurationTriggered EventType = "curation.triggered" + EventConfigReloaded EventType = "config.reloaded" ) // Event is a structured daemon event emitted at state transitions. @@ -68,6 +69,11 @@ type CurationTriggeredData struct { Reason string `json:"reason"` } +type ConfigReloadedData struct { + ExpertsAdded []string `json:"experts_added,omitempty"` + ExpertsRemoved []string `json:"experts_removed,omitempty"` +} + // EventBufSize is the subscriber channel buffer capacity. Subscribers that // can't keep up will miss events once the buffer fills (non-blocking emit). const EventBufSize = 64 diff --git a/internal/daemon/watcher.go b/internal/daemon/watcher.go index 05f42d2..d0e9f7e 100644 --- a/internal/daemon/watcher.go +++ b/internal/daemon/watcher.go @@ -7,15 +7,27 @@ import ( "os" "path/filepath" "strings" + "sync" "time" "github.com/fsnotify/fsnotify" ) +// EventKind distinguishes the type of file event. +type EventKind int + +const ( + // EventKindMail is a new .md file (mail delivery). + EventKindMail EventKind = iota + // EventKindConfig is a .toml file write (config change). + EventKindConfig +) + // WatcherEvent carries a validated file event. type WatcherEvent struct { - Path string // full path to the new file - Dir string // which watched directory this came from + Path string // full path to the new file + Dir string // which watched directory this came from + Kind EventKind // mail or config } // Watcher watches directories for new .md files and emits events @@ -24,7 +36,9 @@ type Watcher struct { fsw *fsnotify.Watcher events chan WatcherEvent logger *slog.Logger - dirs map[string]bool // tracked directories for Dir resolution + + mu sync.Mutex + dirs map[string]bool // tracked directories for Dir resolution } const ( @@ -49,12 +63,15 @@ func NewWatcher(logger *slog.Logger) (*Watcher, error) { } // Add registers a directory to watch. The directory must exist. +// Safe to call concurrently with Run (dirs map is guarded by w.mu). func (w *Watcher) Add(dir string) error { absDir, err := filepath.Abs(dir) if err != nil { return err } + w.mu.Lock() w.dirs[absDir] = true + w.mu.Unlock() return w.fsw.Add(absDir) } @@ -64,8 +81,11 @@ func (w *Watcher) Events() <-chan WatcherEvent { } // Run processes raw fsnotify events until ctx is cancelled. -// It filters for Create events on .md files and applies a partial-write -// stability check before emitting on the Events channel. +// It handles two kinds of events: +// - Create events on .md files (mail delivery) → EventKindMail +// - Write events on .toml files (config changes) → EventKindConfig +// +// Both kinds apply a partial-write stability check before emitting. func (w *Watcher) Run(ctx context.Context) { defer close(w.events) @@ -79,34 +99,52 @@ func (w *Watcher) Run(ctx context.Context) { return } - if !event.Has(fsnotify.Create) { - continue - } - - if !strings.HasSuffix(event.Name, ".md") { - continue - } - - // Skip temp files from our own atomic writes base := filepath.Base(event.Name) - if strings.HasPrefix(base, ".routing-") { - continue - } - path := event.Name - if err := waitForStable(path); err != nil { - w.logger.Warn("Skipping file. Reason: not stable after polling", - "path", path, - "error", err, - ) + + // Mail events: Create on .md files + if event.Has(fsnotify.Create) && strings.HasSuffix(base, ".md") { + // Skip temp files from our own atomic writes + if strings.HasPrefix(base, ".routing-") { + continue + } + + if err := waitForStable(path); err != nil { + w.logger.Warn("Skipping file. Reason: not stable after polling", + "path", path, + "error", err, + ) + continue + } + + dir := w.resolveDir(path) + select { + case w.events <- WatcherEvent{Path: path, Dir: dir, Kind: EventKindMail}: + case <-ctx.Done(): + return + } continue } - dir := w.resolveDir(path) - select { - case w.events <- WatcherEvent{Path: path, Dir: dir}: - case <-ctx.Done(): - return + // Config events: Write or Create on .toml files. + // Write catches in-place edits. Create catches atomic + // temp-file + rename updates (which emit Create, not Write). + if (event.Has(fsnotify.Write) || event.Has(fsnotify.Create)) && strings.HasSuffix(base, ".toml") { + if err := waitForStable(path); err != nil { + w.logger.Warn("Skipping config file. Reason: not stable after polling", + "path", path, + "error", err, + ) + continue + } + + dir := w.resolveDir(path) + select { + case w.events <- WatcherEvent{Path: path, Dir: dir, Kind: EventKindConfig}: + case <-ctx.Done(): + return + } + continue } case err, ok := <-w.fsw.Errors: @@ -127,10 +165,9 @@ func (w *Watcher) Close() error { func (w *Watcher) resolveDir(path string) string { absPath, _ := filepath.Abs(path) dir := filepath.Dir(absPath) - if w.dirs[dir] { - return dir - } - // Fallback — shouldn't happen if all watched dirs are registered + w.mu.Lock() + _ = w.dirs[dir] // lookup under lock for race safety + w.mu.Unlock() return dir } diff --git a/internal/formula/formula.go b/internal/formula/formula.go new file mode 100644 index 0000000..bfd8d7b --- /dev/null +++ b/internal/formula/formula.go @@ -0,0 +1,177 @@ +// Package formula handles TOML workflow template parsing and validation. +// +// Formulas are reusable DAG templates that the architect instantiates to +// bulk-create tasks with correct dependency edges. The daemon's existing +// taskboard + EvaluateDeps handles dispatch — no LLM needed for sequencing. +package formula + +import ( + "fmt" + "os" + "path/filepath" + "strings" + + "github.com/BurntSushi/toml" +) + +// Step is a single node in a formula's dependency graph. +type Step struct { + ID string `toml:"id"` + Role string `toml:"role"` // "concierge", "architect", or expert name + Title string `toml:"title"` + Description string `toml:"description"` + DependsOn []string `toml:"depends_on"` +} + +// Formula is a TOML workflow template with a description and step DAG. +type Formula struct { + Description string `toml:"description"` + Steps []Step `toml:"steps"` +} + +// Load reads and validates a single formula TOML file. +func Load(path string) (*Formula, error) { + data, err := os.ReadFile(path) + if err != nil { + return nil, fmt.Errorf("reading formula %s: %w", path, err) + } + + var f Formula + if err := toml.Unmarshal(data, &f); err != nil { + return nil, fmt.Errorf("parsing formula %s: %w", path, err) + } + + if err := Validate(&f); err != nil { + return nil, fmt.Errorf("validating formula %s: %w", path, err) + } + + return &f, nil +} + +// LoadAll scans a directory for *.toml files and returns all valid formulas +// keyed by filename (without the .toml extension). +func LoadAll(dir string) (map[string]*Formula, error) { + entries, err := os.ReadDir(dir) + if err != nil { + if os.IsNotExist(err) { + return nil, nil + } + return nil, fmt.Errorf("reading formulas directory %s: %w", dir, err) + } + + formulas := make(map[string]*Formula) + for _, entry := range entries { + if entry.IsDir() || !strings.HasSuffix(entry.Name(), ".toml") { + continue + } + + name := strings.TrimSuffix(entry.Name(), ".toml") + f, err := Load(filepath.Join(dir, entry.Name())) + if err != nil { + return nil, err + } + formulas[name] = f + } + + return formulas, nil +} + +// Validate checks a formula for structural correctness: +// - At least one step +// - Non-empty id, role, title on each step +// - No duplicate step IDs +// - All depends_on refs point to valid step IDs +// - No cycles in the dependency graph +func Validate(f *Formula) error { + if f == nil { + return fmt.Errorf("formula is nil") + } + if len(f.Steps) == 0 { + return fmt.Errorf("formula has no steps") + } + + ids := make(map[string]bool, len(f.Steps)) + for _, s := range f.Steps { + if s.ID == "" { + return fmt.Errorf("step has empty id") + } + if s.ID != filepath.Base(s.ID) || s.ID == "." || s.ID == ".." { + return fmt.Errorf("step id %q is not filename-safe", s.ID) + } + if s.Role == "" { + return fmt.Errorf("step %q has empty role", s.ID) + } + if s.Title == "" { + return fmt.Errorf("step %q has empty title", s.ID) + } + if ids[s.ID] { + return fmt.Errorf("duplicate step id %q", s.ID) + } + ids[s.ID] = true + } + + // Validate depends_on references + for _, s := range f.Steps { + for _, dep := range s.DependsOn { + if !ids[dep] { + return fmt.Errorf("step %q depends on unknown step %q", s.ID, dep) + } + } + } + + // Cycle detection via Kahn's algorithm + if cycle := detectCycle(f.Steps); len(cycle) > 0 { + return fmt.Errorf("dependency cycle detected involving steps: %v", cycle) + } + + return nil +} + +// detectCycle uses Kahn's algorithm to find cycles in the step DAG. +// Returns the IDs of steps involved in cycles, or nil if acyclic. +func detectCycle(steps []Step) []string { + inDegree := make(map[string]int, len(steps)) + dependents := make(map[string][]string) // dep → steps that depend on it + + for _, s := range steps { + inDegree[s.ID] = 0 + } + for _, s := range steps { + for _, dep := range s.DependsOn { + inDegree[s.ID]++ + dependents[dep] = append(dependents[dep], s.ID) + } + } + + var queue []string + for id, deg := range inDegree { + if deg == 0 { + queue = append(queue, id) + } + } + + removed := 0 + for len(queue) > 0 { + id := queue[0] + queue = queue[1:] + removed++ + for _, depID := range dependents[id] { + inDegree[depID]-- + if inDegree[depID] == 0 { + queue = append(queue, depID) + } + } + } + + if removed == len(inDegree) { + return nil + } + + var cycleMembers []string + for id, deg := range inDegree { + if deg > 0 { + cycleMembers = append(cycleMembers, id) + } + } + return cycleMembers +} diff --git a/internal/formula/formula_test.go b/internal/formula/formula_test.go new file mode 100644 index 0000000..0483268 --- /dev/null +++ b/internal/formula/formula_test.go @@ -0,0 +1,314 @@ +// Test coverage matrix for formula package: +// +// Load: +// - [x] Valid formula file parses and validates +// - [x] Missing file returns error +// - [x] Invalid TOML returns parse error +// - [x] Valid TOML but invalid formula returns validation error +// +// LoadAll: +// - [x] Empty directory returns empty map +// - [x] Missing directory returns nil (not error) +// - [x] Multiple valid formulas keyed by name +// - [x] Non-TOML files ignored +// - [x] Invalid formula in dir returns error +// +// Validate: +// - [x] Happy path — linear chain +// - [x] Happy path — diamond DAG +// - [x] Nil formula +// - [x] Empty steps +// - [x] Empty step ID +// - [x] Unsafe step ID (path separator) +// - [x] Empty step role +// - [x] Empty step title +// - [x] Duplicate step IDs +// - [x] depends_on references unknown step +// - [x] Simple cycle (A→B→A) +// - [x] Self-cycle (A→A) +// - [x] Three-node cycle +package formula + +import ( + "os" + "path/filepath" + "strings" + "testing" +) + +func TestLoad_Valid(t *testing.T) { + dir := t.TempDir() + path := filepath.Join(dir, "test.toml") + if err := os.WriteFile(path, []byte(` +description = "test formula" + +[[steps]] +id = "first" +role = "concierge" +title = "First step" +description = "Do the first thing" + +[[steps]] +id = "second" +role = "architect" +title = "Second step" +description = "Do the second thing" +depends_on = ["first"] +`), 0o644); err != nil { + t.Fatal(err) + } + + f, err := Load(path) + if err != nil { + t.Fatalf("Load() error: %v", err) + } + if f.Description != "test formula" { + t.Errorf("Description = %q, want %q", f.Description, "test formula") + } + if len(f.Steps) != 2 { + t.Fatalf("len(Steps) = %d, want 2", len(f.Steps)) + } + if f.Steps[1].DependsOn[0] != "first" { + t.Errorf("Steps[1].DependsOn = %v, want [first]", f.Steps[1].DependsOn) + } +} + +func TestLoad_MissingFile(t *testing.T) { + _, err := Load("/nonexistent/formula.toml") + if err == nil { + t.Fatal("Load() expected error for missing file") + } +} + +func TestLoad_InvalidTOML(t *testing.T) { + dir := t.TempDir() + path := filepath.Join(dir, "bad.toml") + if err := os.WriteFile(path, []byte(`this is not valid toml [[[`), 0o644); err != nil { + t.Fatal(err) + } + + _, err := Load(path) + if err == nil { + t.Fatal("Load() expected error for invalid TOML") + } + if !strings.Contains(err.Error(), "parsing") { + t.Errorf("error should mention parsing: %v", err) + } +} + +func TestLoad_ValidTOMLInvalidFormula(t *testing.T) { + dir := t.TempDir() + path := filepath.Join(dir, "empty.toml") + if err := os.WriteFile(path, []byte(`description = "no steps"`), 0o644); err != nil { + t.Fatal(err) + } + + _, err := Load(path) + if err == nil { + t.Fatal("Load() expected error for formula with no steps") + } + if !strings.Contains(err.Error(), "no steps") { + t.Errorf("error should mention no steps: %v", err) + } +} + +func TestLoadAll_EmptyDir(t *testing.T) { + dir := t.TempDir() + formulas, err := LoadAll(dir) + if err != nil { + t.Fatalf("LoadAll() error: %v", err) + } + if len(formulas) != 0 { + t.Errorf("len(formulas) = %d, want 0", len(formulas)) + } +} + +func TestLoadAll_MissingDir(t *testing.T) { + formulas, err := LoadAll("/nonexistent/dir") + if err != nil { + t.Fatalf("LoadAll() error: %v", err) + } + if formulas != nil { + t.Errorf("formulas = %v, want nil", formulas) + } +} + +func TestLoadAll_MultipleFormulas(t *testing.T) { + dir := t.TempDir() + + formulaTOML := ` +description = "test" +[[steps]] +id = "a" +role = "concierge" +title = "Step A" +` + + for _, name := range []string{"alpha.toml", "beta.toml"} { + if err := os.WriteFile(filepath.Join(dir, name), []byte(formulaTOML), 0o644); err != nil { + t.Fatal(err) + } + } + if err := os.WriteFile(filepath.Join(dir, "readme.md"), []byte("not a formula"), 0o644); err != nil { + t.Fatal(err) + } + + formulas, err := LoadAll(dir) + if err != nil { + t.Fatalf("LoadAll() error: %v", err) + } + if len(formulas) != 2 { + t.Fatalf("len(formulas) = %d, want 2", len(formulas)) + } + if formulas["alpha"] == nil || formulas["beta"] == nil { + t.Errorf("expected formulas keyed by alpha and beta, got keys: %v", mapKeys(formulas)) + } +} + +func TestLoadAll_InvalidFormulaReturnsError(t *testing.T) { + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "bad.toml"), []byte(`description = "no steps"`), 0o644); err != nil { + t.Fatal(err) + } + + _, err := LoadAll(dir) + if err == nil { + t.Fatal("LoadAll() expected error for invalid formula") + } +} + +func TestValidate_LinearChain(t *testing.T) { + f := &Formula{ + Steps: []Step{ + {ID: "a", Role: "concierge", Title: "A"}, + {ID: "b", Role: "architect", Title: "B", DependsOn: []string{"a"}}, + {ID: "c", Role: "experts", Title: "C", DependsOn: []string{"b"}}, + }, + } + if err := Validate(f); err != nil { + t.Fatalf("Validate() error: %v", err) + } +} + +func TestValidate_DiamondDAG(t *testing.T) { + f := &Formula{ + Steps: []Step{ + {ID: "root", Role: "concierge", Title: "Root"}, + {ID: "left", Role: "architect", Title: "Left", DependsOn: []string{"root"}}, + {ID: "right", Role: "architect", Title: "Right", DependsOn: []string{"root"}}, + {ID: "join", Role: "concierge", Title: "Join", DependsOn: []string{"left", "right"}}, + }, + } + if err := Validate(f); err != nil { + t.Fatalf("Validate() error: %v", err) + } +} + +func TestValidate_NilFormula(t *testing.T) { + err := Validate(nil) + if err == nil || !strings.Contains(err.Error(), "nil") { + t.Fatalf("expected 'nil' error, got: %v", err) + } +} + +func TestValidate_UnsafeStepID(t *testing.T) { + f := &Formula{Steps: []Step{{ID: "foo/bar", Role: "x", Title: "y"}}} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "filename-safe") { + t.Fatalf("expected 'filename-safe' error, got: %v", err) + } +} + +func TestValidate_EmptySteps(t *testing.T) { + f := &Formula{Steps: nil} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "no steps") { + t.Fatalf("expected 'no steps' error, got: %v", err) + } +} + +func TestValidate_EmptyStepID(t *testing.T) { + f := &Formula{Steps: []Step{{Role: "x", Title: "y"}}} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "empty id") { + t.Fatalf("expected 'empty id' error, got: %v", err) + } +} + +func TestValidate_EmptyStepRole(t *testing.T) { + f := &Formula{Steps: []Step{{ID: "a", Title: "y"}}} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "empty role") { + t.Fatalf("expected 'empty role' error, got: %v", err) + } +} + +func TestValidate_EmptyStepTitle(t *testing.T) { + f := &Formula{Steps: []Step{{ID: "a", Role: "x"}}} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "empty title") { + t.Fatalf("expected 'empty title' error, got: %v", err) + } +} + +func TestValidate_DuplicateIDs(t *testing.T) { + f := &Formula{Steps: []Step{ + {ID: "a", Role: "x", Title: "y"}, + {ID: "a", Role: "x", Title: "z"}, + }} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "duplicate") { + t.Fatalf("expected 'duplicate' error, got: %v", err) + } +} + +func TestValidate_UnknownDependency(t *testing.T) { + f := &Formula{Steps: []Step{ + {ID: "a", Role: "x", Title: "y", DependsOn: []string{"nonexistent"}}, + }} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "unknown step") { + t.Fatalf("expected 'unknown step' error, got: %v", err) + } +} + +func TestValidate_SimpleCycle(t *testing.T) { + f := &Formula{Steps: []Step{ + {ID: "a", Role: "x", Title: "y", DependsOn: []string{"b"}}, + {ID: "b", Role: "x", Title: "z", DependsOn: []string{"a"}}, + }} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "cycle") { + t.Fatalf("expected 'cycle' error, got: %v", err) + } +} + +func TestValidate_SelfCycle(t *testing.T) { + f := &Formula{Steps: []Step{ + {ID: "a", Role: "x", Title: "y", DependsOn: []string{"a"}}, + }} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "cycle") { + t.Fatalf("expected 'cycle' error, got: %v", err) + } +} + +func TestValidate_ThreeNodeCycle(t *testing.T) { + f := &Formula{Steps: []Step{ + {ID: "a", Role: "x", Title: "A", DependsOn: []string{"c"}}, + {ID: "b", Role: "x", Title: "B", DependsOn: []string{"a"}}, + {ID: "c", Role: "x", Title: "C", DependsOn: []string{"b"}}, + }} + err := Validate(f) + if err == nil || !strings.Contains(err.Error(), "cycle") { + t.Fatalf("expected 'cycle' error, got: %v", err) + } +} + +func mapKeys(m map[string]*Formula) []string { + keys := make([]string, 0, len(m)) + for k := range m { + keys = append(keys, k) + } + return keys +} diff --git a/internal/mcp/architect_tools.go b/internal/mcp/architect_tools.go index 765c6bf..94d4e78 100644 --- a/internal/mcp/architect_tools.go +++ b/internal/mcp/architect_tools.go @@ -2,6 +2,7 @@ package mcp import ( "context" + "encoding/json" "fmt" "os" "path/filepath" @@ -13,6 +14,7 @@ import ( "github.com/cameronsjo/agent-pool/internal/approval" "github.com/cameronsjo/agent-pool/internal/contract" + "github.com/cameronsjo/agent-pool/internal/formula" "github.com/cameronsjo/agent-pool/internal/mail" ) @@ -68,6 +70,17 @@ func RegisterArchitectTools(srv *server.MCPServer, cfg *ServerConfig) { ), handleAmendContract(cfg, store), ) + + srv.AddTool( + mcp.NewTool("instantiate_formula", + mcp.WithDescription("Instantiate a workflow formula, bulk-creating tasks with dependency edges. Posts all tasks to the postoffice."), + mcp.WithString("formula", mcp.Required(), mcp.Description("Formula name (filename without .toml from formulas/ directory)")), + mcp.WithString("prefix", mcp.Required(), mcp.Description("ID prefix for generated tasks (e.g., 'feat-auth' produces 'feat-auth-gather')")), + mcp.WithString("overrides", mcp.Description("JSON object mapping step ID to custom body text (optional)")), + mcp.WithString("experts", mcp.Description("JSON object mapping step ID to specific expert name (required for steps with role='experts')")), + ), + handleInstantiateFormula(cfg), + ) } func handleDefineContract(store *contract.Store) server.ToolHandlerFunc { @@ -284,6 +297,134 @@ func handleAmendContract(cfg *ServerConfig, store *contract.Store) server.ToolHa } } +func handleInstantiateFormula(cfg *ServerConfig) server.ToolHandlerFunc { + return func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) { + formulaName := request.GetString("formula", "") + prefix := request.GetString("prefix", "") + overridesStr := request.GetString("overrides", "") + expertsStr := request.GetString("experts", "") + + if formulaName == "" { + return mcp.NewToolResultError("formula parameter is required"), nil + } + // Guard against path traversal on formula name + if formulaName != filepath.Base(formulaName) || formulaName == "." || formulaName == ".." { + return mcp.NewToolResultError(fmt.Sprintf("invalid formula name %q: must be a simple filename", formulaName)), nil + } + if prefix == "" { + return mcp.NewToolResultError("prefix parameter is required"), nil + } + if prefix != filepath.Base(prefix) || prefix == "." || prefix == ".." { + return mcp.NewToolResultError(fmt.Sprintf("invalid prefix %q: must be a simple filename-safe string", prefix)), nil + } + + // Load formula + formulaPath := filepath.Join(cfg.PoolDir, "formulas", formulaName+".toml") + f, err := formula.Load(formulaPath) + if err != nil { + return mcp.NewToolResultError(fmt.Sprintf("loading formula: %v", err)), nil + } + + // Parse optional overrides + overrides := make(map[string]string) + if overridesStr != "" { + if err := json.Unmarshal([]byte(overridesStr), &overrides); err != nil { + return mcp.NewToolResultError(fmt.Sprintf("parsing overrides JSON: %v", err)), nil + } + } + + // Parse optional expert assignments (only applied to role="experts" steps) + experts := make(map[string]string) + if expertsStr != "" { + if err := json.Unmarshal([]byte(expertsStr), &experts); err != nil { + return mcp.NewToolResultError(fmt.Sprintf("parsing experts JSON: %v", err)), nil + } + } + + // Validate: steps with role="experts" must have an expert assignment + for _, step := range f.Steps { + if step.Role == "experts" { + if _, ok := experts[step.ID]; !ok { + return mcp.NewToolResultError(fmt.Sprintf( + "step %q has role 'experts' but no expert assigned in experts map", step.ID)), nil + } + } + } + + // Phase 1: Compose all messages up front (preflight before any writes) + var taskIDs []string + var messages []*mail.Message + now := time.Now().UTC() + + for _, step := range f.Steps { + taskID := prefix + "-" + step.ID + taskIDs = append(taskIDs, taskID) + + // Resolve recipient: experts map only applies to role="experts" steps + to := step.Role + if step.Role == "experts" { + to = experts[step.ID] // guaranteed present by validation above + } + + // Resolve body + body := fmt.Sprintf("# %s\n\n%s", step.Title, step.Description) + if override, ok := overrides[step.ID]; ok { + body = fmt.Sprintf("# %s\n\n%s", step.Title, override) + } + + // Apply prefix to dependency IDs + var deps []string + for _, dep := range step.DependsOn { + deps = append(deps, prefix+"-"+dep) + } + + messages = append(messages, &mail.Message{ + ID: taskID, + From: "architect", + To: to, + Type: mail.TypeTask, + Priority: mail.PriorityNormal, + DependsOn: deps, + Timestamp: now, + Body: body, + }) + } + + // Approval gate: block on human approval if required + if shouldRequireApproval(cfg.ApprovalMode) { + gate := approval.DefaultGate(cfg.PoolDir) + gate.Logger = cfg.Logger + summary := fmt.Sprintf("Formula: %s\nPrefix: %s\nTasks: %s", + formulaName, prefix, strings.Join(taskIDs, ", ")) + if gateErr := gate.Request(ctx, prefix, summary); gateErr != nil { + return mcp.NewToolResultError(fmt.Sprintf("approval gate: %v", gateErr)), nil + } + } + + // Phase 2: Post all messages. Best-effort cleanup on failure: + // the daemon routes postoffice files immediately via fsnotify, + // so earlier tasks may already be in inboxes by the time a later + // post fails. The cleanup removes unrouted postoffice files but + // cannot recall already-routed messages. The taskboard's dependency + // evaluation prevents premature execution of downstream steps. + var posted []string + for _, msg := range messages { + if err := postMessage(cfg.PoolDir, msg); err != nil { + postofficeDir := filepath.Join(cfg.PoolDir, "postoffice") + for _, id := range posted { + os.Remove(filepath.Join(postofficeDir, id+".md")) + } + return mcp.NewToolResultError(fmt.Sprintf("posting task %s: %v (cleaned up %d postoffice files, some may have been routed)", msg.ID, err, len(posted))), nil + } + posted = append(posted, msg.ID) + } + + return mcp.NewToolResultText(fmt.Sprintf( + "formula %q instantiated: %d tasks created [%s]", + formulaName, len(taskIDs), strings.Join(taskIDs, ", "))), nil + } +} + // shouldRequireApproval returns whether the given approval mode requires // human approval before task dispatch. func shouldRequireApproval(mode string) bool { diff --git a/internal/mcp/architect_tools_test.go b/internal/mcp/architect_tools_test.go index 3ab41da..6cce717 100644 --- a/internal/mcp/architect_tools_test.go +++ b/internal/mcp/architect_tools_test.go @@ -1,27 +1,34 @@ // Architect tools coverage matrix: // // RegisterArchitectTools (Classification: INTEGRATION) -// [x] Happy: all 4 architect tools + 6 expert tools registered (TestArchitectTools_Registration) +// [x] Happy: all 5 architect tools + 6 expert tools registered (TestArchitectTools_Registration) // -//define_contract (Classification: FILESYSTEM I/O) +// define_contract (Classification: FILESYSTEM I/O) // [x] Happy: creates contract file and index (TestDefineContract_Happy) // [x] Error: missing params (TestDefineContract_MissingParams) // [x] Error: fewer than 2 between parties (TestDefineContract_TooFewBetween) // -//send_task (Classification: FILESYSTEM I/O) +// send_task (Classification: FILESYSTEM I/O) // [x] Happy: message appears in postoffice (TestSendTask_Happy) // [x] Error: missing params (TestSendTask_MissingParams) // [x] Error: path traversal ID (TestSendTask_PathTraversal) // -//verify_result (Classification: FILESYSTEM I/O) +// verify_result (Classification: FILESYSTEM I/O) // [x] Happy: verification log created (TestVerifyResult_Happy) // [x] Error: invalid status (TestVerifyResult_InvalidStatus) // [x] Error: contract not found (TestVerifyResult_ContractNotFound) // -//amend_contract (Classification: FILESYSTEM I/O) +// amend_contract (Classification: FILESYSTEM I/O) // [x] Happy: version incremented + notify messages (TestAmendContract_Happy) // [x] Error: contract not found (TestAmendContract_NotFound) // +// instantiate_formula (Classification: FILESYSTEM I/O) +// [x] Happy: all tasks created with correct deps (TestInstantiateFormula_Happy) +// [x] Happy: overrides and expert assignments applied (TestInstantiateFormula_Overrides) +// [x] Error: formula not found (TestInstantiateFormula_NotFound) +// [x] Error: missing expert for role=experts step (TestInstantiateFormula_MissingExpert) +// [x] Error: invalid prefix (TestInstantiateFormula_InvalidPrefix) +// // Approval gate integration (Classification: FILESYSTEM I/O + CONCURRENCY) // [x] Happy: none mode bypasses approval (TestSendTask_ApprovalNoneMode) // [x] Happy: decomposition mode blocks until approved (TestSendTask_ApprovalRequired) @@ -72,6 +79,7 @@ func TestArchitectTools_Registration(t *testing.T) { // Architect tools "define_contract", "send_task", "verify_result", "amend_contract", + "instantiate_formula", // Expert tools (inherited) "read_state", "update_state", "append_error", "send_response", @@ -341,6 +349,197 @@ func TestAmendContract_NotFound(t *testing.T) { } } +// --- instantiate_formula tests --- + +func TestInstantiateFormula_Happy(t *testing.T) { + poolDir := setupArchitectPool(t) + srv := buildArchitectTestServer(t, poolDir) + + writeFormula(t, poolDir, "feature-impl", ` +description = "Standard feature flow" + +[[steps]] +id = "gather" +role = "concierge" +title = "Gather input" +description = "Ask experts for input" + +[[steps]] +id = "plan" +role = "concierge" +title = "Build plan" +description = "Synthesize input" +depends_on = ["gather"] + +[[steps]] +id = "implement" +role = "experts" +title = "Implement" +description = "Build the feature" +depends_on = ["plan"] + +[[steps]] +id = "verify" +role = "architect" +title = "Verify" +description = "Check contracts" +depends_on = ["implement"] +`) + + result := callTool(t, srv, "instantiate_formula", map[string]any{ + "formula": "feature-impl", + "prefix": "feat-auth", + "experts": `{"implement": "auth-expert"}`, + }) + + text := resultText(t, result) + if !strings.Contains(text, "4 tasks created") { + t.Errorf("expected '4 tasks created', got %q", text) + } + + // Verify all 4 messages in postoffice + expectedTasks := []struct { + id string + to string + deps []string + }{ + {"feat-auth-gather", "concierge", nil}, + {"feat-auth-plan", "concierge", []string{"feat-auth-gather"}}, + {"feat-auth-implement", "auth-expert", []string{"feat-auth-plan"}}, + {"feat-auth-verify", "architect", []string{"feat-auth-implement"}}, + } + + for _, exp := range expectedTasks { + path := filepath.Join(poolDir, "postoffice", exp.id+".md") + msg, err := mail.ParseFile(path) + if err != nil { + t.Fatalf("parsing %s: %v", exp.id, err) + } + if msg.To != exp.to { + t.Errorf("task %s: to = %q, want %q", exp.id, msg.To, exp.to) + } + if msg.From != "architect" { + t.Errorf("task %s: from = %q, want architect", exp.id, msg.From) + } + if msg.Type != mail.TypeTask { + t.Errorf("task %s: type = %q, want task", exp.id, msg.Type) + } + if len(msg.DependsOn) != len(exp.deps) { + t.Errorf("task %s: deps = %v, want %v", exp.id, msg.DependsOn, exp.deps) + } else { + for i, dep := range exp.deps { + if msg.DependsOn[i] != dep { + t.Errorf("task %s: dep[%d] = %q, want %q", exp.id, i, msg.DependsOn[i], dep) + } + } + } + } +} + +func TestInstantiateFormula_Overrides(t *testing.T) { + poolDir := setupArchitectPool(t) + srv := buildArchitectTestServer(t, poolDir) + + writeFormula(t, poolDir, "simple", ` +description = "Simple two-step" + +[[steps]] +id = "do" +role = "concierge" +title = "Do the thing" +description = "Default body" + +[[steps]] +id = "check" +role = "architect" +title = "Check result" +description = "Default check" +depends_on = ["do"] +`) + + result := callTool(t, srv, "instantiate_formula", map[string]any{ + "formula": "simple", + "prefix": "test-run", + "overrides": `{"do": "Custom body for this specific task"}`, + }) + + text := resultText(t, result) + if !strings.Contains(text, "2 tasks created") { + t.Errorf("expected '2 tasks created', got %q", text) + } + + // Verify override was applied + msg, err := mail.ParseFile(filepath.Join(poolDir, "postoffice", "test-run-do.md")) + if err != nil { + t.Fatalf("parsing test-run-do: %v", err) + } + if !strings.Contains(msg.Body, "Custom body") { + t.Errorf("expected custom body, got %q", msg.Body) + } + + // Verify non-overridden step has default body + msg2, err := mail.ParseFile(filepath.Join(poolDir, "postoffice", "test-run-check.md")) + if err != nil { + t.Fatalf("parsing test-run-check: %v", err) + } + if !strings.Contains(msg2.Body, "Default check") { + t.Errorf("expected default body, got %q", msg2.Body) + } +} + +func TestInstantiateFormula_NotFound(t *testing.T) { + poolDir := setupArchitectPool(t) + srv := buildArchitectTestServer(t, poolDir) + + result := callTool(t, srv, "instantiate_formula", map[string]any{ + "formula": "nonexistent", + "prefix": "test", + }) + text := resultText(t, result) + if !strings.Contains(text, "loading formula") { + t.Errorf("expected loading error, got %q", text) + } +} + +func TestInstantiateFormula_MissingExpert(t *testing.T) { + poolDir := setupArchitectPool(t) + srv := buildArchitectTestServer(t, poolDir) + + writeFormula(t, poolDir, "needs-expert", ` +description = "Needs expert assignment" + +[[steps]] +id = "work" +role = "experts" +title = "Do work" +description = "Needs an expert" +`) + + result := callTool(t, srv, "instantiate_formula", map[string]any{ + "formula": "needs-expert", + "prefix": "test", + // No experts map provided + }) + text := resultText(t, result) + if !strings.Contains(text, "no expert assigned") { + t.Errorf("expected 'no expert assigned' error, got %q", text) + } +} + +func TestInstantiateFormula_InvalidPrefix(t *testing.T) { + poolDir := setupArchitectPool(t) + srv := buildArchitectTestServer(t, poolDir) + + result := callTool(t, srv, "instantiate_formula", map[string]any{ + "formula": "test", + "prefix": "../escape", + }) + text := resultText(t, result) + if !strings.Contains(text, "invalid prefix") { + t.Errorf("expected 'invalid prefix' error, got %q", text) + } +} + // --- Approval gate tests --- func TestSendTask_ApprovalNoneMode(t *testing.T) { diff --git a/internal/mcp/testhelp_test.go b/internal/mcp/testhelp_test.go index 6e51561..010cbce 100644 --- a/internal/mcp/testhelp_test.go +++ b/internal/mcp/testhelp_test.go @@ -7,6 +7,7 @@ // callToolWithContext — invoke a tool via JSON-RPC (custom context) // resultText — extract text content from tool result // listToolNames — list registered tool names +// writeFormula — write a TOML formula to pool's formulas/ directory // mustJSON — marshal to JSON or fail package mcp_test @@ -242,6 +243,18 @@ func buildSharedMCPTestServer(t *testing.T, poolDir, expertName, overlayDir stri return srv } +// writeFormula writes a TOML formula file into the pool's formulas/ directory. +func writeFormula(t *testing.T, poolDir, name, content string) { + t.Helper() + dir := filepath.Join(poolDir, "formulas") + if err := os.MkdirAll(dir, 0o755); err != nil { + t.Fatalf("creating formulas dir: %v", err) + } + if err := os.WriteFile(filepath.Join(dir, name+".toml"), []byte(content), 0o644); err != nil { + t.Fatalf("writing formula: %v", err) + } +} + // mustJSON marshals v to JSON or fails the test. func mustJSON(t *testing.T, v any) json.RawMessage { t.Helper()