Issue 5: Executor framework + fresh per-task agent

# Issue 5: Executor framework + fresh per-task agent

**Epic:** Task-queue orchestrator runner · **Depends on:** #621, #622, #623 · **PR:** #609

## Why

This is the heart of the new runner. Drain the queue, and run a fresh agent per
task with its own model, tools, and permissions, reusing the existing `runAgent`
machinery rather than reimplementing the SDK loop.

## Scope / deliverable

1. **Implement `runOrchestrator(session, programConfig, boot)`.**
   - **Initialize the agent once** via `initializeAgent` (`agent-interface.ts:637`).
     It re-fetches the skill menu and creates MCP servers, so it runs once, not per
     task. Pass the in-memory `QueueStore` into the options so the orchestrator
     tools register in `wizard-tools` (#623).
   - **Seeding lands in #626.** The orchestrator agent that inspects the repo and
     calls `enqueue_task` is #626's deliverable. #624 builds and tests the drain loop
     against a hand-seeded queue.
   - **Task resolver.** The executor owns a small resolver interface that maps a
     task's `type` to what `runAgent` needs: model, allowedTools, disallowedTools,
     and prompt text with the upstream handoffs appended. #624 ships a trivial inline
     resolver for its test. The real markdown-backed resolver is #625.
   - **Drain loop.** Repeatedly pick `QueueStore.nextRunnable()`, the tasks whose
     deps are satisfied, honoring a concurrency cap defaulted to 1. For each task,
     resolve it, call the existing `runAgent` (`:773`) with the per-task model and
     tools, mark `in_progress`, run, then `done` or `failed`.
   - **Outcome.** The agent reports `done` or `failed` through `complete_task`,
     against the plain-text success criteria in its prompt (#625). The executor reads
     that status. A `failed` task retries while attempts remain, and the run
     continues. A global failure (401, `[ABORT]`) aborts the whole run, via the
     `mapAgentError` classification from #621.
   - **Force-exit a runaway agent.** Give each task agent a tool-call budget. When
     it is exceeded, `canUseTool` (`agent-interface.ts:483`) denies every further
     tool except `complete_task` and returns a short "tool budget reached, wrap up
     and call complete_task now" message, so the agent has to finalize. This is
     more reliable than waiting for the model to stop on its own. It is the
     `canUseTool` version of the trick hogai uses by stripping tools at its loop
     cap (`ee/hogai/core/agent_modes/executables.py:307`).
   - **The loop ends** when all tasks are terminal, or the executor-loop iteration
     cap is hit, the final termination guard.
2. **Refactors to `agent-interface.ts`.**
   - **Per-task `model`.** Make `model` an explicit `AgentRunConfig` field. It is
     hardcoded by label today (`:706–709`). Default to the current logic, and let
     the executor override it per task. `runAgent` already reads `agentConfig.model`
     at `:903`.
   - **Run-end remark.** The remark and the `additionalFeatureQueue` drain fire
     once at run end, not on every task's `Stop`. Add a `requestRemark` flag so
     only the executor triggers it (`createStopHook` at `:285`, wired `:1043`).
3. **The TUI renders the queue.** The in-memory queue (#622) is the task state the
   UI reads, and a new screen renders the tasks and their statuses. When the
   executor mutates the queue, the UI re-renders, through the existing reactive
   store and `getUI()` pattern, so business logic never touches the store directly.
   `getUI().pushStatus` and `spinner.message` carry the active task's sub-status.
   `TaskStreamPush` already subscribes to the store, so web sync comes along.

## Key files

- `src/lib/programs/orchestrator/orchestrator-runner.ts` (the executor and the
  resolver interface; the real markdown resolver is #625)
- `src/lib/agent/agent-interface.ts` (`model` field, run-end remark flag)
- `src/lib/agent/agent-runner.ts` (wire the fork to the real `runOrchestrator`)
- `src/ui/tui/` (a screen that renders the queue)

## Acceptance criteria

- [ ] A hand-seeded 2-task queue runs as two separate `query()` calls with distinct
      per-task models.
- [ ] The TUI renders the queue, and tasks visibly move pending to in_progress to
      done.
- [ ] A `complete_task` failure marks the task `failed`, retries, and continues the
      run. A 401 or `[ABORT]` aborts the whole run.
- [ ] A task agent that blows its tool-call budget is force-exited: further tools
      are denied except `complete_task`, and it finalizes.
- [ ] The linear path is unchanged with the flag off.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Issue 5: Executor framework + fresh per-task agent #624

Issue 5: Executor framework + fresh per-task agent

Why

Scope / deliverable

Key files

Acceptance criteria

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Issue 5: Executor framework + fresh per-task agent #624

Description

Issue 5: Executor framework + fresh per-task agent

Why

Scope / deliverable

Key files

Acceptance criteria

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions