Issue 5: Executor framework + fresh per-task agent #624

@gewenyu99

Epic: Task-queue orchestrator runner · Depends on: #621, #622, #623 · PR: #609

Why

This is the heart of the new runner. Drain the queue, and run a fresh agent per
task with its own model, tools, and permissions, reusing the existing runAgent
machinery rather than reimplementing the SDK loop.

Scope / deliverable

  1. Implement runOrchestrator(session, programConfig, boot).
    • Initialize the agent once via initializeAgent (agent-interface.ts:637).
      It re-fetches the skill menu and creates MCP servers, so it runs once, not per
      task. Pass the in-memory QueueStore into the options so the orchestrator
      tools register in wizard-tools (#623).
    • Seeding lands in #626. The orchestrator agent that inspects the repo and
      calls enqueue_task is #626's deliverable. This issue (#624) builds and tests
      the drain loop against a hand-seeded queue.
    • Task resolver. The executor owns a small resolver interface that maps a
      task's type to what runAgent needs: model, allowedTools, disallowedTools,
      and prompt text with the upstream handoffs appended. This issue (#624) ships
      a trivial inline resolver for its test. The real markdown-backed resolver
      is #625.
    • Drain loop. Repeatedly pick QueueStore.nextRunnable(), the tasks whose
      deps are satisfied, honoring a concurrency cap that defaults to 1. For each
      task, resolve it, mark it in_progress, call the existing runAgent (:773)
      with the per-task model and tools, then mark it done or failed.
    • Outcome. The agent reports done or failed through complete_task,
      judged against the plain-text success criteria in its prompt (#625). The
      executor reads that status. A failed task retries while attempts remain, and
      the run continues. A global failure (401, [ABORT]) aborts the whole run, via
      the mapAgentError classification from #621.
    • Force-exit a runaway agent. Give each task agent a tool-call budget. When
      it is exceeded, canUseTool (agent-interface.ts:483) denies every further
      tool except complete_task and returns a short "tool budget reached, wrap up
      and call complete_task now" message, so the agent has to finalize. This is
      more reliable than waiting for the model to stop on its own. It is the
      canUseTool version of the trick hogai uses by stripping tools at its loop
      cap (ee/hogai/core/agent_modes/executables.py:307).
    • The loop ends when all tasks are terminal, or when the executor-loop
      iteration cap is hit. The cap is the final termination guard.
  2. Refactors to agent-interface.ts.
    • Per-task model. Make model an explicit AgentRunConfig field. It is
      hardcoded by label today (:706–709). Default to the current logic, and let
      the executor override it per task. runAgent already reads agentConfig.model
      at :903.
    • Run-end remark. The remark and the additionalFeatureQueue drain fire
      once at run end, not on every task's Stop. Add a requestRemark flag so
      only the executor triggers it (createStopHook at :285, wired :1043).
  3. The TUI renders the queue. The in-memory queue (#622) is the task state the
    UI reads, and a new screen renders the tasks and their statuses. When the
    executor mutates the queue, the UI re-renders, through the existing reactive
    store and getUI() pattern, so business logic never touches the store directly.
    getUI().pushStatus and spinner.message carry the active task's sub-status.
    TaskStreamPush already subscribes to the store, so web sync comes along.
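
The drain loop in item 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the planned implementation: the Task and ResolvedTask shapes, InMemoryQueue, the runTask stand-in for runAgent, GlobalAbortError, and the attemptsLeft field are all hypothetical. Only the names nextRunnable, in_progress, done, and failed come from the issue text.

```typescript
// Sketch only: every type and signature here is an assumption made for
// illustration, not the real QueueStore, resolver, or runAgent API.
type TaskStatus = "pending" | "in_progress" | "done" | "failed";

interface Task {
  id: string;
  type: string;
  deps: string[];
  status: TaskStatus;
  attemptsLeft: number; // hypothetical retry counter, includes the current attempt
}

interface ResolvedTask {
  model: string;
  allowedTools: string[];
  disallowedTools: string[];
  prompt: string;
}

type Resolver = (task: Task) => ResolvedTask;
// Stand-in for runAgent plus the status the agent reports via complete_task.
type RunTask = (task: Task, resolved: ResolvedTask) => Promise<"done" | "failed">;

// Hypothetical marker for a global failure (401, [ABORT]).
class GlobalAbortError extends Error {}

class InMemoryQueue {
  constructor(private tasks: Task[]) {}

  // Pending tasks whose deps are all done.
  nextRunnable(): Task[] {
    const done = new Set(
      this.tasks.filter((t) => t.status === "done").map((t) => t.id),
    );
    return this.tasks.filter(
      (t) => t.status === "pending" && t.deps.every((d) => done.has(d)),
    );
  }

  allTerminal(): boolean {
    return this.tasks.every((t) => t.status === "done" || t.status === "failed");
  }
}

async function drain(
  queue: InMemoryQueue,
  resolve: Resolver,
  runTask: RunTask,
  concurrency = 1,
  maxIterations = 100, // final termination guard
): Promise<void> {
  for (let i = 0; i < maxIterations && !queue.allTerminal(); i++) {
    const batch = queue.nextRunnable().slice(0, concurrency);
    if (batch.length === 0) return; // remaining tasks are blocked by failed deps
    await Promise.all(
      batch.map(async (task) => {
        task.status = "in_progress";
        // A thrown GlobalAbortError is not caught, so it aborts the whole run.
        const outcome = await runTask(task, resolve(task));
        if (outcome === "done") {
          task.status = "done";
        } else {
          task.attemptsLeft -= 1;
          // Retry while attempts remain, otherwise the task is terminal.
          task.status = task.attemptsLeft > 0 ? "pending" : "failed";
        }
      }),
    );
  }
}
```

In this sketch a per-task failure only changes that task's status, so the loop keeps draining, while anything thrown from runTask propagates out of drain and stops the run. The sketch mutates tasks directly for brevity; the real executor would go through the QueueStore so the UI re-renders.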

Key files

  • src/lib/programs/orchestrator/orchestrator-runner.ts (the executor and the
    resolver interface; the real markdown resolver is #625)
  • src/lib/agent/agent-interface.ts (model field, run-end remark flag)
  • src/lib/agent/agent-runner.ts (wire the fork to the real runOrchestrator)
  • src/ui/tui/ (a screen that renders the queue)

Acceptance criteria

  • A hand-seeded 2-task queue runs as two separate query() calls with distinct
    per-task models.
  • The TUI renders the queue, and tasks visibly move from pending to
    in_progress to done.
  • A complete_task failure marks the task failed, retries, and continues the
    run. A 401 or [ABORT] aborts the whole run.
  • A task agent that blows its tool-call budget is force-exited: further tools
    are denied except complete_task, and it finalizes.
  • The linear path is unchanged with the flag off.
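
The force-exit criterion could be exercised with a gate along these lines. This is a hedged sketch: createBudgetGate, the ToolDecision shape, and the bare tool name "complete_task" are assumptions for illustration, and the real canUseTool in agent-interface.ts may take different arguments and return a different shape.

```typescript
// Sketch only: the decision shape and factory name are assumptions, not the
// real canUseTool signature in agent-interface.ts.
type ToolDecision =
  | { behavior: "allow" }
  | { behavior: "deny"; message: string };

const BUDGET_MESSAGE = "tool budget reached, wrap up and call complete_task now";

// Returns a fresh gate per task agent, with its own tool-call counter.
function createBudgetGate(budget: number, exemptTool = "complete_task") {
  let used = 0;
  return function canUseTool(toolName: string): ToolDecision {
    // complete_task is always allowed so the agent can finalize.
    if (toolName === exemptTool) return { behavior: "allow" };
    // Past the budget, every other tool is denied with the wrap-up message.
    if (used >= budget) return { behavior: "deny", message: BUDGET_MESSAGE };
    used += 1;
    return { behavior: "allow" };
  };
}
```

In this sketch calls to complete_task do not count against the budget, and a denied call does not increment the counter. Both are design choices of the sketch, not requirements stated in the issue.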
