Skip to content

feat(watch): /fleet hybrid dispatch — 2.9x faster parallel execution#776

Merged
tamirdresher merged 11 commits intodevfrom
squad/775-fleet-dispatch
Apr 4, 2026
Merged

feat(watch): /fleet hybrid dispatch — 2.9x faster parallel execution#776
tamirdresher merged 11 commits intodevfrom
squad/775-fleet-dispatch

Conversation

@tamirdresher
Copy link
Copy Markdown
Collaborator

/fleet Hybrid Dispatch for squad watch --execute

Adds --dispatch-mode flag with three modes:

Mode Behavior When to use
task (default) Existing Promise.all — unchanged Backward compatible
fleet All issues via single /fleet prompt Maximum parallelism for read-heavy boards
hybrid Reads via /fleet, writes via task tool Recommended — best of both worlds

Usage

squad watch --execute --dispatch-mode hybrid
squad watch --execute --dispatch-mode fleet
squad watch --execute  # default: task (unchanged)

Benchmark (4 real issues)

Metric Fleet (parallel) Sequential Improvement
Total time 116s 332s 2.9x faster
Premium requests 12 ~16 ~25% cheaper
MCP startup 1x shared 4x per-process 4x less

How hybrid dispatch works

Issues arrive → classify by title keywords
                    ↓
Read-heavy (research, review, analyze, audit) → /fleet (parallel)
Write-heavy (fix, implement, build, refactor) → task tool + worktrees

Fleet builds one prompt with N parallel tracks:

/fleet Execute 3 read-only analysis tracks in parallel:
Track 1 (seven): Issue #42 — Investigate flaky tests
Track 2 (q): Issue #43 — Fact-check PR claims
Track 3 (seven): Issue #44 — Document API flow

Key finding: Fleet ignores custom agents

⚠️ When referencing @seven or @data in fleet prompts, Copilot CLI spawns generic explore agents — NOT custom agents from .github/agents/. Charters are NOT followed. This is why hybrid mode uses fleet only for reads and task tool for writes.

Files changed

File Change
fleet-dispatch.ts New — FleetDispatchCapability with prompt builder
execute.ts Issue classification (read vs write) + hybrid routing
config.ts DispatchMode type + config field
cli-entry.ts --dispatch-mode flag parsing
capabilities/index.ts Register fleet-dispatch
watch/index.ts Pass dispatchMode through context

Builds on

Closes #775

Copilot AI review requested due to automatic review settings April 3, 2026 08:54
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new --dispatch-mode option for squad watch --execute to support batching read-heavy issues into a single /fleet-style Copilot invocation (with a hybrid mode that keeps write-heavy work on the existing per-issue execution path).

Changes:

  • Introduces DispatchMode (task|fleet|hybrid) in watch config and wires it from CLI → config loader → capability contexts.
  • Adds FleetDispatchCapability to build and run a multi-track /fleet prompt for read-heavy issues.
  • Updates ExecuteCapability to classify issues by title keywords and route work based on dispatchMode.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
packages/squad-cli/src/cli/commands/watch/index.ts Passes dispatchMode through the per-capability config context.
packages/squad-cli/src/cli/commands/watch/config.ts Adds DispatchMode, config field, merge logic, and JSON normalization.
packages/squad-cli/src/cli/commands/watch/capabilities/index.ts Registers the new fleet-dispatch capability in the default registry.
packages/squad-cli/src/cli/commands/watch/capabilities/fleet-dispatch.ts New capability that batches read-heavy issues and invokes Copilot with a /fleet prompt.
packages/squad-cli/src/cli/commands/watch/capabilities/execute.ts Adds keyword classification + fleet/hybrid routing behavior for execution.
packages/squad-cli/src/cli-entry.ts Parses --dispatch-mode and passes it into loadWatchConfig.

Comment on lines +170 to +175
return {
success: true,
summary: dispatchMode === 'fleet'
? `fleet mode: all ${executable.length} issues deferred to fleet-dispatch`
: `hybrid mode: no write issues (${readCount} read-only deferred to fleet)`,
data: { executed: 0, failed: 0, deferredToFleet: executable.length - writeIssues.length },
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In dispatchMode === 'fleet' this branch reports “all N issues deferred to fleet-dispatch”, but deferredToFleet is currently set to executable.length - writeIssues.length (read-only count). In fleet mode the deferred count should be executable.length (since both read+write are deferred).

Suggested change
return {
success: true,
summary: dispatchMode === 'fleet'
? `fleet mode: all ${executable.length} issues deferred to fleet-dispatch`
: `hybrid mode: no write issues (${readCount} read-only deferred to fleet)`,
data: { executed: 0, failed: 0, deferredToFleet: executable.length - writeIssues.length },
const deferredToFleet = dispatchMode === 'fleet' ? executable.length : readCount;
return {
success: true,
summary: dispatchMode === 'fleet'
? `fleet mode: all ${executable.length} issues deferred to fleet-dispatch`
: `hybrid mode: no write issues (${readCount} read-only deferred to fleet)`,
data: { executed: 0, failed: 0, deferredToFleet },

Copilot uses AI. Check for mistakes.
Comment on lines +161 to +167
// In fleet or hybrid mode, split read vs write issues
if (dispatchMode === 'fleet' || dispatchMode === 'hybrid') {
const writeIssues = executable.filter(i => classifyIssue(i.title) === 'write');
const batch = dispatchMode === 'fleet'
? [] // fleet mode: all issues go to fleet-dispatch capability
: writeIssues.slice(0, maxConcurrent); // hybrid: only write issues here

Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In dispatchMode === 'fleet', ExecuteCapability does no work and relies on the separate fleet-dispatch capability to be enabled. If a user sets --dispatch-mode fleet but doesn’t also enable --fleet-dispatch (or config), this will result in no execution happening. Consider automatically enabling fleet-dispatch when dispatchMode is fleet|hybrid, or invoking fleet dispatch directly from here when in fleet mode.

Copilot uses AI. Check for mistakes.
Comment on lines +71 to +83
// Cross-platform: shell-expand the file contents into the -p argument
const isWindows = process.platform === 'win32';
const cmd = isWindows
? `powershell -NoProfile -Command "copilot -p (Get-Content '${promptFile}' -Raw) --allow-all --no-ask-user --autopilot"`
: `copilot -p "$(cat '${promptFile}')" --allow-all --no-ask-user --autopilot`;

const result = execSync(cmd, {
cwd,
timeout: timeoutMs,
encoding: 'utf-8',
shell: true,
stdio: ['pipe', 'pipe', 'pipe'],
});
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invokeFleet() builds a shell command that inlines the prompt file contents into a quoted -p argument ("$(cat ...)" / PowerShell Get-Content ... -Raw) and executes it with shell: true. Issue titles/bodies are untrusted input and can contain quotes/newlines that break quoting, leading to command injection or arbitrary argument injection. Prefer execFile/spawn with an args array (no shell) and pass the prompt as a single argument safely (or use a supported file/stdin input mode if Copilot CLI has one).

Copilot uses AI. Check for mistakes.
Comment on lines +97 to +112
export class FleetDispatchCapability implements WatchCapability {
readonly name = 'fleet-dispatch';
readonly description = 'Batch read-heavy issues into a parallel /fleet Copilot session';
readonly configShape = 'boolean' as const;
readonly requires = ['gh'];
readonly phase = 'post-execute' as const;

async preflight(_context: WatchContext): Promise<PreflightResult> {
// Fleet dispatch requires the copilot CLI — quick sanity check
try {
execSync('copilot --version', { encoding: 'utf-8', stdio: 'pipe' });
return { ok: true };
} catch {
return { ok: false, reason: 'copilot CLI not found — required for fleet dispatch' };
}
}
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This capability’s requires lists only ['gh'], but preflight/execute invoke the standalone copilot binary (and on Windows also relies on powershell). This makes the preflight failure messaging/tooling metadata inaccurate. Update requires (and/or standardize on gh copilot if that’s the intended CLI) so dependency reporting matches what’s actually executed.

Copilot uses AI. Check for mistakes.
Comment on lines +372 to +375
const dispatchModeIdx = args.indexOf('--dispatch-mode');
const dispatchMode = (dispatchModeIdx !== -1 && args[dispatchModeIdx + 1])
? args[dispatchModeIdx + 1] as 'fleet' | 'task' | 'hybrid'
: undefined;
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--dispatch-mode is cast to the union type without runtime validation. If a user passes an invalid value, it will silently flow into config and later behave like task mode (because it won’t match fleet|hybrid), which is hard to diagnose. Consider validating against {task,fleet,hybrid} and either erroring or warning + defaulting to task (and, if fleet|hybrid, auto-enable fleet-dispatch to match the documented usage).

Copilot uses AI. Check for mistakes.
Comment on lines +32 to +52
/** Keywords that indicate read-heavy / analysis work. */
const READ_KEYWORDS = [
'research', 'review', 'analyze', 'investigate', 'audit',
'check', 'scan', 'assess', 'evaluate', 'fact-check',
'document', 'report',
];

/** Keywords that indicate write-heavy / implementation work. */
const WRITE_KEYWORDS = [
'fix', 'implement', 'create', 'build', 'refactor',
'add', 'update', 'migrate', 'deploy', 'feature',
];

/** Classify an issue as read-heavy or write-heavy by title keywords. */
export function classifyIssue(title: string): 'read' | 'write' {
const lower = title.toLowerCase();
const isRead = READ_KEYWORDS.some(k => lower.includes(k));
const isWrite = WRITE_KEYWORDS.some(k => lower.includes(k));
if (isRead && !isWrite) return 'read';
return 'write'; // default to write (safer — gets full agent session)
}
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New dispatch-mode behavior (classifyIssue + fleet/hybrid routing) isn’t covered by the existing watch execute tests. Adding focused unit tests for classifyIssue() and for the fleet/hybrid branching (e.g., fleet defers all, hybrid executes only write issues up to maxConcurrent) would help prevent regressions.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit e07e8f9. Added unit tests in \ est/cli/watch-execute.test.ts\ covering:

  • \classifyIssue(): read keywords (research, review, analyze, investigate, audit), write keywords (fix, implement, add, update, refactor), default-to-write, case-insensitive, mixed-keyword tie-breaking
  • Fleet mode: all executable issues dispatched
  • Hybrid mode: only read-heavy issues fleet-dispatched, write issues execute
  • Hybrid mode: assigned issues excluded from both paths

All 16 watch-execute tests pass.

@tamirdresher tamirdresher requested a review from diberry April 3, 2026 18:09
Copilot and others added 4 commits April 3, 2026 21:18
…775)

Adds --dispatch-mode flag to squad watch --execute with three modes:
- task (default): existing Promise.all behavior, unchanged
- fleet: all issues dispatched via single /fleet prompt, true parallelism
- hybrid: read-heavy issues via /fleet, write-heavy via task tool + worktrees

Benchmark results (4 real issues):
- Fleet: 116s total (29s/issue avg)
- Sequential: 332s total (83s/issue avg)
- Fleet is 2.9x faster, ~25% cheaper on premium requests

Key finding: /fleet ignores custom agents (.github/agents/) and spawns
generic explore agents. This makes fleet ideal for read-only analysis
(triage, research, reviews) but NOT for charter-driven code changes.

New files:
- fleet-dispatch.ts: FleetDispatchCapability with /fleet prompt builder
- Updated execute.ts: issue classification (read vs write keywords)
- Updated config.ts: DispatchMode type + config field
- Updated cli-entry.ts: --dispatch-mode flag parsing

Closes #775

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove shell:true and stdio from execSync options (fixes overload mismatch)
- Add .changeset/fleet-dispatch-hybrid.md for changelog-gate CI check
- Build verified locally: tsc --noEmit passes clean

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1. deferredToFleet count: use executable.length in fleet mode (was read-only count)
2. Warning log when fleet/hybrid defers but fleet-dispatch may not be enabled
3. Command injection fix: execFileSync with args array instead of shell command
4. requires: added 'copilot' alongside 'gh'
5. --dispatch-mode validation: error on invalid values, default to task
6. Unit tests: 11 tests for classifyIssue() covering all keyword categories

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@tamirdresher tamirdresher force-pushed the squad/775-fleet-dispatch branch from 8893c70 to 7d57659 Compare April 3, 2026 18:18
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

🛫 PR Readiness Check

ℹ️ This comment updates on each push. Last checked: commit ae866a4

⚠️ 4 item(s) to address before review

Status Check Details
Single commit 11 commits — consider squashing before review
Not in draft Ready for review
Branch up to date Up to date with dev
Copilot review No Copilot review yet — it may still be processing
Changeset present Changeset file found
Scope clean No .squad/ or docs/proposals/ files
No merge conflicts No merge conflicts
Copilot threads resolved 3 unresolved Copilot thread(s) — fix and resolve before merging
CI passing 15 check(s) still running

This check runs automatically on every push. Fix any ❌ items and push again.
See CONTRIBUTING.md and PR Requirements for details.

Copy link
Copy Markdown
Collaborator

@diberry diberry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Review Summary

Verdict: Approve

What's Good

  • Real performance win - 2.9x faster on a 4-issue board. Benchmarked on real issues. MCP startup amortized (1x vs 4x).
  • Three-mode design - task (backward compatible default), fleet (all parallel), hybrid (reads->fleet, writes->task). Safe rollout.
  • Issue classification is pragmatic - Keyword-based read/write split, defaults to write when ambiguous (safer). 11 tests.
  • Documented the key finding - Fleet ignores custom agents and spawns generic explore agents. Honest engineering, explains why hybrid exists.
  • Clean plugin architecture - FleetDispatchCapability slots into capability registry at post-execute phase. No special-casing.
  • Good security - execFileSync with args array (no shell injection). Temp file cleanup in finally block.

Concerns (non-blocking)

  1. 4 commits - should squash before merge (readiness check expects 1)
  2. Hardcoded copilot binary - uses copilot.cmd/copilot instead of agentCmd config. Should use context.agentCmd or document why fleet always uses Copilot CLI
  3. Duplicate issue fetching - both ExecuteCapability and FleetDispatchCapability independently call listWorkItems. Future: pass issue list through context.data
  4. Keyword overlap fragile - "Audit and refactor" goes to write because refactor keyword. Acceptable for v1, future could weight by position

Non-blocking suggestions

  1. Squash to 1 commit
  2. Make fleet binary configurable instead of hardcoding copilot/copilot.cmd
  3. Pass issue list through context to avoid double API calls
  4. Future: consider LLM-based classification for ambiguous titles

tamirdresher and others added 2 commits April 4, 2026 18:28
- Add warning log when dispatchMode=fleet/hybrid but fleet-dispatch cap is not enabled
- Add comment documenting --dispatch-mode runtime validation
- Re-export classifyIssue from watch/index barrel
- Add classifyIssue unit tests for dispatch-mode categories
- Pass dispatchMode through to capability synthesized config

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

🏗️ Architectural Review

⚠️ Architectural review: 2 warning(s).

Severity Category Finding Files
🟡 warning bootstrap-area 1 file(s) in the bootstrap area (packages/squad-cli/src/cli/core/) were modified. These files must maintain zero external dependencies. Review carefully. packages/squad-cli/src/cli/core/upgrade.ts
🟡 warning export-surface Package entry point(s) modified with 10 new/changed export(s). New public API surface requires careful review for backward compatibility. packages/squad-sdk/src/index.ts

Automated architectural review — informational only.

Copilot and others added 2 commits April 4, 2026 21:02
…es PS1 design)

The execute capability now passes ALL squad issues to the agent and lets the
agent decide what to work on, matching the PS1 ralph-watch behavior. The prompt
includes Task/WHY/Success/Escalation structure and references
.squad/ralph-instructions.md for full instructions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eIssues

The PR #776 simplified findExecutableIssues to only filter by squad label
(intentionally minimal), but this dropped two pre-existing filters that
the test suite expects:
  - exclude issues already assigned to a human
  - exclude issues with status:blocked (or similar blocking labels)

These filters match the PS1 ralph-watch pre-filter logic and are the right
behavior — the agent should only receive clearly actionable issues.

Fixes test: CLI: watch execute mode > findExecutableIssues > returns only
issues ready for execution
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

🟡 Impact Analysis — PR #776

Risk tier: 🟡 MEDIUM

📊 Summary

Metric Count
Files changed 16
Files added 6
Files modified 10
Files deleted 0
Modules touched 4
Critical files 3

🎯 Risk Factors

  • 16 files changed (6-20 → MEDIUM)
  • 4 modules touched (2-4 → MEDIUM)
  • Critical files touched: packages/squad-cli/src/cli/commands/watch/capabilities/index.ts, packages/squad-cli/src/cli/commands/watch/index.ts, packages/squad-sdk/src/index.ts

📦 Modules Affected

root (2 files)
  • .changeset/fleet-dispatch-hybrid.md
  • .changeset/scratch-dir-utility.md
squad-cli (7 files)
  • packages/squad-cli/src/cli-entry.ts
  • packages/squad-cli/src/cli/commands/watch/capabilities/execute.ts
  • packages/squad-cli/src/cli/commands/watch/capabilities/fleet-dispatch.ts
  • packages/squad-cli/src/cli/commands/watch/capabilities/index.ts
  • packages/squad-cli/src/cli/commands/watch/config.ts
  • packages/squad-cli/src/cli/commands/watch/index.ts
  • packages/squad-cli/src/cli/core/upgrade.ts
squad-sdk (3 files)
  • packages/squad-sdk/src/config/init.ts
  • packages/squad-sdk/src/index.ts
  • packages/squad-sdk/src/resolution.ts
tests (4 files)
  • test/cli/watch-execute.test.ts
  • test/cli/watch-fleet-dispatch.test.ts
  • test/scratch-dir.test.ts
  • test/watch-fleet-dispatch.test.ts

⚠️ Critical Files

  • packages/squad-cli/src/cli/commands/watch/capabilities/index.ts
  • packages/squad-cli/src/cli/commands/watch/index.ts
  • packages/squad-sdk/src/index.ts

This report is generated automatically for every PR. See #733 for details.

Addresses active review comment on PR #776: add unit tests for
classifyIssue() (read vs write classification) and fleet/hybrid
dispatch routing logic.

Tests cover:
- classifyIssue() read keywords (research, review, analyze, investigate, audit)
- classifyIssue() write keywords (fix, implement, add, update, refactor)
- Default to write when no keywords match
- Default to write when both read and write keywords appear
- Case-insensitivity
- Fleet mode: all executable issues dispatched
- Hybrid mode: only read-heavy issues fleet-dispatched, write issues execute
- Hybrid mode: assigned issues excluded from both paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@tamirdresher
Copy link
Copy Markdown
Collaborator Author

Superseded by #830 — combined into a single watch next-gen PR for easier review.

@tamirdresher
Copy link
Copy Markdown
Collaborator Author

Closing — superseded by #830 (watch-next-gen combined PR)

@tamirdresher tamirdresher reopened this Apr 4, 2026
@tamirdresher tamirdresher merged commit 5996db4 into dev Apr 4, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Watch] Add /fleet parallel dispatch mode for squad watch --execute

3 participants