Subagent runtime model fallback (#198) by aebrer · Pull Request #209 · aebrer/dreb

aebrer · 2026-05-14T20:14:29Z

Closes #198

Subagent model fallback lists are lost at spawn time — runtime rate limits cannot trigger fallback.

Implementation plan posted as a comment below.

aebrer · 2026-05-14T20:15:34Z

Implementation Plan

Problem Analysis

resolveModelWithFallbacks in subagent.ts picks the first resolvable model at spawn time and passes only that single model to the child process via --model. When the child hits a runtime rate limit or quota error, its auto-retry logic (AgentSession._handleRetryableError) only retries the same model with exponential backoff. The child has no knowledge of the agent definition's fallback list, so it can never switch to the next model. The subagent fails entirely.

Approach: Child-Side Fallback

We pass the full fallback list into the child process and teach AgentSession to switch to the next fallback model when retries on the current model are exhausted. This is more robust than parent-side respawn because:

No partial progress is lost (the same process continues)
No stderr/exit-code parsing is needed
The fallback is transparent to the parent — the subagent either succeeds with some model or fails only after all fallbacks are exhausted

Deliverables

CLI support for --fallback-models — internal flag consumed by subagent children
AgentSession runtime fallback switching — when _handleRetryableError exhausts retries, try the next fallback model
Subagent spawn wiring — preserve the agent definition's raw fallback list and pass it to the child
Loud logging — every model switch is logged in a single concise line; final failure is a concise error
Tests — spawn args, fallback resolution, and model-switching behavior

Files to Create or Modify

File	Change
`packages/coding-agent/src/cli/args.ts`	Add `--fallback-models <list>` parsing; add `fallbackModels?: string[]` to `Args`
`packages/coding-agent/src/main.ts`	Pass `parsed.fallbackModels` through `buildSessionOptions`
`packages/coding-agent/src/core/sdk.ts`	Add `fallbackModels?: string[]` to `CreateAgentSessionOptions`
`packages/coding-agent/src/core/agent-session.ts`	Accept `fallbackModels` in config; add `_fallbackModelIndex`; modify `_handleRetryableError` to switch models when retries exhausted; add `_switchToFallbackModel()` helper
`packages/coding-agent/src/core/tools/subagent.ts`	Add `modelFallbacks?: string[]` to `AgentTypeConfig`; in `executeSingle`, preserve raw fallback list when no override; in `spawnSubagent`, pass `--fallback-models` arg
`packages/coding-agent/test/subagent-model-fallback.test.ts`	Add tests for `modelFallbacks` propagation and spawn args
`packages/coding-agent/test/agent-session-fallback.test.ts`	New file — test `_handleRetryableError` fallback switching with mocked registry

Detailed Design

1. Passing the fallback list to the child

In executeSingle, after resolveModelWithFallbacks resolves the first model, if the agent config had a fallback list and no per-invocation override, preserve the raw list:

if (!modelOverride && Array.isArray(config.model)) {
    effectiveConfig = { ...effectiveConfig, modelFallbacks: config.model };
}

In spawnSubagent, when modelFallbacks is present, append --fallback-models <comma-separated> to the child args.

2. AgentSession fallback consumption

Add to AgentSessionConfig / CreateAgentSessionOptions:

fallbackModels?: string[];

In AgentSession constructor:

this._fallbackModels = config.fallbackModels ?? [];
this._fallbackModelIndex = 0;

3. Model switching on retry exhaustion

Modify _handleRetryableError (agent-session.ts):

if retryAttempt > maxRetries:
    if _fallbackModelIndex + 1 < _fallbackModels.length:
        log: "Model X failed after N retries (error: ...). Falling back to Y."
        await _switchToFallbackModel(_fallbackModelIndex + 1)
        _retryAttempt = 0
        return true  // retry initiated with new model
    else:
        emit auto_retry_end with failure
        return false

_switchToFallbackModel(index):

Resolves the model string via resolveCliModel against the registry
Calls this.setModel(resolvedModel) (already exists)
Updates _fallbackModelIndex
If resolution fails, skip to next fallback or fail

4. Surfacing the resolved model

spawnSubagent already captures resolvedModel from the child's agent_start JSONL event. Whichever model ultimately succeeds will emit its own agent_start, so SubagentResult.model will automatically reflect the actually-used model. No changes needed here.

5. Logging design principle

Per the issue's "Fail loud, not verbose":

Each fallback switch: [subagent] Model "openai-codex/gpt-5.5" failed (rate limit). Falling back to "kimi-coding-oauth/kimi-for-coding".
Final failure after all fallbacks exhausted: concise error mentioning the originally chosen model and how many were attempted.

Acceptance Criteria Mapping

Criterion	How it's addressed
Primary model hits rate limit → attempt next fallback	`_handleRetryableError` switches model after max retries
Transparent to parent	Child handles it internally; parent sees normal success/fail
Resolved model surfaced in result	`agent_start` JSONL event carries the active model ID; `spawnSubagent` already captures it
Spawn-time fallback unchanged	`resolveModelWithFallbacks` logic is untouched
Every fallback logged loudly	Single concise log line per switch
Final failure is concise	Error message names the chosen model and attempted count

Testing Approach

test/subagent-model-fallback.test.ts — extend with:
- executeSingle preserves modelFallbacks when agent config is an array
- executeSingle does NOT preserve fallbacks when modelOverride is set
- spawnSubagent includes --fallback-models arg when modelFallbacks is set
test/agent-session-fallback.test.ts (new) — create minimal AgentSession with mocked Agent, ModelRegistry, SettingsManager. Test:
- _handleRetryableError switches to next fallback after max retries
- _handleRetryableError emits auto_retry_end when all fallbacks exhausted
- _switchToFallbackModel resolves model via registry and calls setModel
- Unresolvable fallback is skipped and next is tried
Integration sanity check (manual): spawn a subagent with a fallback list, verify --fallback-models appears in ps or by reading the child's startup log.

Risks and Open Questions

Retry vs fallback interaction: _handleRetryableError does exponential backoff before falling back. For a 429 with "try again in 5 min", this could delay fallback by ~30s (default maxRetries=3, baseDelayMs=1000 → ~7s total). This is acceptable — temporary rate limits should get a chance to resolve. Persistent quota errors will exhaust retries and then fall back.
Model resolution in child: The child resolves fallback model strings against its own ModelRegistry. Since it shares the same env/auth as the parent, this should match spawn-time resolution. Edge case: if auth expires between spawn and fallback, the child may fail to resolve a model that resolved at spawn time. This is acceptable (the model genuinely became unavailable).
Scoped models collision: AgentSession already has scopedModels for Ctrl+P cycling. fallbackModels is a separate concept (automatic, not user-initiated). They can coexist; cycleModel continues to use scopedModels.
Test complexity: AgentSession has many dependencies. The new test file will need mocks for Agent, ModelRegistry, SettingsManager, SessionManager, and event emission. Consider extracting _switchToFallbackModel as a pure-ish helper if mocking becomes unwieldy.

Plan created by mach6

aebrer · 2026-05-14T20:38:41Z

Dinesh vs Gilfoyle Review — Plan for issue 198

3 rounds of mass destruction

Best of the Banter

Gilfoyle: "You built a feature that fixes a rate limit by gaslighting the user's settings."

Dinesh: "I genuinely hope that was an accident and not sabotage."
Gilfoyle: "Most plans only have one fatal flaw. Yours has... more."

Dinesh: "Did Gilfoyle just admit I was right about the mechanism being necessary? I'm not going to make a big deal out of this. [pause] OK I am a little."

Verdict

Critical (Gilfoyle won, Dinesh conceded)

AgentSession.setModel persists to settings.json — When the child falls back to a different model, setModel writes the fallback model as the new global default via settingsManager.setDefaultModelAndProvider. Subagent children share the user's real ~/.dreb/agent/ directory.
Fix: Extract a setActiveModel method that performs only runtime model switching. Have setModel call setActiveModel plus persistence. _switchToFallbackModel calls setActiveModel only.
_isRetryableError regex overreach on quota — Adding quota|insufficient_quota to the retryable regex treats hard billing failures as transient. Retrying a quota error burns guaranteed-fail API calls and delays fallback.
Fix: Remove insufficient_quota from the regex. Rely on existing rate limit|429|too many requests for transient cases. Quota exhaustion will still be caught by the broader error handling path (the agent loop sees stopReason: "error" and _handleRetryableError can be taught to treat ANY error as a fallback trigger when fallbackModels are configured — see "Immediate fallback gate" below).
--provider unconditional passing breaks mixed-provider lists — Removing the !modelStr.includes("/") guard forces all fallback models through the parent's provider. An openai/gpt-4o entry filtered through --provider anthropic fails resolution.
Fix: Restore the hasProvider conditional. Pass --provider only for bare fallback IDs that need provider context.

Important (Gilfoyle won after debate)

Try/catch granularity around _switchToFallbackModel — Wrapping the entire fallback loop means one malformed entry aborts the chain.
Fix: Per-fallback try/catch. Each entry is resolved independently; failures advance the index and try the next.
auto_retry_fallback event not integrated — A new event type must be added to the AgentSessionEvent union and handled in interactive mode / Telegram dispatchers or it will be silently dropped.
Fix: Add to AgentSessionEvent union (agent-session.ts:~118), handle in interactive mode event switch (interactive-mode.ts), and mirror in Telegram handlers (telegram/src/handlers/events.ts).
--fallback-models leaks internal wire protocol — Exposing an internal parent-child mechanism as a first-class CLI flag pollutes the public Args interface.
Fix: Use a namespaced hidden flag (--dreb-internal-fallback-models) or environment variable (DREB_FALLBACK_MODELS) for the parent→child channel.
Test plan diverged from established pattern — Proposing "minimal mocked AgentSession" ignores the existing agent-session-retry.test.ts pattern which uses real Agent, ModelRegistry, SettingsManager, and temp directories.
Fix: Follow the established pattern — real infrastructure with a mocked streamFn to inject errors and successes.
Deduplication placement — Running deduplication inside the fallback loop wastes cycles.
Fix: Dedupe once during AgentSession initialization or in executeSingle before passing to the child.

Contested (Dinesh held his ground)

Retry-After header inspection — Gilfoyle argued for parsing HTTP Retry-After headers to skip retries when delays are long. Dinesh defended that HTTP metadata is not plumbed through @dreb/ai into AssistantMessage at the AgentSession layer; doing so would require a cross-package refactor of the provider abstraction.
Resolution: Out of scope for this feature. The existing retry-then-fallback ordering is an accepted trade-off, documented in the plan's risks section.

Dismissed (Gilfoyle was nitpicking)

Help text exposure — Gilfoyle's initial framing was that --fallback-models would appear in help text. Dinesh correctly noted that printHelp() is manually maintained; adding a field to Args does not automagically append it. The reframed concern (protocol leakage into public API) was elevated to Important and addressed above.

Strengths (even Gilfoyle grudgingly acknowledged)

Child-side fallback is the right architecture — Parent-side respawn would require parsing stderr/exit codes and would lose partial progress. Keeping fallback inside the child process is transparent, preserves tool call results, and matches the existing retry infrastructure.
agent_start re-emission naturally surfaces the resolved model — Agent.continue() emits agent_start with the current model. Since setModel mutates Agent._state.model, the parent spawnSubagent captures the fallback model ID automatically via JSONL. No extra plumbing needed.

Score

Gilfoyle: 8 | Dinesh: 2

Amended Plan — Recommended Changes

The following amendments should be applied to the implementation plan before work begins:

Extract setActiveModel from setModel (agent-session.ts:1810). Runtime-only model switch with no settings persistence. setModel wraps it.
Restore provider guard in spawnSubagent (subagent.ts). Pass --provider only when the spawn-time model lacks a slash.
Remove insufficient_quota from _isRetryableError regex (agent-session.ts:2872). Keep existing transient patterns only.
Add immediate-fallback gate in _handleRetryableError: when fallbackModels are configured and ANY error occurs (including non-retryable), consider falling back immediately or after a single retry — do not burn retries on quota errors. (Alternative: treat fallbackModels as a bypass for the retry counter entirely — if the current model fails for ANY reason, try the next fallback after at most one retry.)
Per-fallback try/catch in _switchToFallbackModel or _handleRetryableError. One bad entry skips to the next.
Namespace the fallback flag — --dreb-internal-fallback-models or DREB_FALLBACK_MODELS env var.
Integrate auto_retry_fallback event into AgentSessionEvent union and all UI handlers.
Dedupe fallback list once during construction, not in the loop.
Follow real-infrastructure test pattern per agent-session-retry.test.ts.

Review by mach6 /dg

aebrer · 2026-05-14T21:29:00Z

Scope Clarification

The implementation plan for this issue needs a significant course correction. After review, the correct scope is much narrower than what was previously planned.

What this feature is NOT

NO automatic fallback in the main AgentSession. The main dreb session must never automatically switch models. A weak model getting that authority would be catastrophic.
NO automatic model switching inside a subagent child process. The subagent child must not do runtime model fallback. Its session behaves exactly like any other session.

What this feature IS

The fallback exists only at subagent spawn time, in the parent process.

resolveModelWithFallbacks already picks the first resolvable model at spawn time (checking auth, registry, etc.). The gap is: a model can pass spawn-time validation but immediately fail at runtime with a rate limit, quota error, or auth revocation. In that case, the parent should respawn the subagent with the next model in the agent definition's fallback list.

Flow:

Parent tries model 1 from the fallback list
Subagent spawns, fails immediately with a rate-limit / quota / auth error
Parent detects this from the subagent's exitCode / stderr / errorMessage
Parent respawns with model 2
Repeat until one succeeds or all are exhausted
Final fallback is still the parent model

Approach: Parent-side respawn. The child process receives only a single --model arg, exactly as before. No --fallback-models flag. No changes to AgentSession, _handleRetryableError, CLI args, or event types.

Files to modify

Only packages/coding-agent/src/core/tools/subagent.ts:

executeSingle — add a loop that tries each model in the fallback list
Detect rate-limit / quota / auth errors in SubagentResult
Respawn with next fallback model on detection
Log each fallback loudly (single concise line)
Surface the actually-used model in the final SubagentResult

Files to NOT modify

packages/coding-agent/src/core/agent-session.ts — no _switchToFallbackModel, no _handleRetryableError changes, no auto_retry_fallback event
packages/coding-agent/src/cli/args.ts — no --fallback-models flag
packages/coding-agent/src/main.ts — no fallbackModels passing
packages/coding-agent/src/core/sdk.ts — no fallbackModels in CreateAgentSessionOptions
packages/coding-agent/src/modes/interactive/interactive-mode.ts — no new event handler
packages/telegram/src/handlers/events.ts — no new event handler

All previously planned changes to the above files should be discarded. The feature lives entirely in the subagent spawn logic.

aebrer · 2026-05-14T21:29:22Z

Additional Design Note

An acceptable implementation approach: make a lightweight API call at spawn time to verify the model is actually responsive before committing to it. If the probe returns a rate limit, quota, or auth error, skip that model and try the next fallback immediately. This avoids the respawn overhead entirely.

This is preferred over spawning blindly and then parsing exit codes / stderr to detect failure.

aebrer · 2026-05-14T21:42:35Z

Amended Implementation Plan — Issue 198

Problem Analysis

resolveModelWithFallbacks in subagent.ts picks the first resolvable model at spawn time by checking registry existence and auth configuration. A model can pass all of these static checks but immediately fail at runtime with a rate limit, quota error, or auth revocation. The child process receives only that single model via --model and has no way to switch fallbacks. The subagent fails entirely.

The previous plan attempted child-side fallback, which would have touched AgentSession, CLI args, SDK, and multiple UI handlers. That scope was rejected.

Scope

Only packages/coding-agent/src/core/tools/subagent.ts is modified.

No changes to AgentSession, CLI args, SDK, event types, interactive mode, or Telegram handlers.

Approach: Spawn-Time Probe with Fallback Loop

Instead of spawning blindly and then parsing exit codes / stderr to detect failure (respawn approach), we make a lightweight API call at spawn time to verify the model is actually responsive. If the probe returns a rate limit, quota, or auth error, we skip that model and try the next fallback immediately. This avoids respawn overhead entirely.

Flow:

executeSingle extracts the fallback list from the agent config (only when config.model is an array and no modelOverride is set)
For each model in the list:
a. Resolve it via resolveModelStringSingle (registry + auth check)
b. Probe: call complete() from @dreb/ai with maxTokens: 1, a short AbortSignal timeout (~10s), and maxRetryDelayMs: 0
c. If the probe returns an error for any reason, log a single concise line and continue to the next fallback
d. If the probe succeeds, use this model
If all configured fallbacks fail the probe, fall back to the parent model (existing behavior)
Spawn the subagent with the chosen model exactly as before

The probe is best-effort: it only runs when a ModelRegistry is available (needed to look up the Model object for @dreb/ai). When there is no registry, behavior falls back to existing spawn-time resolution.

Deliverables

probeModelAvailability helper in subagent.ts — lightweight complete() call that returns whether the model is available or failed for any reason
Fallback loop in executeSingle — iterates the agent definition's fallback list, probing each before committing to spawn
Loud logging — every skipped model is logged in a single concise line; final choice is logged
Tests — probe behavior, fallback loop, and error classification

Files to Create or Modify

File	Change
`packages/coding-agent/src/core/tools/subagent.ts`	Add `probeModelAvailability` helper; modify `executeSingle` to loop through fallback list and probe each model before spawning; add `isRuntimeUnavailableError` error classifier
`packages/coding-agent/test/subagent-model-fallback.test.ts`	Add tests for probe behavior: probe succeeds on first model, probe skips rate-limited model and falls back, probe exhausts all models, single-model config skips probing

Files to NOT Modify

packages/coding-agent/src/core/agent-session.ts
packages/coding-agent/src/cli/args.ts
packages/coding-agent/src/main.ts
packages/coding-agent/src/core/sdk.ts
packages/coding-agent/src/modes/interactive/interactive-mode.ts
packages/telegram/src/handlers/events.ts

Detailed Design

1. `probeModelAvailability` helper

async function probeModelAvailability(
  model: Model<Api>,
  signal?: AbortSignal,
): Promise<{ ok: true } | { ok: false; reason: string }>

Creates a minimal Context with one user message ("hi")
Calls complete(model, context, { maxTokens: 1, signal, maxRetryDelayMs: 0 })
If the result's stopReason is "error" or the call throws, returns { ok: false, reason } with the error message
Returns { ok: true } only on a clean success

2. Error classification

The probe treats any error as a reason to skip the model and try the next fallback. This includes rate limits, quota exhaustion, auth failures, timeouts, 5xx errors, and network failures.

Rationale: the parent model is the final fallback. If the parent model is working, it is good enough to orchestrate and good enough for any subagent task. If the parent model is also failing, nothing matters. There is no benefit to spawning a child with a model that has already demonstrated it cannot complete a 1-token request.

3. `executeSingle` fallback loop

Current behavior (single string or override): unchanged.

New behavior (when config.model is a string[] and no override):

const fallbackList = config.model;
const skippedModels: Array<{ model: string; reason: string }> = [];
for (const modelStr of fallbackList) {
  const resolved = resolveModelStringSingle(modelStr, parentProvider, registry);
  if (!resolved.ok) continue;

  const modelObj = registry?.find(resolved.provider, resolved.modelId);
  if (modelObj) {
    const probe = await probeModelAvailability(modelObj, abortSignal);
    if (!probe.ok) {
      console.error(`[subagent] Model "${modelStr}" failed probe (${probe.reason}). Trying next fallback...`);
      skippedModels.push({ model: modelStr, reason: probe.reason });
      continue;
    }
  }

  // Use this model
  effectiveConfig = { ...config, model: resolved.modelId };
  resolvedProvider = resolved.provider;
  break;
}

If the loop exhausts all configured fallbacks without finding a passing probe, resolveModelWithFallbacks behavior is preserved: try the parent model as a final fallback with a warning.

4. Surfacing the resolved model

spawnSubagent already captures the model from the child's agent_start JSONL event into resolvedModel. Whichever model ultimately succeeds will emit its own agent_start, so SubagentResult.model automatically reflects the actually-used model. No extra plumbing needed.

5. Fallback visibility

Fallback information must be visible to both the user and the parent agent regardless of session type (TUI, Telegram, JSONL). This requires two mechanisms:

console.error() — emitted during executeSingle as models are skipped. Captured in stderr/logs for terminal visibility.
Prepended to result.output — a summary of which models were skipped and which was ultimately used is prepended to the subagent's output text (same pattern as the existing warning prepend from resolveModelWithFallbacks). This ensures the parent agent sees it in the tool result content, which is rendered by the TUI, sent by Telegram, and included in JSONL message events.

The parent agent then knows which model was actually used and can account for capability differences in the response.

Acceptance Criteria Mapping

Criterion	How it's addressed
Primary model hits rate limit → attempt next fallback	`executeSingle` probes each fallback in order; skips models that fail the probe for any reason
Transparent to parent	Parent sees normal success/fail from `spawnSubagent`; fallback logic is entirely inside `executeSingle`
Resolved model surfaced in result	`agent_start` JSONL event carries the active model ID; `spawnSubagent` already captures it. Fallback summary is prepended to `result.output`
Spawn-time fallback unchanged	`resolveModelWithFallbacks` logic is untouched; probe is an additional gate before spawning
Every fallback logged loudly	Skipped models are logged via `console.error` AND summarized in `result.output` so the parent agent sees them
Final failure is concise	If all fallbacks fail, existing `resolveModelWithFallbacks` error behavior is preserved

Testing Approach

Extend packages/coding-agent/test/subagent-model-fallback.test.ts:

Probe succeeds on first model — mock complete to return success; verify spawnSubagent is called with the first model
Probe skips failed model, falls back to second — mock complete to return an error for the first model and success for the second; verify the second model is used and the fallback summary is present in result.output
Probe skips on any error type — mock complete with rate-limit, quota, timeout, and 5xx errors; verify fallback behavior in all cases
Probe exhausts all configured fallbacks, parent model succeeds — mock all probes to fail; verify parent model is used with warning
Single model config skips probing — verify complete is not called when config.model is a string
Model override skips probing — verify complete is not called when a per-invocation override is set

Mock strategy:

vi.mock("@dreb/ai", async () => ({ complete: vi.fn(), ... }))
Mock registry with find() returning Model objects
Spy on console.error to verify log lines

Risks and Open Questions

Probe costs: Each probe is a 1-token API call. For a 4-model fallback list where the first 3 fail, this costs ~3 minimal API calls. This is acceptable given it prevents a full subagent spawn + failure cycle.
Transient network errors skip good models: A local network blip could cause all probes to fail, pushing everything to the parent model fallback. This is acceptable because the parent model is guaranteed to be working (the parent session is using it), and it is good enough for any subagent task.
Probe false negatives: A model could pass the probe but fail on the first real request in the child (e.g., rate limit triggers between probe and spawn). This is an accepted trade-off; the probe catches the common case of immediate unavailability. A future enhancement could add respawn-on-failure as a secondary safety net, but that is out of scope for this plan.
No registry = no probe: In environments without a ModelRegistry, the probe is skipped and existing behavior is preserved. This matches the current test infrastructure.
complete import from @dreb/ai: packages/coding-agent already depends on @dreb/ai. The import is a simple addition with no package-level changes.

Plan created by mach6

aebrer · 2026-05-14T21:53:39Z

Progress Update

Implemented the amended plan for subagent runtime model fallback:

Added spawn-time availability probes for agent definition fallback lists.
Probe uses a lightweight one-token complete() call and skips any model that returns or throws an error.
Preserved existing behavior for single-model configs, per-invocation overrides, and registry-less sessions.
Falls back to the parent model after configured fallbacks fail.
Logs skipped models loudly and prepends fallback details to subagent output so the parent agent sees what happened.
Added tests covering successful probes, fallback after failed probes, error-type skipping, parent fallback, and single-model probe bypass.

Verification run:

npx vitest --run packages/coding-agent/test/subagent-model-fallback.test.ts
npx biome check --write packages/coding-agent/src/core/tools/subagent.ts packages/coding-agent/test/subagent-model-fallback.test.ts
npm run build
npm test
npm run verify-workspace-links

Commit: 26e97fe

Progress tracked by mach6

aebrer · 2026-05-14T22:13:03Z

Code Review

Critical

None.

Important

finding 1 — Parent abort during probe loop is reported as model unavailability
Severity: high. Confidence: 90.
File: packages/coding-agent/src/core/tools/subagent.ts (makeProbeSignal, probeModelAvailability, resolveModelForSubagentSpawn).
When the parent AbortSignal fires while a probe is awaiting complete(), the abort is treated like a probe failure. The loop can mark each fallback as unavailable and eventually return None of the fallback models passed availability checks instead of the accurate cancellation result. Suggested fix: check signal?.aborted before and after each probe and return an abort-specific error immediately.

finding 2 — Already-aborted parent signal may remove the probe timeout safety net
Severity: high. Confidence: 90.
File: packages/coding-agent/src/core/tools/subagent.ts (makeProbeSignal).
If the parent signal is already aborted when makeProbeSignal runs, the function aborts the inner controller but does not arm the timeout. If a provider ignores or lazily observes AbortSignal, complete() can hang indefinitely despite the intended 10 second ceiling. Suggested fix: enforce timeout independently of provider abort handling, ideally with a Promise.race timeout around complete().

finding 3 — Fallback summary output is not tested
Severity: high. Confidence: 95.
File: packages/coding-agent/src/core/tools/subagent.ts (formatModelFallbackSummary, executeSingle) and packages/coding-agent/test/subagent-model-fallback.test.ts.
The user-facing behavior that prepends skipped-model details to result.output is central to the "fail loud" requirement, but it is not directly tested. Suggested fix: add unit coverage for the formatter and an integration-style test that verifies executeSingle prepends the fallback summary when skippedModels is non-empty.

Suggestions

finding 4 — Cancellation is not checked at the top of the fallback loop
Severity: medium. Confidence: 85.
File: packages/coding-agent/src/core/tools/subagent.ts (resolveModelForSubagentSpawn).
Once cancellation happens, the loop can continue resolving and probing remaining models. This wastes work and combines badly with the already-aborted signal path. Suggested fix: check signal?.aborted at the start of each iteration and exit immediately.

finding 5 — Per-model skip reasons are discarded on total failure
Severity: medium. Confidence: 95.
File: packages/coding-agent/src/core/tools/subagent.ts (executeSingle).
When resolveModelForSubagentSpawn returns ok: false, executeSingle returns only resolved.error; the detailed skippedModels reasons are lost outside stderr. Suggested fix: append skipped model details to the returned errorMessage or output on the failure path.

finding 6 — Registry-less array fallback boundary is untested
Severity: medium. Confidence: 90.
File: packages/coding-agent/test/subagent-model-fallback.test.ts.
The implementation intentionally skips probing when a fallback list is present but no registry is available. Add a test asserting that an array model config with registry undefined resolves via existing fallback behavior and does not call complete().

finding 7 — Aborted assistant messages from the probe are not directly tested
Severity: medium. Confidence: 90.
File: packages/coding-agent/test/subagent-model-fallback.test.ts.
isRuntimeUnavailableError covers stopReason: "aborted", but probeModelAvailability itself is not tested for a returned assistant message with that stop reason. Add a direct probe test.

finding 8 — Probe timeout and parent-abort propagation are untested
Severity: medium. Confidence: 85.
File: packages/coding-agent/test/subagent-model-fallback.test.ts.
makeProbeSignal composes timeout and parent abort behavior, but the timeout, already-aborted parent, listener cleanup, and parent abort propagation paths are not covered. Add focused tests for these paths or for probeModelAvailability behavior under fake timers.

finding 9 — Parent model failure after all probes fail is untested in the new resolver
Severity: medium. Confidence: 85.
File: packages/coding-agent/test/subagent-model-fallback.test.ts.
resolveModelWithFallbacks has coverage for parent-model failure, but resolveModelForSubagentSpawn does not. Add a test where configured fallback probes fail and the parent model fails resolution, verifying the final concise error and retained skipped models.

finding 10 — Missing explicit test for per-invocation override probe skip
Severity: low. Confidence: 95.
File: packages/coding-agent/test/subagent-model-fallback.test.ts.
The internal branch is covered by the single-string test, but the accepted plan explicitly called out per-invocation override probe skipping. Add a targeted test to document the contract.

finding 11 — Redundant ternary in resolveModelForSubagentSpawn
Severity: low. Confidence: 100.
File: packages/coding-agent/src/core/tools/subagent.ts.
The no-probe branch returns the same spread expression in both ternary arms. Replace it with return { ...resolved, skippedModels: [] };.

finding 12 — makeProbeSignal can be simplified with clearer control flow
Severity: low. Confidence: 90.
File: packages/coding-agent/src/core/tools/subagent.ts.
The function uses mutable optional bindings for timeout and parent abort handling. An early-return structure for the already-aborted case would be easier to read. If fixing finding 2, prefer a structure that preserves a hard timeout even when the parent is already aborted.

Strengths

Scope stays focused on subagent.ts and tests; no child-side fallback, CLI, SDK, AgentSession, UI, or Telegram changes leaked in.
The spawn-time one-token probe matches the amended plan and avoids fragile stderr parsing.
Single-model configs, per-invocation overrides, and registry-less sessions preserve existing behavior.
Provider/auth gating remains explicit before probing or spawning.
Skipped models are logged loudly and summarized for successful subagent results.

Agents run: code-reviewer, error-auditor, test-reviewer, completeness-checker, simplifier

Reviewed by mach6

aebrer · 2026-05-14T22:19:40Z

Review Assessment

Review comment: #209 (comment)

Classifications

Finding	Classification	Reasoning
finding 1 — Parent abort during probe loop is reported as model unavailability	genuine	`probeModelAvailability` collapses thrown aborts and `stopReason: "aborted"` into `{ ok: false }`, and `resolveModelForSubagentSpawn` then treats that like model unavailability. Cancellation should short-circuit with an accurate abort result instead of trying fallbacks.
finding 2 — Already-aborted parent signal may remove the probe timeout safety net	genuine	`makeProbeSignal` does not arm its timeout in the already-aborted parent branch. The code promises a layered timeout, but a provider that ignores the signal could hang past the intended ceiling.
finding 3 — Fallback summary output is not tested	genuine	The fail-loud user-facing summary is generated after resolution and prepended in `executeSingle`, but current tests only assert resolver return values. This PR adds testable behavior that should be covered.
finding 4 — Cancellation is not checked at the top of the fallback loop	genuine	The loop continues resolving/probing after cancellation unless a later path happens to stop it. This wastes work and compounds the cancellation misclassification.
finding 5 — Per-model skip reasons are discarded on total failure	genuine	`executeSingle` receives `resolved.skippedModels` on the failure path but returns only `resolved.error`. The detailed reasons are lost outside stderr, weakening the fail-loud guarantee.
finding 6 — Registry-less array fallback boundary is untested	genuine	The new resolver has an explicit array-without-registry branch that preserves old behavior, but no test exercises it. This is a relevant boundary for the PR's compatibility contract.
finding 7 — Aborted assistant messages from the probe are not directly tested	genuine	`isRuntimeUnavailableError` is tested for aborted messages, but `probeModelAvailability` is not tested for the returned-message path. That branch should be covered directly.
finding 8 — Probe timeout and parent-abort propagation are untested	genuine	Timeout/abort composition is core new behavior and currently has no focused coverage for timeout firing, already-aborted parents, parent abort propagation, or cleanup.
finding 9 — Parent model failure after all probes fail is untested in the new resolver	genuine	`resolveModelForSubagentSpawn` has a distinct final error path when configured probes fail and the parent model also fails resolution. Existing tests cover the older resolver, not this new path.
finding 10 — Missing explicit test for per-invocation override probe skip	nitpick	The same internal behavior is already covered by the single-string skip test. A dedicated override test would document intent but is unlikely to catch a distinct bug.
finding 11 — Redundant ternary in `resolveModelForSubagentSpawn`	nitpick	Both ternary branches return the same value. This is harmless style cleanup.
finding 12 — `makeProbeSignal` can be simplified with clearer control flow	nitpick	This is readability-only on its own, and will likely be superseded by the real timeout/cancellation fix.
additional live finding — OpenAI Codex probe omits required instructions	genuine	During this assessment, `openai-codex/gpt-5.5` failed the new probe with `{ "detail": "Instructions are required" }` even though the parent is successfully using that model. The OpenAI Codex provider sends `instructions: context.systemPrompt`, but `probeModelAvailability` builds a context with only messages and no `systemPrompt`, creating a false negative for working Codex models.

Action Plan

Fix the OpenAI Codex false-negative probe first: include a minimal systemPrompt/instructions in the probe context, or otherwise make the probe satisfy provider-specific required fields.
Fix cancellation semantics: check signal?.aborted before each fallback iteration and after each probe, and return an abort-specific error instead of treating cancellation as model unavailability.
Make the probe timeout a hard ceiling independent of provider AbortSignal compliance, preferably with a timeout race around complete().
Preserve skipped-model details on total failure by appending them to errorMessage or output when resolveModelForSubagentSpawn returns ok: false.
Add tests for the Codex instructions probe case, fallback summary/prepending, registry-less array behavior, returned stopReason: "aborted", timeout/abort propagation, and parent-model final failure.
Optionally clean up the redundant ternary while touching the resolver.

Assessment by mach6

aebrer · 2026-05-14T22:37:12Z

Progress Update

Fixed the genuine review findings for subagent spawn-time model probes:

Added a minimal probe systemPrompt so OpenAI Codex models no longer fail with Instructions are required during availability checks.
Made probe cancellation explicit: parent aborts now short-circuit instead of being reported as model unavailability.
Added a hard timeout race around API-key lookup and the probe completion so providers that ignore abort cannot hang the spawn path.
Preserved per-model skipped fallback reasons on total failure.
Added focused coverage for aborted probe messages, parent abort propagation, timeout behavior, registry-less arrays, parent-model failure, and fallback summary formatting/prepending.
Relaxed a flaky live OpenAI Codex image-tool assertion to verify the visual color rather than requiring the model to repeat the word "circle".

Verification run:

npx vitest --run packages/coding-agent/test/subagent-model-fallback.test.ts
npx tsgo --noEmit
npm run build
npm test
npm run verify-workspace-links

Commit: b8ec34d

Progress tracked by mach6

aebrer · 2026-05-14T23:03:07Z

Code Review

Critical

None.

Important

finding 1 — executeSingle fallback integration is untested
Severity: high. Confidence: 95.
File: packages/coding-agent/src/core/tools/subagent.ts (executeSingle) and packages/coding-agent/test/subagent-model-fallback.test.ts.
The new resolver and pure formatting helpers are covered, but the production entry point that composes resolution, warning prepending, fallback-summary prepending, and failure-detail formatting has no direct coverage. A regression in ordering, newline handling, selected-model fallback, or failure formatting would not be caught.

finding 2 — Per-invocation model override probe skip is not integration-tested
Severity: high. Confidence: 90.
File: packages/coding-agent/src/core/tools/subagent.ts (executeSingle) and packages/coding-agent/test/subagent-model-fallback.test.ts.
The PR explicitly preserves the contract that per-invocation model overrides bypass the fallback probe path. Current tests cover the same internal non-array branch through single-model configs, but not the real override path against an agent config with a multi-model fallback list.

Suggestions

finding 3 — makeProbeSignal listener cleanup is not verified
Severity: medium. Confidence: 85.
File: packages/coding-agent/src/core/tools/subagent.ts (makeProbeSignal) and packages/coding-agent/test/subagent-model-fallback.test.ts.
makeProbeSignal attaches a parent abort listener and arms a timeout, and cleanup removes the listener and clears the timeout. No test verifies cleanup, so listener leaks on reused parent signals could slip through.

finding 4 — probeModelAvailability registry-less API-key path is untested
Severity: medium. Confidence: 85.
File: packages/coding-agent/src/core/tools/subagent.ts (probeModelAvailability) and packages/coding-agent/test/subagent-model-fallback.test.ts.
The helper accepts an optional registry and should call complete with apiKey: undefined when no registry is present. Current probe tests pass a registry mock, leaving this compatibility path unexercised.

finding 5 — Probe skip when registry.find() returns undefined is untested
Severity: medium. Confidence: 80.
File: packages/coding-agent/src/core/tools/subagent.ts (resolveModelForSubagentSpawn) and packages/coding-agent/test/subagent-model-fallback.test.ts.
If static resolution succeeds but registry.find() returns undefined, the code intentionally skips probing and uses the model. That defensive edge case is not covered by tests.

finding 6 — Redundant probe.aborted guard in fallback loop
Severity: low. Confidence: 98.
File: packages/coding-agent/src/core/tools/subagent.ts (resolveModelForSubagentSpawn).
probeModelAvailability only sets aborted: true when the parent signal is already aborted, and AbortSignal.aborted is monotonic. The immediately following signal?.aborted check covers the same case, so the explicit probe.aborted branch is redundant.

finding 7 — Duplicate skipped-model list formatting
Severity: low. Confidence: 90.
File: packages/coding-agent/src/core/tools/subagent.ts (formatModelFallbackSummary, formatSkippedModelFailureDetails).
Both functions build the same bullet list from SkippedFallbackModel[]. Extracting a shared formatter would remove duplication and keep future formatting tweaks in one place.

finding 8 — Verbose inline attempted-model array construction in final error message
Severity: low. Confidence: 95.
File: packages/coding-agent/src/core/tools/subagent.ts (resolveModelForSubagentSpawn).
The final error message constructs the attempted model list inline inside a template literal. A named attemptedModels intermediate would be easier to read without changing behavior.

Strengths

The implementation stays tightly scoped to subagent.ts and tests; no AgentSession, CLI, SDK, UI, or Telegram changes leaked in.
The spawn-time one-token probe matches the amended plan and avoids fragile child stderr parsing.
Cancellation and hard-timeout behavior are substantially improved from the earlier review state.
Fallbacks are loud in both stderr and the subagent result output.
Requirements are otherwise complete, and current tests cover most probe/fallback/error paths.

Agents run: code-reviewer, error-auditor, test-reviewer, completeness-checker, simplifier

Reviewed by mach6

aebrer · 2026-05-14T23:06:12Z

Review Assessment

Review comment: #209 (comment)

Classifications

Finding	Classification	Reasoning
finding 1 — `executeSingle` fallback integration is untested	genuine	`executeSingle` composes resolver results, skipped-model failure details, fallback-summary prepending, and warning prepending, but current tests only cover resolver/formatter helpers directly. A regression in that production composition would not be caught.
finding 2 — Per-invocation model override probe skip is not integration-tested	genuine	The override path relies on `modelOverride
finding 3 — `makeProbeSignal` listener cleanup is not verified	nitpick	Cleanup appears correct and is called from `probeModelAvailability`'s `finally`; directly asserting listener internals would mostly test private implementation mechanics.
finding 4 — `probeModelAvailability` registry-less API-key path is untested	nitpick	The helper supports absent registry, but production probing only runs when a registry exists. This is a low-value compatibility-path test.
finding 5 — Probe skip when `registry.find()` returns undefined is untested	nitpick	This is a defensive edge for an inconsistent registry after static resolution succeeds. Useful coverage, but not a meaningful merge blocker.
finding 6 — Redundant `probe.aborted` guard in fallback loop	nitpick	The guard is redundant with the immediately following `signal?.aborted` check because abort signals are monotonic, but it is harmless.
finding 7 — Duplicate skipped-model list formatting	nitpick	Minor duplication between two formatting helpers; no correctness issue.
finding 8 — Verbose inline attempted-model array construction in final error message	nitpick	Readability-only style cleanup.

Action Plan

Add integration-style coverage for executeSingle/tool execution fallback behavior: skipped-model summary prepending, warning ordering, selected model, and failure-detail formatting.
Include a targeted override test where the agent definition has a multi-model fallback list but the invocation supplies model, asserting fallback probes are not run and the override model is used.

Assessment by mach6

aebrer · 2026-05-15T12:46:49Z

Progress Update

Fixed the two genuine review findings for subagent fallback test coverage:

Exported executeSingle and AgentTypeConfig for focused integration coverage.
Added mocked subagent process tests for executeSingle composition:
- warning appears before fallback summary when parent-model fallback is used
- failed resolution includes skipped-model details in the error message
- per-invocation model override skips fallback probes and uses the override model

Verification run:

npx vitest --run packages/coding-agent/test/subagent-model-fallback.test.ts
npx biome check --write packages/coding-agent/src/core/tools/subagent.ts packages/coding-agent/test/subagent-model-fallback.test.ts
npx tsgo --noEmit
npm run build
npm run verify-workspace-links

Note: npm test with the normal local auth store hit live OpenAI Codex usage-limit failures. Re-running with a clean temporary HOME skipped those live OAuth tests and showed no test failures in the logs.

Commit: 8dd5ae3

Progress tracked by mach6

m-aebrer · 2026-05-15T13:30:45Z

Code Review

Critical

None.

Important

finding 1 — Probe-at-spawn only partially addresses "runtime rate limits trigger fallback"
Severity: high. Confidence: 88.
File: packages/coding-agent/src/core/tools/subagent.ts (resolveModelForSubagentSpawn).
The issue title is "runtime rate limits cannot trigger fallback" and AC1 requires fallback "when a subagent's primary model hits a non-retryable rate limit or quota error." The probe catches models already unavailable at spawn time, but a model that passes the 1-token probe and then hits a rate limit on API call 5 of a 20-call subagent run fails with no fallback — the parent receives an error result with zero models retried. The "dies mid-flight" case — which is the most likely scenario for long-running subagents processing large codebases — gets no coverage. AC2's guarantee ("fails only after all fallbacks are exhausted") is not met for this failure class. The PR comments explicitly narrowed scope to avoid AgentSession/CLI changes, which is pragmatic, but leaves a gap against the issue's acceptance criteria as written.

finding 2 — Parent model final fallback bypasses availability probe
Severity: medium. Confidence: 87.
File: packages/coding-agent/src/core/tools/subagent.ts (lines 710-722 in resolveModelForSubagentSpawn).
Every model in the agent definition's fallback list is probed before use, but the parent model — the last-resort fallback — skips the probe entirely. resolveModelStringSingle only checks registry existence and auth credential records, not actual API availability. In a provider-wide outage (e.g., all configured models and the parent share the same provider), all probes fail correctly, then the parent model passes static resolution and returns ok: true with a reassuring "Falling back to parent model" log. The subagent spawns and immediately fails at runtime. The probe system gives false assurance before a spawn-then-fail. Either probe the parent model too, or add an explicit code comment acknowledging the deliberate omission.

finding 3 — executeSingle fallback integration untested for model-less agents
Severity: medium. Confidence: 95.
File: packages/coding-agent/test/subagent-model-fallback.test.ts.
All executeSingle tests use agents with a model fallback list. The path where modelSpec is falsy (agent config has no model field, no modelOverride set) silently skips the entire probe block and calls spawnSubagent directly. This is a real-world path — agents without explicit model fields are valid. A regression in the if (modelSpec) guard would break all unspecified-model agents with no test to catch it.

Suggestions

finding 4 — isRuntimeUnavailableError has unreachable branches and misleading export
Severity: low. Confidence: 85.
File: packages/coding-agent/src/core/tools/subagent.ts (lines 568-581).
The instanceof Error and typeof === "string" branches are dead at the sole call site — complete() returns AssistantMessage (always an object), and thrown values go to the catch block, never reaching isRuntimeUnavailableError. The function is exported with an unknown parameter type, signaling broader applicability than exists. Tests even assert isRuntimeUnavailableError("HTTP 500") === true, covering a path production code never takes. Either unexport it and narrow to AssistantMessage, or document the defensive breadth.

finding 5 — Duplicated bullet-list construction across formatting helpers
Severity: low. Confidence: 92.
File: packages/coding-agent/src/core/tools/subagent.ts (lines 737, 753).
formatModelFallbackSummary and formatSkippedModelFailureDetails both contain the identical template .map((s) => \- ${s.model}: ${s.reason}`).join("\n"). Extract a shared skippedBulletList()helper so both output paths stay in sync if the format changes. Additionally,formatSkippedModelFailureDetails` is a two-liner with exactly one call site — consider inlining.

finding 6 — Minor test coverage gaps in helper functions
Severity: low. Confidence: 85.
File: packages/coding-agent/test/subagent-model-fallback.test.ts.
Several small gaps: (a) isRuntimeUnavailableError lacks a direct assertion for stopReason: "aborted" (covered indirectly via probe test but not in the unit test block); (b) formatModelFallbackSummary(skipped, undefined) → "unknown" fallback never tested; (c) compactErrorReason 180-char truncation and empty-string → "unknown error" fallback never exercised. Each is individually minor but they add up to untested defensive branches.

Strengths

Scope stays tightly focused on subagent.ts and tests — no AgentSession, CLI, SDK, UI, or Telegram changes leaked in.
The spawn-time one-token probe is a clean design that avoids fragile stderr parsing and child process respawn overhead.
Single-model configs, per-invocation overrides, and registry-less sessions correctly preserve existing behavior.
Abort signal threading through makeProbeSignal → probeModelAvailability → resolveModelForSubagentSpawn → executeSingle is correct at every step.
Hard timeout via Promise.race protects against providers that ignore AbortSignal.
Skipped models are logged loudly to stderr AND prepended to result output so the parent agent sees what happened.
makeProbeSignal cleanup (clearTimeout + removeEventListener) is consistently called via finally.

Agents run: code-reviewer, error-auditor, test-reviewer, completeness-checker, simplifier

Reviewed by mach6

m-aebrer · 2026-05-15T13:34:09Z

Review Assessment

Review comment: #209 (comment)

Classifications

Finding	Classification	Reasoning
finding 1 — Probe-at-spawn only partially addresses "runtime rate limits trigger fallback"	false positive	Intentional design: mid-task model switching could degrade performance by falling back to a weaker model partway through. The probe catches the common case (model already unavailable at spawn time). Mid-task failures correctly return early with a clear error rather than silently degrading. The acceptance criteria are satisfied — the probe IS the runtime fallback mechanism; it tests actual API availability, not just static config.
finding 2 — Parent model final fallback bypasses availability probe	false positive	The parent model is the model currently powering the parent session. If it were unavailable, the parent wouldn't be running to spawn subagents. Probing it adds latency for near-zero benefit — active usage by the parent proves availability. The probe itself would consume a request, potentially worsening the rate-limit scenario it's meant to detect.
finding 3 — `executeSingle` untested for model-less agents	nitpick	The model-less path (`if (modelSpec)` is falsy) simply skips the entire resolution block and proceeds to `spawnSubagent` with no `--model` flag — identical to pre-PR behavior. It's a trivial pass-through with no branching complexity introduced by this PR.
finding 4 — `isRuntimeUnavailableError` has unreachable branches	nitpick	Defensive programming for a utility that accepts `unknown` and may gain additional call sites. The branches are directly tested. Standard practice, not dead code.
finding 5 — Duplicated bullet-list construction	nitpick	A single-expression template in two places. The functions serve different purposes (user-visible summary vs. error detail) and could diverge in the future. Extracting a helper for a one-liner is over-engineering.
finding 6 — Minor test coverage gaps in helpers	nitpick	(a) `stopReason: "aborted"` is tested indirectly via the probe aborted-message test. (b) `formatModelFallbackSummary(skipped, undefined)` → "unknown" is trivial string interpolation. (c) `compactErrorReason` is an unexported internal helper tested through integration paths. None represent untested risk.

Action Plan

No action needed before merge — all findings are nitpicks or false positives.

Assessment by mach6

Also fix flaky web-search-queue timing test by using monotonic performance.now() instead of Date.now() which can go backwards during NTP clock corrections.

chore: open PR for issue 198

bf0f2e0

Add subagent spawn-time model probes

26e97fe

aebrer marked this pull request as ready for review May 14, 2026 21:54

Fix subagent probe fallback edge cases

b8ec34d

fix subagent fallback test coverage

8dd5ae3

m-aebrer added 2 commits May 15, 2026 09:40

chore: bump version to 2.19.0

259e930

docs: document spawn-time model availability probes

d6d9ccd

Also fix flaky web-search-queue timing test by using monotonic performance.now() instead of Date.now() which can go backwards during NTP clock corrections.

m-aebrer merged commit b0c1868 into master May 15, 2026
2 checks passed

m-aebrer deleted the feature/issue-198-subagent-runtime-model-fallback branch May 15, 2026 13:50

Conversation

aebrer commented May 14, 2026

Uh oh!

aebrer commented May 14, 2026

Implementation Plan

Problem Analysis

Approach: Child-Side Fallback

Deliverables

Files to Create or Modify

Detailed Design

1. Passing the fallback list to the child

2. AgentSession fallback consumption

3. Model switching on retry exhaustion

4. Surfacing the resolved model

5. Logging design principle

Acceptance Criteria Mapping

Testing Approach

Risks and Open Questions

Uh oh!

aebrer commented May 14, 2026

Dinesh vs Gilfoyle Review — Plan for issue 198

3 rounds of mass destruction

Best of the Banter

Verdict

Critical (Gilfoyle won, Dinesh conceded)

Important (Gilfoyle won after debate)

Contested (Dinesh held his ground)

Dismissed (Gilfoyle was nitpicking)

Strengths (even Gilfoyle grudgingly acknowledged)

Score

Amended Plan — Recommended Changes

Uh oh!

aebrer commented May 14, 2026

Scope Clarification

What this feature is NOT

What this feature IS

Files to modify

Files to NOT modify

Uh oh!

aebrer commented May 14, 2026

Additional Design Note

Uh oh!

aebrer commented May 14, 2026

Amended Implementation Plan — Issue 198

Problem Analysis

Scope

Approach: Spawn-Time Probe with Fallback Loop

Deliverables

Files to Create or Modify

Files to NOT Modify

Detailed Design

1. probeModelAvailability helper

2. Error classification

3. executeSingle fallback loop

4. Surfacing the resolved model

5. Fallback visibility

Acceptance Criteria Mapping

Testing Approach

Risks and Open Questions

Uh oh!

aebrer commented May 14, 2026

Progress Update

Uh oh!

aebrer commented May 14, 2026

Code Review

Critical

Important

Suggestions

Strengths

Uh oh!

aebrer commented May 14, 2026

Review Assessment

Classifications

Action Plan

Uh oh!

aebrer commented May 14, 2026

Progress Update

Uh oh!

aebrer commented May 14, 2026

Code Review

1. `probeModelAvailability` helper

3. `executeSingle` fallback loop