Window: last 24h, count-limited to the most recent 20 runs (the agenticworkflows logs bridge caps at 60s; both the full-24h fetch and a 60-run fetch timed out). Treat the headline success rate as a biased sample of the most recent — and failure-heavy — cluster, not a full-day rate.
Headline
| Metric | Value |
| --- | --- |
| Runs (sampled) | 20 |
| Success / Failure | 12 / 8 |
| Sampled success rate | 60% ⚠️ (small recent slice, see caveat) |
| AI credits (AIC) | 1,023.8 |
| Action minutes | 179 (2.8h wall) |
| Engines | copilot 12 · claude 6 · pi 2 |
| Missing tools / data / MCP failures | 0 / 0 / 0 |
| Agent-level errors | 0 (all fails are pre-agent) |
Every one of the 8 failures is a 0-turn / 0-token / 0-error run — the agent never executed. These are pre-agent driver/CLI-startup failures, not model or tooling problems. claude was clean (6/6 = 100%).
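To make the "small recent slice" caveat concrete, here is a minimal sketch that tallies per-engine success rates and puts a 95% Wilson score interval around the sampled 12/20. The run counts come from the headline numbers; the per-engine failure split (copilot 7, claude 0, pi 1) is inferred from the text of this report rather than read from a table, and the helper names are illustrative.

```python
from math import sqrt

# engine: (runs, failures). Run counts are from the headline numbers; the
# failure split is inferred from the report text (claude clean, one pi failure).
ENGINE_RUNS = {"copilot": (12, 7), "claude": (6, 0), "pi": (2, 1)}


def success_rate(runs, failures):
    """Fraction of runs that succeeded."""
    return (runs - failures) / runs


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (center - half, center + half)


total_runs = sum(r for r, _ in ENGINE_RUNS.values())
total_failures = sum(f for _, f in ENGINE_RUNS.values())
low, high = wilson_interval(total_runs - total_failures, total_runs)

for engine, (runs, failures) in ENGINE_RUNS.items():
    print(f"{engine}: {success_rate(runs, failures):.0%} of {runs} runs")
print(f"overall: {success_rate(total_runs, total_failures):.0%}, "
      f"95% interval {low:.0%} to {high:.0%}")
```

With 20 runs the interval spans roughly 39% to 78%. It only reflects sample size; it does not correct for the sample being drawn from the most recent, failure-heavy cluster.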
Failure breakdown — all 8 are the chronic driver-startup family
New signal: the first pi-engine 0-turn failure I've recorded (prior windows showed pi at ~100%). PR Sous Chef's pi runs are backed by copilot/gpt-5.4, so this is very likely the same copilot CLI/driver startup fault surfacing through the pi engine, not a pi-specific regression — its sibling pi run in the same window succeeded. Worth watching whether pi failures track copilot's.
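One way to watch whether pi failures track copilot's is to count how many pi 0-turn failures have a copilot 0-turn failure close in time. The sketch below assumes a hypothetical record shape (`engine`, `started`, `zero_turn`) and an arbitrary one-hour window; neither comes from the audit tooling.

```python
from datetime import datetime, timedelta


def coupled_pi_failures(runs, window=timedelta(hours=1)):
    """Count pi 0-turn failures with a copilot 0-turn failure within `window`.

    Returns (coupled, total_pi_failures).
    """
    copilot_times = [r["started"] for r in runs
                     if r["engine"] == "copilot" and r["zero_turn"]]
    pi_failures = [r for r in runs
                   if r["engine"] == "pi" and r["zero_turn"]]
    coupled = sum(
        1 for r in pi_failures
        if any(abs(r["started"] - t) <= window for t in copilot_times)
    )
    return coupled, len(pi_failures)


# Made-up records for illustration only; not taken from the audited runs.
t0 = datetime(2026, 1, 1, 12, 0)
example_runs = [
    {"engine": "copilot", "started": t0, "zero_turn": True},
    {"engine": "pi", "started": t0 + timedelta(minutes=20), "zero_turn": True},
    {"engine": "pi", "started": t0 + timedelta(hours=5), "zero_turn": True},
    {"engine": "pi", "started": t0 + timedelta(minutes=30), "zero_turn": False},
]
print(coupled_pi_failures(example_runs))
```

If most pi failures turn out to be coupled over several windows, that supports treating pi as another surface of the copilot driver fault; a run of uncoupled pi failures would point to something pi-specific.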
Why this is the same chronic family, not something new
All 8 failed runs share an identical fingerprint in run_summary.json: TokenUsage=0, Turns=0, ErrorCount=0, duration 3–8m, failure at the agent-execution step before any model turn. This matches the copilot-sdk-driver-failures family (recurrence 20+, first seen 2026-06-02) and the smoke-ci-copilot-cli-100pct-fail-on-push cluster (chronic since ~06-30). No missing-tool, missing-data, MCP, auth-token, or timeout signatures appeared in this window.
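The fingerprint check can be sketched as below. The three field names are the ones quoted from run_summary.json; the flat JSON layout and the sample values are assumptions for illustration, and the check is meant to be applied only to runs already known to have failed.

```python
import json

# Counters named in the report; all three must be present and zero.
ZERO_FIELDS = ("TokenUsage", "Turns", "ErrorCount")


def matches_zero_turn_fingerprint(summary):
    """True when the summary shows no tokens, no turns, and no logged errors."""
    return all(summary.get(field) == 0 for field in ZERO_FIELDS)


# Illustrative payloads, assuming the counters sit at the top level.
pre_agent_failure = json.loads('{"TokenUsage": 0, "Turns": 0, "ErrorCount": 0}')
normal_run = json.loads('{"TokenUsage": 48213, "Turns": 9, "ErrorCount": 0}')

print(matches_zero_turn_fingerprint(pre_agent_failure))
print(matches_zero_turn_fingerprint(normal_run))
```

A missing counter is treated as a non-match, so a summary that was never written is not silently counted as part of this family.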
Trends (28 days)
Success rate has oscillated in an 85–95% band for most of the window, with sharp single-day troughs when one workflow goes 100%-fail (06-13 Avenger/Sous Chef, 06-20 Skillet, 06-25 broad 0-turn). Today's 60% is a count-limited artifact — the 20-run sample lands squarely on the recent copilot-CLI failure cluster and is not comparable to the full-day rates plotted for prior days. The persistent structural issue across the whole window is copilot 0-turn driver-startup failures, which no config change has yet resolved.
Token totals ran 24–127M/day through mid-June, but recent days report tok=null (token artifact absent) and fall back to AIC as the cost metric; today's sampled AIC is ~1,024. The gap in the token series is a telemetry/observability issue, not a usage drop — cost is still being tracked via AI credits.
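The fallback described above, tokens when present and AIC otherwise, can be sketched as follows. The `tok` key mirrors the `tok=null` notation used here; the `date` and `aic` keys and the sample rows are hypothetical and do not reflect the actual metrics-summary.json schema.

```python
def cost_metric(day):
    """Return (metric_name, value), preferring tokens and falling back to AIC."""
    tok = day.get("tok")
    if tok is not None:
        return ("tokens", tok)
    return ("aic", day["aic"])


def token_gap_days(days):
    """Dates on which the token artifact was absent (tok is None or missing)."""
    return [d["date"] for d in days if d.get("tok") is None]


# Illustrative rows only; the values are placeholders, not audited figures.
example_days = [
    {"date": "day-1", "tok": 24_000_000, "aic": 900.0},
    {"date": "day-2", "tok": None, "aic": 1023.8},
]

for day in example_days:
    print(day["date"], cost_metric(day))
print("token gaps:", token_gap_days(example_days))
```

Keeping the metric name next to the value makes it harder to plot token counts and AI credits on the same axis by accident.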
Recurring issues (status)
copilot-sdk-driver-failures: recurrence 21, OPEN/chronic. Dominant fleet failure family, accounting for 7 of the 8 failures in this window. Now also observed via pi (copilot-backed model).
smoke-ci-copilot-cli-100pct-fail-on-push: OPEN, still failing 100% of the time on every push to main. Highest-signal, most-reproducible instance of the driver fault, and the best candidate for a root-cause fix.
No new instances of safe-output-partial-failure-intolerance, model-param-config-incompatibility, or playwright-install-cli-failure in this sample.
Recommendations
Prioritize Smoke CI as the reproduction case. It fails 100% on every push with a clean 0-turn signature — the cheapest, most deterministic handle on the copilot CLI-startup fault. Fixing it likely fixes the whole copilot-sdk-driver-failures family.
Confirm the pi↔copilot coupling. Track whether pi (copilot-backed) 0-turn failures continue to co-occur with copilot Execute-CLI failures; if so, the fix is shared and pi is just another surface.
Restore the token-usage artifact. Several recent days report tok=null; the daily token series is broken and only AIC survives. Investigate why token_usage.jsonl is empty on completed runs.
Raise audit-fetch coverage. The 60s bridge cap forces count-limited sampling; consider windowed/paginated log fetches so daily rates aren't biased by the most recent cluster.
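A windowed fetch along the lines of the last recommendation might look like the sketch below: the 24h range is split into fixed sub-windows so that each call stays small. `fetch_runs` is a hypothetical callable standing in for whatever the logs bridge exposes, and the two-hour step is an arbitrary starting point, not a measured value.

```python
from datetime import datetime, timedelta, timezone


def split_window(start, end, step):
    """Yield consecutive (lo, hi) sub-windows covering the range start..end."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def fetch_all(fetch_runs, start, end, step=timedelta(hours=2)):
    """Call fetch_runs(lo, hi) once per sub-window and concatenate the results.

    fetch_runs is a hypothetical stand-in for the real log-fetch call.
    """
    runs = []
    for lo, hi in split_window(start, end, step):
        runs.extend(fetch_runs(lo, hi))
    return runs


# Usage with a stub fetcher that just records the window it was asked for.
day_start = datetime(2026, 1, 1, tzinfo=timezone.utc)
day_end = day_start + timedelta(hours=24)
windows_seen = fetch_all(lambda lo, hi: [(lo, hi)], day_start, day_end)
print(len(windows_seen), "sub-windows fetched")
```

If a sub-window still times out, the step can be halved for that window only, which keeps coverage of the full day without going back to count-limited sampling.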
References
Repo memory updated: audit-history.jsonl, metrics-summary.json, known-issues.json (recurrence counters bumped).
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
awmgmcpg
See Network Configuration for more information.