feat(agents): show per-turn duration and prune dead turns within ~25s of host crash #1017

Merged
wpfleger96 merged 6 commits into main from duncan/turn-duration-counter
Jun 12, 2026
@wpfleger96

Collaborator

Extends the Agents panel "working" indicator to show how long each agent has been running in its current turn, with a live-ticking counter per channel.

Producer — harness liveness emission (crates/buzz-acp/)

The desktop previously kept a turn's "working" badge alive for 90s after the last stream event, so a hard-killed agent host (kill -9, crash — no graceful unwind) left a stale badge for up to 90s. The harness now emits a turn_liveness observer event on a fixed interval while a turn is in-flight, carrying the live turn_id. Because it is emitted harness-side and independent of agent output, it stops the instant the host process dies — which is exactly the signal the desktop needs to prune a dead turn quickly.

  • run_turn_liveness is a non-resolving future raced against the prompt as a select! arm on both the cancellable and non-cancellable paths, so it lives and dies with the prompt future and tears down on every exit path (complete / cancel / error / panic) — no separate cleanup to drift out of sync.
  • Emits via a captured ObserverHandle, not the borrowed acp client.
  • BUZZ_ACP_TURN_LIVENESS_SECS controls the interval (default 10; 0 disables; floor of 5s mirrors the existing heartbeat-interval validation).

Consumer — desktop backstop (desktop/src/features/agents/activeAgentTurnsStore.ts)

  • turn_liveness routes through the same activity path as acp_read / acp_write, refreshing lastActivityAt.
  • The prune bound drops from a bare 90_000 to a derived LIVENESS_INTERVAL_MS * 2.5 (~25s), tolerating one fully dropped ping plus slack. Only a hard-killed host reaches this bound; graceful exits clear via turn_completed and working turns refresh on stream events.
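
A minimal sketch of the consumer-side backstop described above. It is not the code in activeAgentTurnsStore.ts: the store shape, `startTurn`, `pruneStaleTurns`, and the `>=` comparison at the bound are assumptions made for illustration. Only `LIVENESS_INTERVAL_MS`, the 2.5 multiplier, `recordActivity`, and `lastActivityAt` are named in this PR.

```typescript
// Assumed to mirror the harness default of 10s (BUZZ_ACP_TURN_LIVENESS_SECS).
const LIVENESS_INTERVAL_MS = 10_000;
// Derived bound: tolerates one fully dropped ping plus slack (2.5 x 10s = 25s).
const PRUNE_AFTER_MS = LIVENESS_INTERVAL_MS * 2.5;

interface ActiveTurn {
  turnId: string;
  lastActivityAt: number; // desktop clock, in ms
}

// Hypothetical in-memory store; the real store's shape is not shown in the PR.
const activeTurns = new Map<string, ActiveTurn>();

function startTurn(turnId: string, now: number): void {
  activeTurns.set(turnId, { turnId, lastActivityAt: now });
}

// Shared activity path for acp_read, acp_write, and turn_liveness events.
function recordActivity(turnId: string | null | undefined, now: number): void {
  if (!turnId) return; // a ping without a turn_id refreshes nothing
  const turn = activeTurns.get(turnId);
  if (turn) turn.lastActivityAt = now;
}

// Drops every turn that has been silent for at least the bound (assumed
// inclusive) and returns the ids that were pruned.
function pruneStaleTurns(now: number): string[] {
  const pruned: string[] = [];
  activeTurns.forEach((turn, turnId) => {
    if (now - turn.lastActivityAt >= PRUNE_AFTER_MS) pruned.push(turnId);
  });
  pruned.forEach((turnId) => activeTurns.delete(turnId));
  return pruned;
}
```

With pings arriving every 10s, a live turn never gets close to the 25s bound. Once the host dies, the pings stop and the turn is pruned on the first sweep that runs 25s after the last one.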

Naming

The new concept is turn_liveness, distinct from the existing agent self-prompt heartbeat (PromptSource::Heartbeat, BUZZ_ACP_HEARTBEAT_INTERVAL) and the ACP-stream-layer keepalive — both already taken for unrelated mechanisms.

npub1mn7jgtj4w2pd0g0zeuhxsa6jy6p0rewxz4kujt98my82ahfmp72sxjexk7 and others added 6 commits June 12, 2026 16:40
The Working badges showed only which channels an agent was active in,
discarding the per-turn start time the store already tracked. Each badge
now reads "Working in #channel · 1m 23s" and ticks once a second.

Elapsed is anchored to a desktop-clock observedAt captured when a turn
enters the store, not the observer event's startedAt — the latter is the
agent host's clock, which skews against the desktop's Date.now() for
remote agents and would make the counter start negative or jump. The 1s
tick lives in a leaf WorkingBadge so only visible badges re-render each
second; idle rows never mount the ticking hook.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
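
To illustrate the clock-skew reasoning in the commit above, here is a rough sketch of anchoring elapsed time to the desktop clock. `observedAt` and `startedAt` come from the commit message. The interfaces, `trackTurn`, and `elapsedSince` are hypothetical names, and clamping at zero is an extra safety assumption rather than something the PR states.

```typescript
interface TurnStartedEvent {
  turnId: string;
  startedAt: number; // agent host clock, ms since epoch (may skew vs. desktop)
}

interface TrackedTurn {
  turnId: string;
  startedAt: number; // kept for reference only, never used for display
  observedAt: number; // desktop clock at the moment the turn entered the store
}

// Capture the desktop clock once, when the turn is first seen.
function trackTurn(event: TurnStartedEvent, desktopNow: number): TrackedTurn {
  return {
    turnId: event.turnId,
    startedAt: event.startedAt,
    observedAt: desktopNow,
  };
}

// Both operands come from the same clock, so host skew cannot leak in.
function elapsedSince(turn: TrackedTurn, desktopNow: number): number {
  return Math.max(0, desktopNow - turn.observedAt);
}
```

If the host clock runs 5s ahead of the desktop, `desktopNow - startedAt` would start at -5000 ms, whereas `elapsedSince` starts at 0 and counts up from there.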
The same-channel collapse must read the live turn map on every mutation,
not a stale cached minimum. Ending the earliest-observed turn must advance
the surfaced observedAt to the surviving turn — pinned with a deterministic
test (busy-wait guarantees distinct observedAt across the clock-millisecond
boundary). Adds a WHY comment at the observedAt capture site so a future
reader does not revert it to startedAt and reintroduce cross-machine skew.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
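
The same-channel collapse in the commit above could look roughly like this. This is a sketch under assumptions: `liveTurns`, `beginTurn`, `endTurn`, and `earliestObservedAt` are illustrative names, not identifiers from the store. The point is that the surfaced `observedAt` is recomputed from the live map on each read, so no cached minimum can go stale.

```typescript
interface ChannelTurn {
  turnId: string;
  channelId: string;
  observedAt: number; // desktop clock, in ms
}

// Hypothetical live turn map keyed by turn id.
const liveTurns = new Map<string, ChannelTurn>();

function beginTurn(turn: ChannelTurn): void {
  liveTurns.set(turn.turnId, turn);
}

function endTurn(turnId: string): void {
  liveTurns.delete(turnId);
}

// Recomputed from the live map every time; nothing is cached between calls.
function earliestObservedAt(channelId: string): number | undefined {
  let earliest: number | undefined;
  liveTurns.forEach((turn) => {
    if (turn.channelId !== channelId) return;
    if (earliest === undefined || turn.observedAt < earliest) {
      earliest = turn.observedAt;
    }
  });
  return earliest;
}
```

Ending the earliest-observed turn in a channel makes the next read return the surviving turn's `observedAt`, which is the behavior the commit's test pins.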
The desktop prunes a turn after 90s of no observer activity, which only matters when an agent host dies without unwinding (kill -9 / crash) so the turn_completed Drop guard never fires. With no liveness signal the badge can linger 90s. The harness now emits a turn_liveness observer event per in-flight turn on a fixed interval (BUZZ_ACP_TURN_LIVENESS_SECS, default 10, 0=off) so the desktop can prune dead turns far sooner.

Named turn_liveness, not heartbeat — heartbeat already denotes agent self-prompting in this crate. The emission rides every prompt-await path as a non-resolving select! arm, so it lives and dies with the prompt future and stops on every exit path with no separate teardown. It uses a captured ObserverHandle rather than agent.acp because the prompt holds &mut agent.acp for its whole duration.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
The desktop pruned a turn after 90s of silence, which only bites when an agent host dies without unwinding so the turn_completed Drop guard never fires. The harness now emits turn_liveness pings (Phase 1); routing them through the same recordActivity path as acp_read/acp_write lets the bound drop to 25s (2.5x the 10s liveness interval) without falsely pruning live-but-quiet turns. The bound is derived from LIVENESS_INTERVAL_MS so it tracks the interval if it changes.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
Phase 2 dispatch named three new tests; the keep-alive and prune-at-bound cases shipped but the null-turnId no-op was missing. recordActivity early-returns on a falsy turnId, so a liveness ping without a turn_id must refresh nothing and must not throw — proven via the turn still pruning at the bound after the null ping.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
@wpfleger96

Collaborator Author

Single agent working — live counter

Paul mid-turn in #general: the Working badge alongside the per-turn elapsed counter ticking up (1m 23s).

01-single-agent-working

Multiple channels — independent counters

Paul working in two channels at once — #engineering · 2m 6s and #general · 2m 45s — each row tracking its own turn independently.

02-multiple-channels

Longer-running turn — h/m/s rollover

A turn well past the hour mark (1h 7m 42s), demonstrating formatElapsed's hour/minute/second rollover.

03-longer-running
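
For reference, the rollover shown in the screenshots can be reproduced with a formatter along these lines. This is a hypothetical reimplementation, not the PR's `formatElapsed`: the output shapes `1m 23s` and `1h 7m 42s` are taken from the captions above, while the sub-minute form (`42s`) and the clamping of negative input to zero are assumptions.

```typescript
// Formats a millisecond duration as "Xh Ym Zs", "Ym Zs", or "Zs".
function formatElapsed(elapsedMs: number): string {
  const totalSeconds = Math.max(0, Math.floor(elapsedMs / 1000));
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;

  if (hours > 0) return `${hours}h ${minutes}m ${seconds}s`;
  if (minutes > 0) return `${minutes}m ${seconds}s`;
  return `${seconds}s`; // assumed shape for turns under one minute
}
```

For example, `formatElapsed(83_000)` yields `1m 23s` and `formatElapsed(4_062_000)` yields `1h 7m 42s`, matching the first and third screenshots.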

wpfleger96 pushed a commit that referenced this pull request Jun 12, 2026
@wpfleger96 wpfleger96 merged commit 87e45c6 into main Jun 12, 2026
27 checks passed
@wpfleger96 wpfleger96 deleted the duncan/turn-duration-counter branch June 12, 2026 22:01
tlongwell-block pushed a commit that referenced this pull request Jun 13, 2026
* origin/main: (33 commits)
  fix(desktop): make Windows release compile cleanly (#1029)
  Add production Docker Compose bundle (#985)
  feat(profile): show active turn badges on agent profile panel and popover (#1026)
  chore(release): release version 0.3.20 (#1027)
  fix(release): resolve Windows sidecar path and Linux AppImage updater format (#1024)
  chore(release): release version 0.3.19 (#1014)
  fix(release): ignore prerelease tags in changelog generation (#1021)
  fix: repair main build after cross-PR merge skew (#1020)
  feat(agents): show per-turn duration and prune dead turns within ~25s of host crash (#1017)
  fix(release): replace hermit with native tool setup on Windows job (#1018)
  feat(acp): surface error-class outcomes to the activity feed only, never the channel (#1010)
  fix(desktop): migrate Sprout workspace storage (#1016)
  feat(auth): force token refresh on rejected token (401/403), never the browser (#1015)
  fix(release): mark prerelease versions so they do not become latest (#1013)
  feat(acp): implement systemPrompt with protocol version gating (#981)
  fix(release): update repository name check from block/sprout to block/buzz (#1012)
  feat(release): all-OS desktop builds + universal auto-update manifest (#1011)
  Add relay disconnect UX: friendly errors, reconnect, cached identity (#1004)
  feat(agents): add active turn indicators to Agents Menu (#1005)
  ci: add fork guards to docker, release, and auto-tag workflows (#1007)
  ...

Co-authored-by: npub1t2tgm7d8f995uqvmnm8h88sg3wnpp9a5xysjf6dg3tjmgt3ltulqdp8ehr <5a968df9a7494b4e019b9ecf739e088ba61097b4312124e9a88ae5b42e3f5f3e@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: npub1t2tgm7d8f995uqvmnm8h88sg3wnpp9a5xysjf6dg3tjmgt3ltulqdp8ehr <5a968df9a7494b4e019b9ecf739e088ba61097b4312124e9a88ae5b42e3f5f3e@sprout-oss.stage.blox.sqprod.co>
tellaho pushed a commit that referenced this pull request Jun 14, 2026
…tate

* origin/main: (21 commits)
  fix(release): use signed NSIS installer for updates (#1036)
  handoff: pass full session history to summarizer (#1033)
  feat(emoji): latest-set-wins union for custom emoji across desktop, mobile, and CLI (#989)
  Fix relay NIP-11 software URL (#1030)
  fix(desktop): make Windows release compile cleanly (#1029)
  Add production Docker Compose bundle (#985)
  feat(profile): show active turn badges on agent profile panel and popover (#1026)
  chore(release): release version 0.3.20 (#1027)
  fix(release): resolve Windows sidecar path and Linux AppImage updater format (#1024)
  chore(release): release version 0.3.19 (#1014)
  fix(release): ignore prerelease tags in changelog generation (#1021)
  fix: repair main build after cross-PR merge skew (#1020)
  feat(agents): show per-turn duration and prune dead turns within ~25s of host crash (#1017)
  fix(release): replace hermit with native tool setup on Windows job (#1018)
  feat(acp): surface error-class outcomes to the activity feed only, never the channel (#1010)
  fix(desktop): migrate Sprout workspace storage (#1016)
  feat(auth): force token refresh on rejected token (401/403), never the browser (#1015)
  fix(release): mark prerelease versions so they do not become latest (#1013)
  feat(acp): implement systemPrompt with protocol version gating (#981)
  fix(release): update repository name check from block/sprout to block/buzz (#1012)
  ...

Co-authored-by: Taylor Ho <taylorkmho@gmail.com>
Signed-off-by: Taylor Ho <taylorkmho@gmail.com>

# Conflicts:
#	desktop/src/features/profile/ui/UserProfilePanel.tsx