Checkpoint Continuation Issue#659
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
📝 WalkthroughWalkthroughAdds a Codex turn stall watchdog in ChangesCodex Stall Handling and Goal Propagation
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Preview deployment for your docs. Learn more about Mintlify Previews.
💡 Tip: Enable Workflows to automatically generate PRs for you. |
|
@copilot review but do not make fixes |
|
@codex review |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
apps/desktop/src/main/services/chat/agentChatService.ts (1)
16683-16756: 🎯 Functional Correctness | 🔵 Trivial | 💤 Low value
emitCodexReconciledItemnon-text branch always returnstrue.For
commandExecution/fileChange/toolCall/etc., the fallthrough returnstrueunconditionally after delegating tohandleCodexItemEvent, even if nothing new was emitted. Reconciled item events are also not registered inrecentNotificationKeys, so a later genuineitem/completedfor the sameitemIdre-runshandleCodexItemEventand re-emits the event. The renderer collapses byitemId, so impact is limited, but therecoveredUsefulItemsignal can over-report. Low priority given the dedup downstream; consider returningtrueonly for item types that meaningfully changed state if this proves noisy.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/desktop/src/main/services/chat/agentChatService.ts` around lines 16683 - 16756, The non-text fallthrough in emitCodexReconciledItem currently reports success unconditionally after calling handleCodexItemEvent, even when no meaningful event was emitted. Update the fallthrough logic for commandExecution/fileChange/toolCall-like items so it only returns true when a real state change or emission occurred, and otherwise returns false; keep the itemId normalization and the handleCodexItemEvent delegation intact while aligning the return value with actual recovery signal semantics.apps/desktop/src/main/services/chat/agentChatService.test.ts (3)
14049-14080: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winAssert that the structured stall event is cleared too.
This only proves the watchdog notice is suppressed. A regression that still emits
codex_turn_stalledafteritem/startedwould still pass here.Suggested change
expect(events.some((event) => event.event.type === "system_notice" && event.event.message.includes("has not streamed model or tool output yet") )).toBe(false); + expect(events.some((event) => event.event.type === "codex_turn_stalled")).toBe(false);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/desktop/src/main/services/chat/agentChatService.test.ts` around lines 14049 - 14080, The Codex no-output watchdog test only checks that the system_notice is suppressed, so it can miss a regression where codex_turn_stalled is still emitted after item/started. Update the test in agentChatService.test.ts around the createService/sendMessage and mockState.emitCodexPayload flow to also assert that no codex_turn_stalled event appears in the collected events after the useful turn event arrives.
13863-13913: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winAdd the deduped-reconciliation regression case.
This covers recovery when reconciliation emits fresh assistant text, but not the reported branch where the recovered turn is already deduped. That path still needs to emit
codex_turn_stalledbecause no new output reached the client.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/desktop/src/main/services/chat/agentChatService.test.ts` around lines 13863 - 13913, Add a regression test in agentChatService.test around the existing reconciles-a-completed-silent-Codex-turn case to cover the deduped-reconciliation branch. In the relevant test flow for createService, sendMessage, and the stalled-turn timeout, make the reconciled app-server turn match already-seen assistant output so reconciliation produces no new client-visible text, then assert that codex_turn_stalled is emitted instead of treating the turn as recovered. Keep the existing thread/read and thread/turns/list setup, but adjust the mocked turn content/recorded events so the deduplication path is exercised and verified.
1157-1172: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick winKeep the mock's goal normalization aligned with
sessionService.create().The real session service trims create-time goals and collapses blank strings to
null, but this mock storesargs.goalverbatim. That lets these tests pass against backing rows production never persists.Suggested change
- goal: args.goal ?? null, + goal: typeof args.goal === "string" && args.goal.trim().length > 0 + ? args.goal.trim() + : null,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/desktop/src/main/services/chat/agentChatService.test.ts` around lines 1157 - 1172, The `create` mock in `agentChatService.test.ts` should normalize `args.goal` the same way `sessionService.create()` does, instead of storing it verbatim. Update the mock session object so blank or whitespace-only goals become `null` and non-blank goals are trimmed before being assigned to `goal`, keeping test behavior aligned with the real service.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/desktop/src/main/services/chat/agentChatService.ts`:
- Around line 9759-9795: The approval card path is not clearing the Codex
no-first-event watchdog, so `reconcileCodexSilentTurn` can still report a stall
while `emitPendingInputRequest` is legitimately waiting for user approval.
Update `emitPendingInputRequest` and/or the Codex server-request handling path
to clear `runtime.noFirstEventWatchdog` when an `approval_request` is surfaced,
using the same `clearCodexNoFirstEventWatchdog` flow that
`markCodexTurnProgress` relies on. Keep the stall state aligned with the visible
approval UI by ensuring Codex approval/input requests are treated as meaningful
progress.
In `@apps/desktop/src/main/services/sessions/sessionService.test.ts`:
- Around line 71-89: The test setup in sessionService.test.ts delays registering
the DB cleanup until after several operations, so failures before that point can
leak the sqlite handle. Move the db.close disposer registration to immediately
after openKvDb in the test body, or wrap the test logic in a try/finally so the
cleanup is always queued; use the openKvDb, createSessionService, and
activeDisposers references to update the setup safely.
---
Nitpick comments:
In `@apps/desktop/src/main/services/chat/agentChatService.test.ts`:
- Around line 14049-14080: The Codex no-output watchdog test only checks that
the system_notice is suppressed, so it can miss a regression where
codex_turn_stalled is still emitted after item/started. Update the test in
agentChatService.test.ts around the createService/sendMessage and
mockState.emitCodexPayload flow to also assert that no codex_turn_stalled event
appears in the collected events after the useful turn event arrives.
- Around line 13863-13913: Add a regression test in agentChatService.test around
the existing reconciles-a-completed-silent-Codex-turn case to cover the
deduped-reconciliation branch. In the relevant test flow for createService,
sendMessage, and the stalled-turn timeout, make the reconciled app-server turn
match already-seen assistant output so reconciliation produces no new
client-visible text, then assert that codex_turn_stalled is emitted instead of
treating the turn as recovered. Keep the existing thread/read and
thread/turns/list setup, but adjust the mocked turn content/recorded events so
the deduplication path is exercised and verified.
- Around line 1157-1172: The `create` mock in `agentChatService.test.ts` should
normalize `args.goal` the same way `sessionService.create()` does, instead of
storing it verbatim. Update the mock session object so blank or whitespace-only
goals become `null` and non-blank goals are trimmed before being assigned to
`goal`, keeping test behavior aligned with the real service.
In `@apps/desktop/src/main/services/chat/agentChatService.ts`:
- Around line 16683-16756: The non-text fallthrough in emitCodexReconciledItem
currently reports success unconditionally after calling handleCodexItemEvent,
even when no meaningful event was emitted. Update the fallthrough logic for
commandExecution/fileChange/toolCall-like items so it only returns true when a
real state change or emission occurred, and otherwise returns false; keep the
itemId normalization and the handleCodexItemEvent delegation intact while
aligning the return value with actual recovery signal semantics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 774dc892-cf08-40af-8531-6b6a7a19ff75
⛔ Files ignored due to path filters (1)
docs/features/chat/README.mdis excluded by!docs/**
📒 Files selected for processing (9)
apps/ade-cli/src/tuiClient/aggregate.tsapps/desktop/src/main/services/ai/tools/systemPrompt.test.tsapps/desktop/src/main/services/ai/tools/systemPrompt.tsapps/desktop/src/main/services/chat/agentChatService.test.tsapps/desktop/src/main/services/chat/agentChatService.tsapps/desktop/src/main/services/sessions/sessionService.test.tsapps/desktop/src/main/services/sessions/sessionService.tsapps/desktop/src/renderer/components/chat/chatTranscriptRows.tsapps/desktop/src/shared/types/chat.ts
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 548a0e825d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| itemId, | ||
| turnId, | ||
| }); | ||
| if (String(item.status ?? "completed") !== "inProgress") { |
There was a problem hiding this comment.
Normalize MCP in-progress statuses
When reconciliation backfills an mcpToolCall whose app-server status is in_progress or running (spellings this function already treats as in-progress for other item types below), this exact inProgress check still emits a tool_result with completed status and an empty/partial result. In that case ADE shows the MCP call as finished even though the app-server turn is still blocked on it, so normalize the status before deciding to emit the result.
Useful? React with 👍 / 👎.
| itemCount: items.length, | ||
| }); | ||
| persistChatState(managed); | ||
| return; |
There was a problem hiding this comment.
Re-arm the watchdog after partial recovery
When the watchdog fires and thread/turns/list returns the active turn with at least one useful item but a non-terminal status, this branch persists the backfilled item and returns after the timer has already been cleared. If the notification stream is still broken and the turn never completes, ADE will not probe again or emit codex_turn_stalled, leaving the session active indefinitely after a single recovered item; re-arm the watchdog or keep reconciling until the turn becomes terminal or visibly stalled.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0771c94e9e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| items.forEach((item, index) => { | ||
| recoveredUsefulItem = emitCodexReconciledItem(managed, runtime, item, index, turnId) || recoveredUsefulItem; |
There was a problem hiding this comment.
Deduplicate recovered Codex items before replaying
When a partial recovery re-arms the watchdog, later thread/turns/list calls will usually still include the same historical turn items, but this loop unconditionally emits every useful item again. For a long-running command/MCP call, or a final turn whose item list still contains already-recovered reasoning/text, ADE will append duplicate transcript/tool rows on each watchdog cycle or on completion. Track the recovered item ids/statuses and emit only new or changed items before re-arming.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2795f9fce9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| emitCodexTurnStalled(managed, runtime, { | ||
| turnId, | ||
| reason: threadId && !stateProbeFailed ? "no_output" : "app_server_state_unknown", | ||
| message: "Codex accepted this turn but has not streamed model or tool output yet. You can keep waiting, send a status nudge, or interrupt and retry this thread if it stays stalled.", | ||
| }); |
There was a problem hiding this comment.
Re-check turn state before emitting stall
If the watchdog fires and the async thread/read / thread/turns/list probes are still in flight, a normal turn/completed or turn/aborted can clear activeTurnId before this fall-through path runs. Unlike the currentTurn branch above, this path does not call isCodexSilentTurnStillCurrent after the awaits, so an empty or stale turns-list response can emit a codex_turn_stalled warning after the turn has already finished, leaving users and orchestration parents with a false recovery prompt.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/desktop/src/main/services/chat/agentChatService.ts (1)
17068-17090: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winRe-check turn currency after the async probes before emitting a stall.
reconcileCodexSilentTurnonly guardsisCodexSilentTurnStillCurrentat entry and inside theif (currentTurn)block. The threeemitCodexTurnStalledfall-through paths run after the awaitedthread/readandthread/turns/listprobes, during which a normalturn/completednotification can be processed by the reader — clearingactiveTurnIdand marking the turn terminal. SinceemitCodexTurnStalledonly dedupes viastalledTurnIds(not terminal status), a finished turn would still surface a spurious stall notice (and parent-session notice).Add a final post-await guard before the stall paths.
🛡️ Proposed guard before the stall emit paths
} } + + // A normal turn/completed can race in while thread/read + thread/turns/list + // are awaited; re-check before emitting so we don't surface a spurious + // stall (and parent notice) on an already-finished turn. + if (!isCodexSilentTurnStillCurrent(managed, runtime, turnId)) { + return; + } if (activeFlags.includes("waitingOnApproval")) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/desktop/src/main/services/chat/agentChatService.ts` around lines 17068 - 17090, The fall-through stall notifications in reconcileCodexSilentTurn can fire after async probes even if the turn has already completed, so add a final current-turn/terminal-state guard immediately before the emitCodexTurnStalled paths. Re-check isCodexSilentTurnStillCurrent (and/or that activeTurnId is still present and non-terminal) after the awaited thread/read and thread/turns/list calls, then return without emitting if the turn is no longer current. Keep the existing entry and currentTurn checks, and apply the new guard right before the waitingOnApproval, waitingOnUserInput, and no_output/app_server_state_unknown branches.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/desktop/src/main/services/chat/agentChatService.test.ts`:
- Around line 13881-13890: The ADE-lane approval fixture in
agentChatService.test.ts is using a hard-coded outside path for the
command-execution cwd, which bypasses lane-relative behavior. Update the mock in
mockState.emitCodexPayload for the requestApproval case to use a lane
worktree-relative cwd (or the mocked lane worktree) so the test exercises lane
enforcement through the same path resolution used in ADE sessions.
---
Outside diff comments:
In `@apps/desktop/src/main/services/chat/agentChatService.ts`:
- Around line 17068-17090: The fall-through stall notifications in
reconcileCodexSilentTurn can fire after async probes even if the turn has
already completed, so add a final current-turn/terminal-state guard immediately
before the emitCodexTurnStalled paths. Re-check isCodexSilentTurnStillCurrent
(and/or that activeTurnId is still present and non-terminal) after the awaited
thread/read and thread/turns/list calls, then return without emitting if the
turn is no longer current. Keep the existing entry and currentTurn checks, and
apply the new guard right before the waitingOnApproval, waitingOnUserInput, and
no_output/app_server_state_unknown branches.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b341bb21-a7a4-4d26-8cb3-4fa469b009d1
📒 Files selected for processing (3)
apps/desktop/src/main/services/chat/agentChatService.test.tsapps/desktop/src/main/services/chat/agentChatService.tsapps/desktop/src/main/services/sessions/sessionService.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/desktop/src/main/services/sessions/sessionService.test.ts
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3f3547f8fa
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const eventKind = isCodexReconciledItemInProgress(item.status) | ||
| ? "started" | ||
| : "completed"; | ||
| handleCodexItemEvent(managed, runtime, { ...item, id: itemId }, eventKind, turnId); |
There was a problem hiding this comment.
Normalize recovered file diffs before emitting
When the silent-turn reconciler backfills a fileChange from thread/turns/list, that API shape uses changes[].unifiedDiff (the existing thread-item converter reads that field), but this new path passes the raw item into handleCodexItemEvent, which only reads changes[].diff. Recovered file-change rows therefore persist with empty diffs even though the app-server had the patch, losing the proof/artifact for a completed silent turn; map unifiedDiff to diff or reuse the existing converter before emitting.
Useful? React with 👍 / 👎.
| text: stringOrNull(item.text ?? item.content ?? item.message ?? item.markdown ?? item.description ?? item.planText), | ||
| summary: Array.isArray(item.summary) ? item.summary : null, | ||
| result: item.result ?? null, | ||
| error: item.error ?? null, | ||
| arguments: item.arguments ?? null, |
There was a problem hiding this comment.
Include running item payloads in reconcile signatures
When a silent command or file-change item is recovered while still running, the watchdog is re-armed for another poll. If the next thread/turns/list response has the same status but more aggregatedOutput or updated changes, this signature is unchanged because it omits the fields that drive the emitted command/file rows, so emitCodexReconciledItem treats the item as already emitted and can surface a stall instead of backfilling the new output. Include the relevant payload fields in the signature.
Useful? React with 👍 / 👎.
Summary
Describe the change.
What Changed
Key files and behaviors.
Validation
How you tested.
Risks
Anything to watch.
Summary by CodeRabbit
New Features
Bug Fixes
Greptile Summary
This PR adds a no-first-output watchdog to Codex turns: if no model/tool event arrives within 120 s of a
turn/start, ADE probes the app-server viathread/read+thread/turns/list, back-fills any recovered items into the transcript, and either finalises the turn or emits a structuredcodex_turn_stalledevent with recovery options. It also seeds a session-level goal into the Codex thread on resume, adds MCP-startup-failure notices, and instructs the agent to continue working after status-check interruptions.codex_turn_stalledonly when the turn is genuinely stuck.createSessionnow accepts and persists agoal; the Codex thread's goal is initialised viathread/goal/setbefore the first turn on resume.codex_turn_stalledcarries structured recovery options and is forwarded to orchestration-parent sessions; it is excluded from transcript-collapsing and CLI-TUI aggregation.Confidence Score: 4/5
Safe to merge with the mcpToolCall duplicate-emission fix applied; the stall and goal-seeding paths are well-tested and the DB change is additive.
The watchdog and reconciliation machinery is thoroughly tested, but
emitCodexReconciledItemencodes itemstatusin its deduplication signature. When anmcpToolCalltransitions from "running" to "completed" between two watchdog cycles, thetool_callevent fires twice for the sameitemId. The rest of the change — goal seeding, stall routing, MCP-failure notices, and the operating-loop prompt addition — looks correct.apps/desktop/src/main/services/chat/agentChatService.ts — specifically the
emitCodexReconciledItemfunction'smcpToolCallbranch.Important Files Changed
codex_turn_stalledevents, MCP startup-failure notices, and session-goal seeding. Duplicatetool_callemission found in the mcpToolCall reconciliation path when status transitions from running→completed across watchdog cycles.thread/readandthread/turns/listmock stubs are wired correctly. No issues found.goalcolumn to theterminal_sessionsINSERT. Column count and placeholder count are balanced (22 columns, 22 values). Normalization mirrors the pattern used elsewhere. No issues found.codex_turn_stalledevent discriminant and agoalfield toAgentChatSession. Both are backwards-compatible additions. No issues found.codex_turn_stalledfrom the CLI TUI aggregator's transcript/timeline path. Correct — the structured stall event is handled via other UI surfaces. No issues found.codex_turn_stalledfrom theappendCollapsedChatTranscriptEventpass. No issues found.Sequence Diagram
%%{init: {'theme': 'neutral'}}%% sequenceDiagram participant T as Turn start participant W as Watchdog (120s) participant R as reconcileCodexSilentTurn participant A as App-server T->>W: scheduleCodexNoFirstEventWatchdog(turnId) Note over T,W: normal events clear watchdog via markCodexTurnProgress W->>R: fire (no events in 120s) R->>A: thread/read (get activeFlags) R->>A: thread/turns/list (get items) alt turn completed R->>R: finishCodexTurnFromReconciledState → emit done else partial items recovered (in-progress) R->>R: emitCodexReconciledItem (backfill) R->>W: re-arm watchdog for same turnId else no items / state unknown R->>R: emitCodexTurnStalled → system_notice R-->>Parent: forward codex_turn_stalled if orchestrated end%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% sequenceDiagram participant T as Turn start participant W as Watchdog (120s) participant R as reconcileCodexSilentTurn participant A as App-server T->>W: scheduleCodexNoFirstEventWatchdog(turnId) Note over T,W: normal events clear watchdog via markCodexTurnProgress W->>R: fire (no events in 120s) R->>A: thread/read (get activeFlags) R->>A: thread/turns/list (get items) alt turn completed R->>R: finishCodexTurnFromReconciledState → emit done else partial items recovered (in-progress) R->>R: emitCodexReconciledItem (backfill) R->>W: re-arm watchdog for same turnId else no items / state unknown R->>R: emitCodexTurnStalled → system_notice R-->>Parent: forward codex_turn_stalled if orchestrated endReviews (6): Last reviewed commit: "Guard Codex stall fallthrough" | Re-trigger Greptile