refactor(workflow): S10 self-healing recovery events#1583
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
4fd5a3b to
b57eaa0
Compare
30ef86b to
f1badbe
Compare
b57eaa0 to
7de5076
Compare
f1badbe to
cbdbe87
Compare
7de5076 to
95230fa
Compare
cbdbe87 to
6451e39
Compare
95230fa to
c85ab12
Compare
6451e39 to
2e67ef0
Compare
2e67ef0 to
eacaa0f
Compare
d3ade66 to
e84e8f6
Compare
eacaa0f to
11687a5
Compare
11687a5 to
082995d
Compare
e84e8f6 to
19e6540
Compare
082995d to
d0d7fb7
Compare
4eb81d1 to
86bea2f
Compare
d0d7fb7 to
46ef418
Compare
86bea2f to
c2df552
Compare
46ef418 to
3d1e5a9
Compare
Fusion-Task-Id: FN-000
Address PR #1583 feedback by keeping in-flight recovery dedupe while issuing fresh work when a previous deterministic recovery event has reached a terminal state.
3d1e5a9 to
b16d22e
Compare
Addressed: recovery events now dedupe in-flight work but create a fresh run id after terminal prior work, so recurring recovery signals can be published safely. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/engine/src/__tests__/workflow-recovery-events.test.ts`:
- Around line 99-109: The assertion on the listWorkflowWorkItemsForTask call
uses a direct toEqual comparison with a fixed array order, which makes the test
fragile if row ordering changes. Replace the toEqual check with
expect.arrayContaining to match items regardless of order, and add a separate
length check to ensure exactly the expected number of items are returned. This
decouples the test from the specific ordering of workflow items.
- Around line 73-109: The test at line 73 currently validates the recurring
recovery event invariant only when the previous event reaches the "succeeded"
terminal state. Expand the test coverage to verify that
publishWorkflowRecoveryEvent can publish fresh recovery items when the previous
event transitions to any known terminal state: "succeeded", "failed",
"cancelled", and "exhausted". For each terminal state, use
store.transitionWorkflowWorkItem to move the first recovery work item to that
state, then assert that a subsequent publishWorkflowRecoveryEvent call creates a
new work item with the expected properties, ensuring the regression is fully
guarded across all terminal surface cases.
In `@packages/engine/src/workflow-recovery-events.ts`:
- Around line 20-21: The `listWorkflowWorkItemsForTask` method in the
`WorkflowRecoveryEventStore` interface is currently optional (marked with `?`),
which prevents `publishWorkflowRecoveryEvent` from guaranteeing deduplication
and terminal-state runId rotation across all store implementations. Remove the
optional marker (`?`) from the `listWorkflowWorkItemsForTask` method signature
to make it a required method, ensuring all implementations of this interface
must provide this capability for proper recovery-event handling.
🪄 Autofix (Beta)
❌ Autofix failed (check again to retry)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: e7374297-4e83-4fe3-a182-66a9fbf840e8
📒 Files selected for processing (4)
docs/plans/workflow-owned-merge-stack/s10-self-healing-recovery-events.mdpackages/engine/src/__tests__/workflow-recovery-events.test.tspackages/engine/src/index.tspackages/engine/src/workflow-recovery-events.ts
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. An unexpected error occurred while generating fixes: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error) |
- Make listWorkflowWorkItemsForTask required on WorkflowRecoveryEventStore so the terminal-state runId rotation guard can't be silently bypassed - Expand recurring-recovery regression test to all terminal states (succeeded/failed/cancelled/exhausted) via it.each - Use order-independent arrayContaining + length assertion for recovery work items Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Ready to review this PR? Stage has broken it down into 3 individual chapters for you:
Chapters generated by Stage for commit 21bb295 on Jun 17, 2026 2:19pm UTC. |
b5072ce
into
feature/workflow-owned-merge-s08-workflow-owned-merge-processing
Stack Slice
feature/workflow-owned-merge-s09-workflow-owned-retry-statedocs/plans/2026-06-09-003-refactor-workflow-owned-merge-full-migration-slices-plan.mdGoal
Convert self-healing merge/retry lifecycle mutations into typed workflow recovery events and node wakes.
Dependency
S5 runtime driver, S8 merge processing, and S9 retry state.
Expected Scope
packages/engine/src/self-healing.ts; restart recovery coordinator; recovery policy; workflow runtime; recovery tests.
Expected Tests
Mergeable in-review recovery event, stale merge status event, transient merge retry event, already-landed finalize event, autoMerge false terminal behavior, dedupe.
Exit Gate
Self-healing no longer directly requeues, pauses, fails, unpauses, or moves merge/retry tasks except through guarded workflow primitives.
Status
Draft stack placeholder. This PR reserves ordering and review context; implementation should replace or extend the handoff artifact before this slice is marked ready.
Implementation Added
Summary by CodeRabbit
Release Notes
New Features
Documentation
Tests