Skip to content

refactor(workflow): S05 runtime work-item driver#1578

Merged
gsxdsm merged 5 commits into
mainfrom
feature/workflow-owned-merge-s05-runtime-work-item-driver
Jun 13, 2026
Merged

refactor(workflow): S05 runtime work-item driver#1578
gsxdsm merged 5 commits into
mainfrom
feature/workflow-owned-merge-s05-runtime-work-item-driver

Conversation

@gsxdsm

@gsxdsm gsxdsm commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Stack Slice

  • Slice: S05
  • Milestone: Runtime
  • Base branch: feature/workflow-owned-merge-s04-builtin-ir-regions
  • Full plan: docs/plans/2026-06-09-003-refactor-workflow-owned-merge-full-migration-slices-plan.md

Goal

Let WorkflowTaskRuntime start from a workflow work item and persist node/work-item outcomes.

Dependency

S1 workflow work items, S3 generic scheduler claim path, and S4 built-in IR regions.

Expected Scope

packages/engine/src/workflow-task-runtime.ts; workflow graph executor and node handler files; runtime tests.

Expected Tests

Runnable completion, retrying work creation, manual hold creation, restart resume, and duplicate lease refusal.

Exit Gate

Runtime can progress workflow work without old merge queue callbacks.

Status

Draft stack placeholder. This PR reserves ordering and review context; implementation should replace or extend the handoff artifact before this slice is marked ready.

Implementation Added

  • Added WorkflowTaskRuntime.runWorkItem for leased work-item execution and terminal persistence.\n- Added default handlers for merge-gate, merge-attempt, manual hold, retry, recovery, and branch-group workflow nodes.\n- Added runtime work-item success/failure tests.

Summary by CodeRabbit

  • New Features
    • Runtime can execute and finalize workflow work items (including merge routing, manual-hold, branch-group behaviors) and returns clearer terminal outcomes (including a manual-required disposition).
  • Documentation
    • Added draft for the S05 runtime work-item driver slice outlining scope, dependencies, goals, tests, and exit criteria.
  • Tests
    • Expanded coverage for work-item execution: success, failure, merge routing, manual-hold, and attempt-threading.

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c42c27c7-6bcb-49e7-b049-a03b8b746072

📥 Commits

Reviewing files that changed from the base of the PR and between 3af587a and 25cf473.

📒 Files selected for processing (2)
  • packages/engine/src/__tests__/workflow-task-runtime.test.ts
  • packages/engine/src/workflow-task-runtime.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/engine/src/tests/workflow-task-runtime.test.ts
  • packages/engine/src/workflow-task-runtime.ts

📝 Walkthrough

Walkthrough

This PR implements the S05 "runtime work-item driver" slice by adding a runWorkItem execution method to WorkflowTaskRuntime, providing handlers for merge and branch-group workflow nodes, and validating the changes with comprehensive test coverage.

Changes

Workflow work-item runtime execution

Layer / File(s) Summary
Merge and branch-group node handlers
packages/engine/src/workflow-node-handlers.ts
createDefaultNodeHandlers is extended to provide handlers for merge-gate, merge-attempt, manual-merge-hold, retry-backoff, recovery-router, and branch-group-member-integration/branch-group-promotion nodes. Handlers implement auto-merge routing, merge-primitives invocation (fail-closed if primitives unwired), manual hold enforcement, retry/backoff passthrough, and recovery routing.
Work-item runtime execution API
packages/engine/src/workflow-task-runtime.ts
WorkflowTaskRuntime adds the runWorkItem method to execute an assigned work item by validating wiring/state, loading task detail, resolving the workflow IR, invoking the handler at the target node with work-item-augmented context, mapping handler results/exceptions to a runtime result and disposition (completed/failed/manual-required), and transitioning the work item while clearing lease fields and setting lastError as needed. WorkflowTaskRuntimeDeps.store type is extended to optionally provide getTask and transitionWorkflowWorkItem.
Work-item runtime tests
packages/engine/src/__tests__/workflow-task-runtime.test.ts
Test imports updated to include work-item types; merge primitive stub accepts merge context and records attempt/run/workflow ids; visited node expectations extended for merge/branch-group paths; new runWorkItem tests validate success lease clearance, failure propagation with lastError, merge-gate routing when auto-merge disabled, manual-hold -> manual-required, unwired store transition failure, and threading work-item attempt into merge primitive context.
S05 docs draft
docs/plans/workflow-owned-merge-stack/s05-runtime-work-item-driver.md
Adds draft documentation for the S05 runtime work-item driver slice describing scope, dependencies, goal, expected tests, and exit gate; references the full plan.

Sequence Diagram(s)

sequenceDiagram
  participant API as runWorkItem
  participant Store as WorkflowStore
  participant Loader as TaskLoader
  participant IR as WorkflowIRResolver
  participant Handler as NodeHandler
  API->>Store: getTask (optional)
  Store-->>API: TaskDetail
  API->>IR: resolve workflow IR for task
  IR-->>API: workflow IR + target node
  API->>Handler: execute handler at node with work-item context
  Handler-->>API: outcome, value, contextPatch
  API->>Store: transitionWorkflowWorkItem to terminal state (optional)
  Store-->>API: updated work item
  API-->>API: emit terminal event
  API-->>API: return WorkflowTaskRuntimeResult
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

"🐰
I hopped through nodes with eager delight,
Merged attempts and held the night,
Leases cleared and lastErrors named,
Tests watch the dance, the flow's reclaimed.
A tiny rabbit cheers this runtime flight!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly identifies the main change: implementing the S05 runtime work-item driver slice for the workflow-owned merge refactor, which aligns with the documented objectives of adding WorkflowTaskRuntime.runWorkItem and default handlers.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/workflow-owned-merge-s05-runtime-work-item-driver
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch feature/workflow-owned-merge-s05-runtime-work-item-driver

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint install timed out. The project may have too many dependencies for the sandbox.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s04-builtin-ir-regions branch from 2c9d2b6 to 069dcb8 Compare June 9, 2026 20:30
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s05-runtime-work-item-driver branch from 5ed4024 to c4f0d29 Compare June 9, 2026 20:30
@gsxdsm gsxdsm marked this pull request as ready for review June 9, 2026 20:35
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s04-builtin-ir-regions branch from 069dcb8 to e1b826b Compare June 9, 2026 20:38
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s05-runtime-work-item-driver branch from c4f0d29 to dfd8196 Compare June 9, 2026 20:38
@greptile-apps

greptile-apps Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR implements the S05 runtime work-item driver slice, adding WorkflowTaskRuntime.runWorkItem for leased single-node execution with terminal persistence, and seven new default node handlers (merge-gate, merge-attempt, manual-merge-hold, retry-backoff, recovery-router, branch-group-member-integration, branch-group-promotion). The addressed threads from prior rounds (context seeding, manual-hold state mapping) are correctly resolved in the current code.

  • runWorkItem leases a work item, resolves its IR node, invokes the matching handler, and persists the terminal state (succeeded, failed, or manual-required) with lease release — the manual-required path is correctly wired end-to-end.
  • The happy-path transitionWorkflowWorkItem call (line 204) is outside any try-catch; a DB-level throw (item not found, concurrent terminal transition) escapes as an unhandled rejection rather than a structured WorkflowTaskRuntimeResult.
  • retry-backoff, branch-group-member-integration, and branch-group-promotion return immediate success with no side effects and no tests; whether the scheduler's edge-following logic owns all follow-on work-item creation is not documented.

Confidence Score: 3/5

The runtime's core execution path has a real contract gap: the final store write after handler execution is unguarded, so any DB-level error produces an unhandled rejection rather than the structured result the caller expects.

The transitionWorkflowWorkItem call in the happy-path of runWorkItem (line 204) sits outside any try-catch. The production store throws synchronously on item-not-found and terminal-state conflicts, so any of those conditions propagates as a rejected promise instead of a WorkflowTaskRuntimeResult. This is a current defect on the changed execution path that can surface in production whenever a DB inconsistency or concurrent lease races the write.

packages/engine/src/workflow-task-runtime.ts — the happy-path transitionWorkflowWorkItem call needs a try-catch to preserve the function's structured-result contract

Important Files Changed

Filename Overview
packages/engine/src/workflow-task-runtime.ts Adds runWorkItem and failWorkItem; the happy-path transitionWorkflowWorkItem call at line 204 is not wrapped in try-catch, breaking the structured-result contract when the store throws
packages/engine/src/workflow-node-handlers.ts Adds seven default node handlers; merge-gate and merge-attempt are well-wired; retry-backoff, branch-group-member-integration, and branch-group-promotion are no-ops without tests or documentation of scheduler contract
packages/engine/src/tests/workflow-task-runtime.test.ts Adds six new runWorkItem tests covering success, failure, merge-gate routing, manual-hold, unwired store, and attempt threading; no-op handlers (retry-backoff, branch-group) are not tested
docs/plans/workflow-owned-merge-stack/s05-runtime-work-item-driver.md Draft slice doc; no code issues

Sequence Diagram

sequenceDiagram
    participant Scheduler
    participant runWorkItem
    participant Store
    participant Handler
    participant IR

    Scheduler->>runWorkItem: runWorkItem(workItem, settings)
    runWorkItem->>Store: check getTask + transitionWorkflowWorkItem wired
    alt store unwired
        runWorkItem-->>Scheduler: "{disposition:"failed", reason:"store-unwired"}"
    end
    runWorkItem->>runWorkItem: "guard workItem.state === "running""
    alt "state !== "running""
        runWorkItem->>Store: transitionWorkflowWorkItem(id, "failed") no try-catch
        runWorkItem-->>Scheduler: "{disposition:"failed"}"
    end
    runWorkItem->>Store: getTask(taskId)
    runWorkItem->>IR: resolveRuntimeTarget(taskId)
    IR-->>runWorkItem: "{workflowId, ir}"
    runWorkItem->>IR: ir.nodes.find(nodeId)
    IR-->>runWorkItem: "node {id, kind}"
    runWorkItem->>Handler: "handler(node, {task, settings, context})"
    note over Handler: merge-gate / merge-attempt / manual-merge-hold / retry-backoff / recovery-router / branch-group
    Handler-->>runWorkItem: "{outcome, value, contextPatch}"
    runWorkItem->>runWorkItem: resolve disposition + terminalState
    runWorkItem->>Store: transitionWorkflowWorkItem(id, terminalState) no try-catch
    Store-->>runWorkItem: WorkflowWorkItem
    runWorkItem-->>Scheduler: "{disposition, outcome, visitedNodeIds, context}"
Loading

Reviews (8): Last reviewed commit: "Merge origin/main: align merge-region vi..." | Re-trigger Greptile

@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s04-builtin-ir-regions branch from e1b826b to a41dcad Compare June 9, 2026 20:48
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s05-runtime-work-item-driver branch from dfd8196 to 1f576b9 Compare June 9, 2026 20:48
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s04-builtin-ir-regions branch from a41dcad to f59ea05 Compare June 9, 2026 23:21
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s05-runtime-work-item-driver branch from 1f576b9 to 6aac550 Compare June 9, 2026 23:24
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s04-builtin-ir-regions branch from f59ea05 to f8b5768 Compare June 10, 2026 00:23
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s05-runtime-work-item-driver branch from 6aac550 to de10d18 Compare June 10, 2026 00:23
@gsxdsm gsxdsm force-pushed the feature/workflow-owned-merge-s05-runtime-work-item-driver branch from de10d18 to fa44451 Compare June 10, 2026 03:48
Comment thread packages/engine/src/workflow-node-handlers.ts Outdated
Comment thread packages/engine/src/workflow-task-runtime.ts Outdated
Base automatically changed from feature/workflow-owned-merge-s04-builtin-ir-regions to main June 11, 2026 14:55
Address PR #1578 feedback by deriving merge-gate routing from task/settings auto-merge policy and persisting manual-hold work items as manual-required instead of failed.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/engine/src/workflow-task-runtime.ts`:
- Around line 128-133: The runtime currently calls failWorkItem before ensuring
the store is wired, so failWorkItem's optional transitionWorkflowWorkItem call
can be a no-op and leave items stuck; fix by moving the store wiring guard
(checks for this.deps.store.getTask and
this.deps.store.transitionWorkflowWorkItem) to run before any early returns that
call failWorkItem in runWorkItem, and then remove the optional chaining when
invoking transitionWorkflowWorkItem inside failWorkItem (call
this.deps.store.transitionWorkflowWorkItem(...) directly). Apply the same
reorder-and-remove-optional-chaining change to the other similar block that
validates and fails work items (the block around the later failure path that
also calls failWorkItem).
- Around line 163-179: runWorkItem in workflow-task-runtime.ts builds the
handler context without including workItem.attempt, so handlers (notably the
"merge-attempt" handler in workflow-node-handlers.ts) never forward the retry
attempt into primitiveContextForNode; update the call site in runWorkItem to
include attempt: workItem.attempt in the context passed to handler (the object
currently containing task, settings, context), and then update the
"merge-attempt" handler to pass that attempt value through when calling
primitiveContextForNode(_node, ctx.task, ctx.context, attempt) or the
appropriate parameter position so requestMerge and underlying primitives receive
the attempt number.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b66c2d4b-a2d2-4f77-b5a2-963b4af0f366

📥 Commits

Reviewing files that changed from the base of the PR and between 82b89bd and fa44451.

📒 Files selected for processing (4)
  • docs/plans/workflow-owned-merge-stack/s05-runtime-work-item-driver.md
  • packages/engine/src/__tests__/workflow-task-runtime.test.ts
  • packages/engine/src/workflow-node-handlers.ts
  • packages/engine/src/workflow-task-runtime.ts

Comment thread packages/engine/src/workflow-task-runtime.ts Outdated
Comment thread packages/engine/src/workflow-task-runtime.ts
Address PR #1578 feedback by failing unwired work item dispatch without a no-op persistence path and forwarding work item attempts into merge primitive context.
Comment thread packages/engine/src/workflow-task-runtime.ts
Address PR #1578 feedback by seeding run and workflow identifiers into work-item handler context so merge primitives receive the leased work identity.
@gsxdsm

gsxdsm commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator Author

The runWorkItem context initializer seeds three work-item keys but omits "workflow:run-id" and "workflow:id".

Addressed: runWorkItem now seeds workflow:run-id from workItem.runId and workflow:id from the resolved workflow target. The merge-attempt regression asserts run id, workflow id, and attempt forwarding. The CodeRabbit review-body items on store wiring and work-item attempt forwarding were also fixed.

Comment on lines +141 to +143
if (workItem.state !== "running") {
return this.failWorkItem(workItem, `workflow-work-item-not-running:${workItem.state}`);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 failWorkItem on non-running items causes uncaught exceptions and incorrect state mutations

When runWorkItem is handed a work item whose state is NOT "running", it calls failWorkItem, which unconditionally calls transitionWorkflowWorkItem. Two failure modes follow from that:

  1. Terminal states (succeeded, failed, cancelled, exhausted): The production store (packages/core/src/store.ts:9083) throws "Workflow work item X is terminal (succeeded) and cannot transition to failed". That exception escapes runWorkItem as an unhandled rejection — callers expecting Promise<WorkflowTaskRuntimeResult> see a crash instead.

  2. Non-terminal non-running states (runnable, held, retrying, manual-required): The store accepts the transition. A "manual-required" item waiting for human approval would be silently moved to "failed", erasing the hold.

The correct behaviour for this guard is to return a structured error without touching the store — only items that were legitimately claimed (and are therefore in "running") should be transitioned by this executor. No existing test exercises this path with a real store (all tests use state: "running"), so the bug is undetected today.

…actor

main's workflow-graph-executor now collapses the merge-policy region
(merge-gate, branch-group-*, merge-attempt) into a single synthetic
`merge` node. Reconcile the S05 work-item-driver test's visitedNodeIds
assertion to match, and union the `observed` recording-primitive fields
from both branches (executedTasks + merge attempt/run/workflow ids).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +204 to +208
this.deps.store.transitionWorkflowWorkItem(workItem.id, terminalState, {
leaseOwner: null,
leaseExpiresAt: null,
lastError: reason ?? null,
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unguarded terminal store call breaks the function's return contract

transitionWorkflowWorkItem is synchronous but CAN throw: the production store raises "Workflow work item X not found" (item deleted between lease check and write) and "Workflow work item X is terminal (…) and cannot transition to …" (concurrent lease race). Neither error is caught here, so any such exception escapes as an unhandled rejection — callers expecting Promise<WorkflowTaskRuntimeResult> receive a crashed promise instead of a structured { disposition: "failed" } result. Wrapping this call in a try-catch and returning a structured failure preserves the function's contract for all DB-error scenarios.

@gsxdsm gsxdsm merged commit d1bea16 into main Jun 13, 2026
6 checks passed
@gsxdsm gsxdsm deleted the feature/workflow-owned-merge-s05-runtime-work-item-driver branch June 13, 2026 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant