fix(executor): track For Each iteration bodies in pendingTasks #1182

Open
eskp wants to merge 1 commit into staging from fix/foreach-parallel-iteration-drop

eskp commented May 8, 2026

Summary

  • Stop parallel For Each iterations from silently dropping body steps when the workflow SDK's checkpoint resume severs an iteration's await chain after its first body step. Previously the downstream step was never scheduled, the iteration still reported success, and the run finalised with status success even though the dropped work never ran.
  • Wrap each iteration body's executeBodyNode call in pendingTasks.track, so the workflow-end drain holds a strong reference to in-flight body chains and recovers orphaned continuations. Sequential mode is unaffected. This mirrors the KEEP-395 fix, which protects the main DAG from the same class of SDK truncation.
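
The track-then-drain idea can be illustrated with a minimal, self-contained sketch. Only the names executeBodyNode and pendingTasks.track come from this PR; their real signatures are not shown here, so the PendingTasks class, tick, runIteration, runWorkflow, and the use of Promise.race to imitate the SDK truncation are all hypothetical stand-ins, not the executor's actual code.

```typescript
type Step = "decode" | "http";

// Hypothetical helper: one turn of the timer queue, standing in for async work.
const tick = (): Promise<void> => new Promise((resolve) => setTimeout(resolve, 0));

// Hypothetical stand-in for the executor's pendingTasks registry.
class PendingTasks {
  private readonly inFlight = new Set<Promise<unknown>>();

  // Hold a strong reference to the promise until it settles, and hand the
  // same promise back so the caller can still await it normally.
  track<T>(task: Promise<T>): Promise<T> {
    this.inFlight.add(task);
    const forget = (): void => {
      this.inFlight.delete(task);
    };
    task.then(forget, forget);
    return task;
  }

  // Keep awaiting until nothing is left; tracked work may track more work.
  async drain(): Promise<void> {
    while (this.inFlight.size > 0) {
      const settled = Array.from(this.inFlight).map((p) => p.catch(() => undefined));
      await Promise.all(settled);
    }
  }
}

// Hypothetical body runner: a "decode" step followed by an "http" step.
async function executeBodyNode(item: number, step: Step, log: string[]): Promise<void> {
  await tick();
  log.push(`${step}:${item}`);
  if (step === "decode") {
    await executeBodyNode(item, "http", log);
  }
}

async function runIteration(item: number, log: string[], tasks?: PendingTasks): Promise<void> {
  const body = executeBodyNode(item, "decode", log);
  const chain = tasks ? tasks.track(body) : body;
  // Simulated truncation (an assumption of this sketch): the iteration
  // "completes" after one tick, before the body chain has finished.
  await Promise.race([chain, tick()]);
}

async function runWorkflow(
  items: number[],
  useTracking: boolean,
): Promise<{ decode: number; http: number }> {
  const log: string[] = [];
  const tasks = useTracking ? new PendingTasks() : undefined;
  await Promise.all(items.map((item) => runIteration(item, log, tasks)));
  if (tasks) {
    await tasks.drain(); // workflow-end drain rescues the orphaned chains
  }
  // Snapshot what has been recorded at the moment the workflow finalises.
  const count = (step: Step): number =>
    log.filter((entry) => entry.split(":")[0] === step).length;
  return { decode: count("decode"), http: count("http") };
}
```

In this sketch, runWorkflow([1, 2, 3, 4], false) finalises after the four decode steps but before any http step is recorded, while runWorkflow([1, 2, 3, 4], true) waits in drain() and records 4 decode and 4 http steps. The drain loops rather than awaiting once because a tracked chain could register further tracked work while it runs.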

Test plan

  • pnpm vitest run tests/unit/executor-drain-loop.test.ts -- the new "simulates parallel For Each iterations: drain rescues truncated body recursion" case passes alongside the existing KEEP-395 drain-loop tests
  • pnpm vitest run tests/unit/for-each-body-runner.test.ts tests/unit/for-each-body-recursion.test.ts tests/unit/for-each-executor.test.ts tests/unit/for-each-concurrency.test.ts -- no regressions
  • pnpm type-check -- clean
  • Run the user-reported test-parallel-iteration-bug workflow on a deployed branch and confirm parallel mode records all body steps (e.g., 4 iterations -> 4 decode + 4 HTTP) instead of dropping iterations after the first step

Parallel For Each iterations were silently dropping body steps when the
SDK's checkpoint resume severed the iteration's await chain after the
first body step. The downstream step (e.g., HTTP Request after a Code
step) was never scheduled, the iteration appeared successful, and the
workflow finalised with status "success" even though the dropped work
never ran.

Wrap each iteration body's executeBodyNode call in pendingTasks.track
so the workflow-end drain holds strong references to in-flight body
chains. Sequential mode is unaffected. Mirrors the KEEP-395 fix that
protects the main DAG from the same SDK truncation.

Adds a drain-loop simulator covering the For Each parallel pattern.
@eskp eskp requested review from a team, OleksandrUA, joelorzet and suisuss and removed request for a team May 8, 2026 06:22
@eskp eskp deployed to staging May 8, 2026 06:22 — with GitHub Actions Active