Skip to content

fix(pgdelta): force close event loop on success path#5714

Open
avallete wants to merge 8 commits into
developfrom
claude/issue-312-investigation-mec6f6
Open

fix(pgdelta): force close event loop on success path#5714
avallete wants to merge 8 commits into
developfrom
claude/issue-312-investigation-mec6f6

Conversation

@avallete

@avallete avallete commented Jun 26, 2026

Copy link
Copy Markdown
Member

Fixes a hang where supabase db schema declarative sync (and other pg-delta flows) stop responding indefinitely at 0% CPU after work completes.

Fixes supabase/pg-toolbelt#312

Problem

The pg-delta scripts run inside a one-shot Edge Runtime container and rely on the event loop draining for the worker to be destroyed and the container to exit. The catalog-export script opens a real connection pool (createManagedPool); when a keepalive handle lingers after close() resolves, the worker never exits, so the container never stops. Because the CLI streams that container's logs with Follow:true (DockerStreamLogs), a worker that never exits blocks the parent __catalog subprocess — and the declarative-sync command that spawned it — forever. Only the error path force-closed the loop via throw new Error(""); the success path did not.

This was reproduced against supabase/edge-runtime:v1.74.2: a worker left with a lingering handle keeps the container running and docker logs -f (the equivalent of DockerStreamLogs with Follow:true) never returns; adding the force-close makes the container exit immediately with its output intact. A large (~1.26 MB) stdout payload on its own does not reproduce the hang — the trigger is the non-exiting worker, not output size.

Solution

Force-close the event loop on the success path of every pg-delta Edge Runtime script, so the worker is torn down deterministically once output has been flushed (output is written synchronously before the throw, and RunEdgeRuntimeScript already tolerates the resulting main worker has been destroyed exit):

  • pgdelta_catalog_export.ts — catalog snapshot export (the path reported in test: secret list command #312)
  • pgdelta.ts — diff SOURCE→TARGET
  • pgdelta_declarative_export.ts — declarative file export
  • pgdelta_declarative_apply.ts — apply declarative schema

The catalog-export script is the only one that always holds a live pool, but the diff/export/apply scripts open connections too when given live database URLs, so all four are covered. Error paths are unchanged.

Changes

  • Added a success-path force-close to all four template scripts, with explanatory comments.
  • Synced the byte-for-byte embedded copies in legacy-pgdelta.deno-templates.ts.
  • Added regression guards (pgdelta_template_test.go, pgdelta_apply_template_test.go) asserting each script's success path ends with the force-close.

https://claude.ai/code/session_01XbxecW4DVmwgQB1YX321K3

… output

`supabase db schema declarative sync` could hang indefinitely at 0% CPU
after all migrations were applied to the shadow database
(supabase/pg-toolbelt#312).

Root cause: the pg-delta Deno scripts run inside a one-shot Edge Runtime
container and rely on the event loop draining for the worker to be
destroyed and the container to exit. The catalog-export script opens a
real connection pool (createManagedPool); when a keepalive handle lingers
after close() resolves, the worker never exits, so the container never
stops. The CLI streams that container's logs with Follow:true
(DockerStreamLogs), so a worker that never exits blocks the parent
`__catalog` subprocess — and the declarative-sync command that spawned it
— forever. Only the error path force-closed the loop
(`throw new Error("")`); the success path did not.

Fix: force-close the event loop on the success path of every pg-delta
Edge Runtime script (diff, declarative-export, catalog-export, and
declarative-apply), so the worker is torn down deterministically once
output has been flushed. The output is written synchronously before the
throw, and RunEdgeRuntimeScript already tolerates the resulting
"main worker has been destroyed" exit.

Reproduced against supabase/edge-runtime:v1.74.2: a worker with a
lingering handle keeps the container `running` and `docker logs -f` (the
equivalent of DockerStreamLogs Follow:true) never returns; adding the
force-close makes the container exit immediately with output intact.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XbxecW4DVmwgQB1YX321K3
@avallete avallete requested a review from a team as a code owner June 26, 2026 14:45
@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown

Supabase CLI preview

npx --yes https://pkg.pr.new/supabase/cli/supabase@35e25544e2f3b716195bfacb9209a894c4373a13

Preview package for commit 35e2554.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d4e2d749ed

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +33 to +37
scripts := map[string]string{
"pgdelta.ts": pgDeltaScript,
"pgdelta_declarative_export.ts": pgDeltaDeclarativeExportScript,
"pgdelta_catalog_export.ts": pgDeltaCatalogExportScript,
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Include pgcache catalog export in the force-close fix

When pg-delta catalog caching is enabled, pgcache.TryCacheMigrationsCatalog still runs the separate pgDeltaCatalogExportTS script in apps/cli-go/internal/db/pgcache/cache.go through RunEdgeRuntimeScript; that script uses the same createManagedPool/extractCatalog/close() pattern but still ends after await close() with no final force-close throw. Because this new coverage only checks the three diff-package templates, the hang fixed here can still occur on the migrations-catalog cache path, such as db start/db push with pg-delta caching enabled. Please update that template and cover it as well.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed. pgDeltaCatalogExportTS in internal/db/pgcache/cache.go (the db start / db push migrations-catalog cache path) had the same createManagedPool/close() pattern with no success-path force-close. Added the force-close there plus a cache_template_test.go guard, matching the other four scripts. Pushed in 7908b63.


Generated by Claude Code

avallete and others added 3 commits June 26, 2026 17:06
The migrations-catalog cache script (pgcache.TryCacheMigrationsCatalog, used
by db start / db push with pg-delta caching) uses the same
createManagedPool/extractCatalog/close() pattern as the other pg-delta Edge
Runtime scripts and had the same missing success-path force-close, so it could
hang the same way (supabase/pg-toolbelt#312). Add the force-close and a guard
test. Flagged by automated PR review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XbxecW4DVmwgQB1YX321K3
@avallete

Copy link
Copy Markdown
Member Author

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Swish!

Reviewed commit: 1d4b65b0a0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

claude and others added 3 commits June 30, 2026 10:00
…ts logs

The worker force-close made the edge-runtime container exit, but declarative
sync still hung under podman. A user goroutine dump on the hung __catalog
subprocess showed the block is in stdcopy.StdCopy reading the Follow:true
Docker log stream (DockerStreamLogs): the stream never receives EOF after the
container exits, so the read blocks forever at 0% CPU
(supabase/pg-toolbelt#312). podman's /containers/<id>/logs?follow endpoint does
not close when the container stops, unlike Docker.

Run the bounded edge-runtime scripts via DockerRunOnceWaitWithConfig, which
detects completion by polling ContainerInspect (reliable on podman) and then
reads the buffered logs without following. The shared DockerStreamLogs (used by
functions serve for genuine live streaming) and DockerRunOnceWithConfig (used by
db dump for large streaming output) are left unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XbxecW4DVmwgQB1YX321K3
@avallete avallete marked this pull request as ready for review June 30, 2026 14:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

declarative sync hangs indefinitely after migrations applied - goroutine deadlock in __catalog subprocess

2 participants