Skip to content

Signal handling for non-Python WASM modules#6154

Merged
logan-gatlin merged 24 commits intomainfrom
logan/wasm-signal-handling
Mar 3, 2026
Merged

Signal handling for non-Python WASM modules#6154
logan-gatlin merged 24 commits intomainfrom
logan/wasm-signal-handling

Conversation

@logan-gatlin
Copy link
Contributor

@logan-gatlin logan-gatlin commented Feb 24, 2026

Continuation of 6098

This adds a shim to WebAssembly.Instantiate and friends which looks for modules that export two i32 globals: __instance_signal and __instance_terminated. Modules matching one of these signatures will receive a hook, allowing us to read and write to those memory locations, allowing for signal-based communication with modules.

The addition of __instance_terminated is required, since we must know when the module is dead so we can stop sending it signals. This is also useful for modules that want to know when they have been terminated, so that they can re-initialize.

Copy link
Contributor

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm Bonk, and I've done a quick review of your PR.

Summary: Adds a JS shim around WebAssembly.instantiate/Instance to detect WASM modules exporting __signal_address/__terminated_address globals, registering them for signal-based shutdown notification via a new signal-safe AtomicList data structure.

Issues ranked by severity:

  1. [High] Missing compat flag — behavioral change applied unconditionally to all workers. The shim wraps WebAssembly.instantiate's return promise with an extra .then(), which adds a microtask tick before results are delivered. This subtly changes promise scheduling for every worker that uses WebAssembly.instantiate, not just ones with signal globals. workerd has a strong backward-compat commitment; this kind of observable behavioral change must be gated behind a compatibility flag in compatibility-date.capnp.

  2. [High] Missing autogate for staged rollout. Even with a compat flag, this is a risky change (shimming fundamental WebAssembly APIs, new signal-handler code path). It should additionally be gated behind an autogate flag for safe incremental rollout.

  3. [High] No tests. There are no tests for any part of this feature: AtomicList, the JS shim, registerWasmShutdownSignal, or the GC prologue filtering. AtomicList is a custom concurrent data structure used from signal handlers — this absolutely needs unit tests, including tests that verify signal-safety invariants (e.g., that iterate() sees a consistent list while filter() is interrupted).

  4. [Medium] WebAssembly.Instance shim changes behavior when called without new. The original WebAssembly.Instance throws if called as a regular function (not as a constructor). The shim is a plain JS function that internally does new originalInstance(...), so WebAssembly.Instance(module, imports) (without new) would succeed instead of throwing. This is a spec deviation.

  5. [Medium] AtomicList uses GCC __atomic builtins on non-atomic types. Node* head and Node* next are plain pointers, not std::atomic<Node*>. The GCC builtins work in practice, but this is technically undefined behavior per the C++ memory model. Consider using std::atomic<Node*> for head and Node::next — the generated code is identical, and it makes the concurrency intent explicit and portable.

  6. [Low] <atomic> header is included but std::atomic is not used. Either switch to std::atomic (see issue 5) or remove the unused include.

  7. [Low] signalByteOffset and terminatedByteOffset can overlap. The bounds checks don't verify the two offsets point to non-overlapping memory. Overlapping offsets would cause the runtime's own signal write to be misinterpreted as the module's termination flag. A simple KJ_REQUIRE(signalOffset != terminatedOffset) would catch this.

@ask-bonk
Copy link
Contributor

ask-bonk bot commented Feb 24, 2026

I'm Bonk, and I've done a quick review of your PR.

Summary: Adds a JS shim around WebAssembly.instantiate/Instance to detect WASM modules exporting __signal_address/__terminated_address globals, registering them for signal-based shutdown notification via a new signal-safe AtomicList data structure.

I found 7 issues and posted them as a batch review on the PR. The highest severity items are:

  1. [High] Missing compat flag — The .then() wrapper adds a microtask tick to WebAssembly.instantiate for all workers, which is an observable behavioral change that needs a compat flag.
  2. [High] Missing autogate — Shimming fundamental WebAssembly APIs with new signal-handler code is risky and should use an autogate for staged rollout.
  3. [High] No tests — No tests for AtomicList, the JS shim, registration, or GC prologue filtering.
  4. [Medium] Instance shim spec deviationWebAssembly.Instance() without new would succeed instead of throwing (suggestion provided).
  5. [Medium] UB with __atomic builtins on non-atomic typeshead/next should be std::atomic<Node*> (suggestion provided).
  6. [Low] Unused <atomic> include.
  7. [Low] No overlap check for signal/terminated offsets (suggestion provided).

github run

@codecov-commenter
Copy link

codecov-commenter commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 86.93009% with 86 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.70%. Comparing base (ce1adde) to head (c0a9699).
⚠️ Report is 28 commits behind head on main.

Files with missing lines Patch % Lines
src/workerd/io/tracked-wasm-instance-test.c++ 86.08% 0 Missing and 69 partials ⚠️
src/workerd/io/worker.c++ 75.40% 9 Missing and 6 partials ⚠️
src/workerd/io/tracked-wasm-instance.h 95.65% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6154      +/-   ##
==========================================
+ Coverage   70.60%   70.70%   +0.09%     
==========================================
  Files         414      417       +3     
  Lines      110411   111059     +648     
  Branches    18237    18340     +103     
==========================================
+ Hits        77960    78523     +563     
- Misses      21575    21584       +9     
- Partials    10876    10952      +76     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial feedback, please let me know when this is ready for final review.

@codspeed-hq
Copy link

codspeed-hq bot commented Feb 25, 2026

Merging this PR will not alter performance

✅ 70 untouched benchmarks
⏩ 129 skipped benchmarks1


Comparing logan/wasm-signal-handling (c0a9699) with main (ce1adde)

Open in CodSpeed

Footnotes

  1. 129 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@guybedford guybedford marked this pull request as ready for review February 25, 2026 19:57
@guybedford guybedford requested review from a team as code owners February 25, 2026 19:57
Copy link
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be great to see this feature shipped. There is an integration test PR already in cloudflare/workers-rs#939. @jasnell would value your input on if we need to think more about detached buffers here before landing.

Copy link
Contributor

@ryanking13 ryanking13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some nitpicky comments, not blocking merge.

@logan-gatlin
Copy link
Contributor Author

Minor note: I added wasm-tools as a test dependency, so that the wasm test files can be compiled instead of committed as binaries.

logan-gatlin and others added 23 commits March 2, 2026 19:50
Co-authored-by: Guy Bedford <gbedford@cloudflare.com>
…posal

Each WasmShutdownSignal entry holds a shared_ptr<v8::BackingStore> via
kj::Array::attach(). In Worker::Isolate, member destruction order destroys
the V8 isolate (api) before limitEnforcer (which owns the AtomicList).
The BackingStore destructor lives in libv8.so and may access V8 internal
state, so dropping these shared_ptrs after V8 disposal is a use-after-free.

Fix: explicitly clear the AtomicList in ~Isolate() while V8 is still alive.
Co-authored-by: Guy Bedford <gbedford@cloudflare.com>
Co-authored-by: Guy Bedford <gbedford@cloudflare.com>
Move signal-writing logic from inline free functions into
TrackedWasmInstanceList methods (writeShutdownSignal,
clearShutdownSignal, writeTerminatedSignal) and add unit tests.

Rename all files, types, methods, variables, and references from
wasm-shutdown-signal / WasmShutdownSignal to tracked-wasm-instance /
TrackedWasmInstance. The WASM_SHUTDOWN_SIGNAL_SHIM autogate key is
unchanged.
@logan-gatlin logan-gatlin force-pushed the logan/wasm-signal-handling branch from b153981 to b05963d Compare March 2, 2026 19:56
@github-actions
Copy link

github-actions bot commented Mar 2, 2026

The generated output of @cloudflare/workers-types matches the snapshot in types/generated-snapshot 🎉

@logan-gatlin logan-gatlin merged commit 2d4259c into main Mar 3, 2026
34 of 36 checks passed
@logan-gatlin logan-gatlin deleted the logan/wasm-signal-handling branch March 3, 2026 00:35
@logan-gatlin logan-gatlin mentioned this pull request Mar 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants