Validation: Combined Playwright upgrade + preview fix + CSM decode-retry (do not merge)#79381

Closed
adamsilverstein wants to merge 13 commits into WordPress:trunk from adamsilverstein:try/csm-validation-combined

Conversation

@adamsilverstein

Member

What?

Validation-only, do-not-merge. This is a combined draft branch that stacks two open PRs so CI runs the client-side media (CSM) e2e suite on Chrome 148+ and proves it goes green:

  • #79342 - the Playwright upgrade (Chrome 148) and the Document-Isolation-Policy preview fix.
  • #79378 - the wasm-vips worker decode-retry fix.

On top of those, it removes the Chromium >= 148 skip gate from client-side-media-processing.spec.js so the CSM tests actually run under the upgraded browser instead of skipping.
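
For readers unfamiliar with this kind of gate, here is a minimal sketch of a Chromium major-version check. The helper names are hypothetical and the helper removed in this PR may have looked different. The sketch assumes only that the version arrives as a dotted string, as returned by Playwright's `browser.version()`.

```typescript
// Hypothetical helper: parse the major version from a dotted version string.
// Returns 0 when the string does not start with a number.
function chromiumMajorVersion(version: string): number {
	const major = parseInt(version.split('.')[0], 10);
	return isNaN(major) ? 0 : major;
}

// Hypothetical predicate: true when a Chromium build is at or above minMajor.
// The default of 148 mirrors the gate described in this PR.
function shouldSkipOnChromium(
	browserName: string,
	version: string,
	minMajor: number = 148
): boolean {
	return browserName === 'chromium' && chromiumMajorVersion(version) >= minMajor;
}

// In a spec file, the predicate could be wired up roughly like this
// (shown as a comment so the sketch stays free of dependencies):
//
// test.beforeEach(({ browserName, browser }) => {
//   test.skip(
//     shouldSkipOnChromium(browserName, browser.version()),
//     'Skipped on Chromium >= 148'
//   );
// });
```

Removing the gate, as this branch does, amounts to deleting the `test.skip()` call and the version helper it relied on.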

Why?

In #79342 the CSM e2e suite is skip-gated on Chrome >= 148. That gate exists because the upgraded CI browser is the first to support Document-Isolation-Policy, which is what activates CSM in CI. Activating it exposed a timing-sensitive race in the multi-threaded wasm-vips worker: the decoder intermittently receives a short or garbled source buffer, and libheif aborts with "bad seek" / "Bitstream not supported". That race is tracked in #79377 and fixed by #79378.

The retry fix has unit tests, but the underlying race is fragile and contention-dependent and could not be reproduced reliably on an idle dev machine. The only faithful validation is a CI run of the CSM suite on Chrome 148+ with the suite un-skipped. This branch sets that up.

How?

  1. Branched off fix/preview-interstitial-dip-isolation (#79342), so it inherits the Playwright/Chrome 148 upgrade and the preview fix.
  2. Cherry-picked the decode-retry commit from #79378 (withVipsRetry() around the four File-based worker calls + unit tests).
  3. Removed the Chromium >= 148 skip gate and the now-unused version helper from client-side-media-processing.spec.js so the CSM tests run.
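
As an illustration of the decode-retry idea from step 2, here is a minimal sketch of a bounded retry that re-reads the source before each attempt, backs off briefly, and stops once the signal is aborted. It is not the `withVipsRetry()` implementation from #79378. The function name, the attempt count, the backoff value, and the `AbortLike` type are all assumptions made for this example.

```typescript
// Assumed values: the PR names VIPS_MAX_ATTEMPTS but does not state its value.
const SKETCH_MAX_ATTEMPTS = 3;
const SKETCH_BACKOFF_MS = 50;

// Minimal stand-in for an AbortSignal, so the sketch has no DOM dependency.
interface AbortLike {
	readonly aborted: boolean;
}

// Hypothetical helper: re-read the source and re-issue the call on failure.
async function withRetrySketch<T>(
	readSource: () => Promise<ArrayBuffer>,
	call: (buffer: ArrayBuffer) => Promise<T>,
	signal?: AbortLike,
	maxAttempts: number = SKETCH_MAX_ATTEMPTS,
	backoffMs: number = SKETCH_BACKOFF_MS
): Promise<T> {
	let lastError: unknown;
	for (let attempt = 1; attempt <= maxAttempts; attempt++) {
		// Short-circuit: never start an attempt once the upload was cancelled.
		if (signal?.aborted) {
			throw new Error('Aborted');
		}
		try {
			// A fresh read on every attempt, so a transient short buffer is not reused.
			return await call(await readSource());
		} catch (error) {
			lastError = error;
			// Wait before the next attempt, but not after the final one.
			if (attempt < maxAttempts && !signal?.aborted) {
				await new Promise<void>((resolve) => setTimeout(resolve, backoffMs));
			}
		}
	}
	// Genuinely undecodable input still fails after the final attempt.
	throw lastError;
}
```

The sketch takes the read and the call as separate callbacks so that each attempt starts from newly read bytes, whatever caused the earlier short buffer.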

Testing Instructions

Watch CI. The goal is to confirm the client-side media processing e2e shard runs (no longer skipped) and passes on Chrome 148+ with the retry fix in place.

The preload and performance suites keep their existing skip gates. They hit a separate, not-yet-root-caused startup timeout under cross-origin isolation on Chrome 148+ and are out of scope here.

Screenshots or screencast

N/A - CI validation only.

Mamaduka and others added 12 commits June 16, 2026 13:19

The editor screen sends Document-Isolation-Policy: isolate-and-credentialless
for cross-origin isolation. This places the editor tab and an already-open
preview tab in separate agent clusters, so reusing a preview tab and
synchronously accessing previewWindow.document to write the interstitial
throws a SecurityError. The preview then keeps showing stale content.

Reset the reused tab to about:blank (which returns it to the opener's agent
cluster) and poll until its document is reachable before writing the
interstitial, instead of accessing the isolated document directly. The
interstitial is treated as a progressive enhancement and skipped if the
document never becomes reachable; the preview still navigates to the real
content.
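
The polling step can be sketched as follows. This is a minimal illustration, not the code in the commit: the function names, the timeout and interval values, and the `WindowLike` type are assumptions, and the reset of the reused tab to about:blank is assumed to have happened before the poll starts.

```typescript
// Minimal stand-in for a window handle whose document getter may throw
// (for example with a SecurityError while the tab is still isolated).
interface WindowLike {
	document: unknown;
}

// True when reading win.document neither throws nor yields null/undefined.
function isDocumentReachable(win: WindowLike): boolean {
	try {
		return win.document != null;
	} catch (error) {
		return false;
	}
}

// Poll until the document is reachable or the timeout elapses.
// Resolves to false on timeout so the caller can skip the interstitial
// and still navigate the preview to the real content.
async function waitForReachableDocument(
	win: WindowLike,
	timeoutMs: number = 1000,
	intervalMs: number = 50
): Promise<boolean> {
	const deadline = Date.now() + timeoutMs;
	while (true) {
		if (isDocumentReachable(win)) {
			return true;
		}
		if (Date.now() >= deadline) {
			return false;
		}
		await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
	}
}
```

Returning a boolean rather than throwing matches the progressive-enhancement framing: a `false` result only means the interstitial is skipped.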

This surfaced via the Playwright 1.60+ upgrade, which ships Chrome 148.

…atest-base

# Conflicts:
#	package-lock.json
#	test/storybook-playwright/package.json

…nberg into fix/preview-interstitial-dip-isolation

The root package.json still pinned @playwright/test at ^1.58.2 while the
workspaces were bumped to ^1.61.0, failing 'lint:deps' (syncpack) which
requires a single version across the repo.

…nberg into fix/preview-interstitial-dip-isolation

The Playwright 1.60 upgrade ships Chrome for Testing 148, which has a
regression in the cross-origin isolated `isolate-and-credentialless`
Document-Isolation-Policy runtime that Gutenberg sends on editor screens.

Under it, three suites fail although the product behaves correctly on
shipping Chrome: client-side media processing silently falls back to the
server (so format/rotation/sub-size assertions break), and the preload
and Loading Patterns specs never reach a settled state so they time out.

Gate these specs on the major Chromium version so they skip on 148+ until
the browser regression is resolved, mirroring the existing 137+ gate used
for Document-Isolation-Policy. See
WordPress#78632.

The skip comments claimed CSM "silently falls back to the server" and read
as a user-facing product regression. Manual testing shows shipping Google
Chrome processes client-side media correctly (AVIF upload verified on stable
149 and Canary 151); the failures are confined to the Chrome for Testing
build Playwright bundles in CI after the 148/149 bump. Reword the comments to
state the verified, CI-specific nature and avoid asserting an unconfirmed
mechanism.

The Chromium >=148 skip comments attributed the failures to a
Document-Isolation-Policy regression. Investigation shows that is not the
cause: DIP is what first enables cross-origin isolation in the CI browser
(Chrome <148 never became crossOriginIsolated, so CSM was inactive and the
CSM specs simply skipped). With CSM now active under automation, uploads
hit a timing-sensitive race in the multi-threaded wasm-vips worker - the
decoder intermittently receives a short buffer and libheif aborts, surfacing
as IMAGE_TRANSCODING_ERROR. The same wasm-vips decodes the same fixtures in
Node and in manual Chrome, so it is automation-timing, not a user regression.

Reword the CSM skip to describe the race and link the tracking issue, and
reword the preload/performance skips (a separate, not-yet-root-caused
cross-origin-isolation startup timeout) to drop the inaccurate
"DIP is broken" wording.

Tracking: WordPress#79377

In the cross-origin-isolated, multi-threaded wasm-vips worker, image decode
calls (resize, transcode, rotate) can intermittently fail when the worker
receives a short or garbled source buffer under heavy main-thread
contention. libheif aborts with a "bad seek" / "Bitstream not supported"
error that surfaces as a generic IMAGE_TRANSCODING_ERROR and cancels the
upload, even though the same bytes decode correctly on a later attempt (and
in Node and in manual Chrome).

Wrap the four File-based worker calls in a bounded retry helper that
re-reads the source buffer and re-issues the call up to VIPS_MAX_ATTEMPTS
times with a short backoff, short-circuiting on an aborted signal.
Mechanism-agnostic: whatever yields the transient short buffer, a fresh
re-read on a later task recovers; genuinely undecodable images still fail
after the final attempt. Add unit tests covering retry-then-succeed,
give-up-after-max, and stop-on-abort.

Issue: WordPress#79377

Remove the Chromium >= 148 skip gate (and the now-unused version helper)
from the client-side media processing e2e suite so the tests run under the
upgraded Playwright browser. With the wasm-vips decode-retry fix cherry-picked
onto this branch, this validates that the previously-skipped CSM assertions
now pass on Chrome 148+ in CI.

These CSM e2e tests never ran in CI before: they skip unless the browser is
cross-origin isolated, which only happens once Document-Isolation-Policy is
active. Now that they run, several assert behavior the feature does not
implement rather than any Chrome 148 / wasm-vips regression. (Verified
wasm-vips processes correctly on Chrome 149 across main-thread,
nested-worker, COOP/COEP and Document-Isolation-Policy isolation.)

- UltraHDR probe: resolve wasm-vips from @wordpress/vips so the dynamic
  import works in a clean CI install where it does not hoist to the repo
  root node_modules.
- PNG->JPEG / JPEG->WebP: CSM, like core, only transcodes generated
  sub-sizes, never the full-size attachment's MIME type. Assert the sub-size
  format instead of the main file's.
- srcset: the editor's default size is `large`, so the block stores the
  large sub-size URL (a registered size that satisfies srcset matching), not
  the -scaled full file. Assert a finalized real-file URL plus the front-end
  srcset, and capture the attachment ID before navigating to the front end
  where window.wp.data is unavailable.
- EXIF auto-rotation: CSM does not bake EXIF orientation into the full-size
  file (it only sideloads a rotated original_image). Mark fixme pending a
  product decision on whether to rotate the main file like core.
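
The sub-size assertion from the PNG->JPEG / JPEG->WebP item can be sketched as a small helper. This is a hedged illustration rather than the spec's actual code: the helper names are hypothetical, and the input shape is assumed to follow the WordPress REST API attachment response, where `media_details.sizes` maps size names to entries with a `mime_type` and includes a `full` entry for the main file.

```typescript
// Assumed subset of a REST API attachment response.
interface AttachmentLike {
	mime_type: string;
	media_details: {
		sizes: Record<string, { mime_type: string }>;
	};
}

// Names of the generated sub-sizes, excluding the `full` entry that
// describes the main file itself.
function subSizeNames(attachment: AttachmentLike): string[] {
	return Object.keys(attachment.media_details.sizes).filter(
		(name) => name !== 'full'
	);
}

// True when at least one sub-size exists and every sub-size has the
// expected MIME type. The main file's MIME type is deliberately ignored.
function allSubSizesHaveMimeType(
	attachment: AttachmentLike,
	expected: string
): boolean {
	const names = subSizeNames(attachment);
	return (
		names.length > 0 &&
		names.every(
			(name) => attachment.media_details.sizes[name].mime_type === expected
		)
	);
}
```

A PNG upload transcoded to JPEG sub-sizes would then satisfy `allSubSizesHaveMimeType(attachment, 'image/jpeg')` while `attachment.mime_type` stays `image/png`.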
@adamsilverstein

Member Author

Closed in favor of adamsilverstein#28


Labels

[Package] Editor /packages/editor
