Skip to content

JS screenshot suite: defend against silent test-hang chains#5125

Open
shai-almog wants to merge 4 commits into
masterfrom
jsport-screenshot-suite-hang-fix
Open

JS screenshot suite: defend against silent test-hang chains#5125
shai-almog wants to merge 4 commits into
masterfrom
jsport-screenshot-suite-hang-fix

Conversation

@shai-almog
Copy link
Copy Markdown
Collaborator

Summary

Master's JavaScript-port screenshot CI is currently timing out (3 consecutive failures on master: #5105 l10n, #5111 dedupe-paint-leak, #5054 modern-themes — each runs 30+ minutes without reaching CN1SS:SUITE:FINISHED). The blow-up test drifts run-to-run but the same two failure patterns surface across all three runs.

This PR patches both patterns at the Java test-runner layer so the suite makes forward progress in the presence of the documented canvas-staleness / NUMBER_FOR_OBJECT recovery=substituted-null bug (host bridge intermittently returns the JS Number 667 for window.getDocument() under the canvas-accumulation tail).

Findings

l10n run (commit defe5ed7b) — hangs at SheetSlideUpAnimationScreenshotTest. NPE in BrowserDomRenderingBackend.createCanvasAbstractAnimationScreenshotTest.captureAndEmit's catch tries to build a placeholder via Image.createImage(w,h,color) which routes through the same NPEing path → re-throw is swallowed by the surrounding finallyemitImage never runs → done() never fires.

dedupe-paint-leak / our-PR runs (commit f0c4acfc5 / 6b7d573dc) — hang at LightweightPickerButtonsScreenshotTest and ButtonThemeScreenshotTest respectively. DualAppearanceBaseTest.done() is gated on bothPhasesComplete; the gate also swallows fail() → done() calls. Combined with EDT-side throws in the emit chain (onShowCompleted → registerReadyCallback's UITimer → triple callSerially → emitCurrentFormScreenshot) that log to the EDT error handler but never call next.run(), the test hangs for the full suite budget.

Fixes

  1. AbstractAnimationScreenshotTest.captureAndEmit — nest the placeholder createImage in its own try/catch; pass null to emitImage on failure. emitImage already handles image==null (emits a placeholder marker, calls onComplete), so done() fires and the suite advances.

  2. DualAppearanceBaseTest.done() override — bypass the bothPhasesComplete gate when isFailed() returns true. fail() actually finalises now instead of being swallowed.

  3. DualAppearanceBaseTest.runTest — wrap the body in a Throwable catch that calls fail() on any synchronous exception out of installModernThemeIfRequested or runAppearance.

  4. DualAppearanceBaseTest.runAppearance — schedule an 8s wall-clock fallback via CN.setTimeout per appearance phase. If the natural emit chain hasn't fired next.run() in that window, the fallback forces it idempotently (a ran[] guard ensures whichever fires first wins, the other is a no-op).

Risk

  • The 8s fallback runs only when the natural chain has stalled. In healthy runs it's a no-op (chain typically finishes in 2-4s, well within the window).
  • The isFailed() escape on done() only changes behaviour for tests that have already had fail() called — pre-existing pass / non-fail flows are unaffected.
  • The null-image fall-through in AbstractAnimationScreenshotTest uses the same code path as emitImage(null, ...) for genuinely-null inputs, so no new behaviour is introduced.
  • All three changes are confined to the test harness under scripts/hellocodenameone/common/ — no framework or port code is touched.

Test plan

  • CI scripts-javascript workflow reaches CN1SS:SUITE:FINISHED within the 1740s budget.
  • Comparison report renders for all tests that successfully captured.
  • Tests whose underlying bug causes the placeholder fall-through emit a placeholder PNG (or screenshot=none marker) and the suite continues.
  • iOS / Android / JavaSE workflows unaffected (the failure mode addressed is JS-port-specific; the EDT error handling and gate behaviour change only kicks in when fail() is called, which native platforms don't currently hit).

🤖 Generated with Claude Code

The JS-port screenshot suite is currently timing out on master.
Three independent suite runs (l10n #5105, dedupe-paint-leak #5111,
modern-themes #5054) all hit the 30-min ``CN1_JS_TIMEOUT_SECONDS``
budget without ever emitting ``CN1SS:SUITE:FINISHED``. The blow-up
test drifts run-to-run but the failure pattern is the same: a single
test's done() never fires and the suite-level wait expires.

Two reproducible patterns surface in the CI browser logs:

  1. Animation-grid placeholder failure
     ``AbstractAnimationScreenshotTest.captureAndEmit`` builds the
     grid inside a try/catch and falls back to
     ``Image.createImage(w,h,color)`` if the per-frame composition
     throws. Under the canvas-accumulation tail (~80 prior tests)
     the JS port's ``BrowserDomRenderingBackend.createCanvas`` is
     hit by the known ``NUMBER_FOR_OBJECT recovery=substituted-null``
     bug — ``window.getDocument()`` intermittently returns the JS
     Number 667 (viewport height), ``createElement('canvas')`` then
     resolves to null, and ``canvas.setWidth(...)`` NPEs. The
     placeholder createImage routes through the same path and
     re-throws the identical NPE. The original catch swallows the
     re-throw via the surrounding ``finally``, ``emitImage`` is
     never reached, and ``done()`` never fires. Symptom: l10n run
     hangs forever at ``SheetSlideUpAnimationScreenshotTest``
     after one ``animation_grid_failed=java.lang.NullPointerException``
     entry.

     Fix: nest the placeholder createImage in its own try/catch;
     pass ``null`` to ``emitImage`` on failure. ``emitImage``
     already handles ``image==null`` by emitting a placeholder
     marker and still calling the ``onComplete`` runnable, so
     ``done()`` fires and the suite advances.

  2. DualAppearance done() gate swallows fail()
     ``DualAppearanceBaseTest.done()`` is gated by
     ``bothPhasesComplete`` so the JS port's emit-completion
     force-done can't finalise the test between the light and dark
     captures. The gate also swallows ``fail()`` → ``done()`` calls,
     which is the path port.js's ``lambdaBridge`` catch takes when
     ``runTest`` throws synchronously: the test fails, but the
     gate ignores the done() and the suite hangs.

     Additionally, the EDT-side emit chain
     (onShowCompleted → registerReadyCallback's UITimer →
     triple callSerially → emitCurrentFormScreenshot →
     Display.screenshot → chunked emit → onComplete) silently
     breaks when any link throws — the EDT error handler logs it
     but the chain's ``next.run()`` never fires.

     Fix:
       - ``done()`` override now bypasses the
         ``bothPhasesComplete`` gate when ``isFailed()`` returns
         true, so fail() actually finalises.
       - ``runTest`` body wrapped in a Throwable catch that
         logs + calls ``fail()`` on any synchronous exception
         out of installModernThemeIfRequested or runAppearance.
       - Each appearance phase now schedules an 8s wall-clock
         fallback via ``CN.setTimeout``. If the natural emit
         chain hasn't fired ``next.run()`` in that window, the
         fallback forces it (idempotently — whichever fires
         first wins). Without this, an uncaught EDT throw in
         the chain hangs the test for the entire suite budget.

Combined effect: the suite now makes forward progress in the
presence of the documented canvas-staleness / NUMBER_FOR_OBJECT
bug. Individual tests may still produce placeholder PNGs or warn
about a stalled chain, but the suite finishes and downstream
comparison runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@shai-almog
Copy link
Copy Markdown
Collaborator Author

shai-almog commented May 30, 2026

Compared 122 screenshots: 122 matched.

Native Android coverage

  • 📊 Line coverage: 12.89% (7507/58219 lines covered) [HTML preview] (artifact android-coverage-report, jacocoAndroidReport/html/index.html)
    • Other counters: instruction 10.47% (37527/358412), branch 4.38% (1478/33726), complexity 5.47% (1778/32479), method 9.56% (1458/15258), class 15.61% (332/2127)
    • Lowest covered classes
      • kotlin.collections.kotlin.collections.ArraysKt___ArraysKt – 0.00% (0/6327 lines covered)
      • kotlin.collections.unsigned.kotlin.collections.unsigned.UArraysKt___UArraysKt – 0.00% (0/2384 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.ClassReader – 0.00% (0/1519 lines covered)
      • kotlin.collections.kotlin.collections.CollectionsKt___CollectionsKt – 0.00% (0/1148 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.MethodWriter – 0.00% (0/923 lines covered)
      • kotlin.sequences.kotlin.sequences.SequencesKt___SequencesKt – 0.00% (0/730 lines covered)
      • kotlin.text.kotlin.text.StringsKt___StringsKt – 0.00% (0/623 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.Frame – 0.00% (0/564 lines covered)
      • kotlin.collections.kotlin.collections.ArraysKt___ArraysJvmKt – 0.00% (0/495 lines covered)
      • kotlinx.coroutines.kotlinx.coroutines.JobSupport – 0.00% (0/423 lines covered)

✅ Native Android screenshot tests passed.

Native Android coverage

  • 📊 Line coverage: 12.89% (7507/58219 lines covered) [HTML preview] (artifact android-coverage-report, jacocoAndroidReport/html/index.html)
    • Other counters: instruction 10.47% (37527/358412), branch 4.38% (1478/33726), complexity 5.47% (1778/32479), method 9.56% (1458/15258), class 15.61% (332/2127)
    • Lowest covered classes
      • kotlin.collections.kotlin.collections.ArraysKt___ArraysKt – 0.00% (0/6327 lines covered)
      • kotlin.collections.unsigned.kotlin.collections.unsigned.UArraysKt___UArraysKt – 0.00% (0/2384 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.ClassReader – 0.00% (0/1519 lines covered)
      • kotlin.collections.kotlin.collections.CollectionsKt___CollectionsKt – 0.00% (0/1148 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.MethodWriter – 0.00% (0/923 lines covered)
      • kotlin.sequences.kotlin.sequences.SequencesKt___SequencesKt – 0.00% (0/730 lines covered)
      • kotlin.text.kotlin.text.StringsKt___StringsKt – 0.00% (0/623 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.Frame – 0.00% (0/564 lines covered)
      • kotlin.collections.kotlin.collections.ArraysKt___ArraysJvmKt – 0.00% (0/495 lines covered)
      • kotlinx.coroutines.kotlinx.coroutines.JobSupport – 0.00% (0/423 lines covered)

Benchmark Results

Detailed Performance Metrics

Metric Duration
Base64 payload size 8192 bytes
Base64 benchmark iterations 6000
Base64 native encode 930.000 ms
Base64 CN1 encode 161.000 ms
Base64 encode ratio (CN1/native) 0.173x (82.7% faster)
Base64 native decode 818.000 ms
Base64 CN1 decode 307.000 ms
Base64 decode ratio (CN1/native) 0.375x (62.5% faster)
Image encode benchmark status skipped (SIMD unsupported)

Initial CI for the safety-net patches confirmed the design works —
ButtonTheme and TextFieldTheme now finish (previously hung the
suite), and 87 of ~95 tests reach completion. But two refinements
are needed before the suite walks the full DEFAULT_TEST_CLASSES
array inside its budget:

1. ``DualAppearanceBaseTest.finish()`` runs on the EDT after the
   dark capture's completion runnable returns. Under the
   canvas-accumulation tail it can hit the host-bridge ``Missing
   JS member createElement`` cascade inside refreshTheme's form
   re-paint walk. The throw skips ``bothPhasesComplete = true``
   so the gated done() never fires and the test hangs for the
   rest of the suite budget. Wrap the cleanup in try/finally so
   the gate lifts and done() fires regardless of whether the
   restore-to-default-theme work succeeds. Next test inherits
   whatever theme state the throw left behind (best-effort
   teardown), but the suite advances.

2. CN1_JS_TIMEOUT_SECONDS 1800 → 2700 (and matching browser
   lifetime 1740 → 2640). #5054 introduced 14 modern-theme
   tests, each light + dark phase eats ~16-20s wall on shared
   GHA runners with the canvas-accumulation slowdown. Total
   walks past 30 min; 45 min absorbs the worst case. Job
   timeout is 6h so this is well within the GHA budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 30, 2026

✅ Continuous Quality Report

Test & Coverage

Static Analysis

  • SpotBugs [Report archive]
    • ByteCodeTranslator: 0 findings (no issues)
    • android: 0 findings (no issues)
    • codenameone-maven-plugin: 0 findings (no issues)
    • core-unittests: 0 findings (no issues)
    • ios: 0 findings (no issues)
  • PMD: 0 findings (no issues) [Report archive]
  • Checkstyle: 0 findings (no issues) [Report archive]

Generated automatically by the PR CI workflow.

@shai-almog
Copy link
Copy Markdown
Collaborator Author

shai-almog commented May 30, 2026

Compared 122 screenshots: 122 matched.
✅ Native Mac screenshot tests passed.

Benchmark Results

  • VM Translation Time: 0 seconds
  • Compilation Time: 91 seconds

Detailed Performance Metrics

Metric Duration
Base64 payload size 8192 bytes
Base64 benchmark iterations 6000
Base64 native encode 546.000 ms
Base64 CN1 encode 1051.000 ms
Base64 encode ratio (CN1/native) 1.925x (92.5% slower)
Base64 native decode 382.000 ms
Base64 CN1 decode 834.000 ms
Base64 decode ratio (CN1/native) 2.183x (118.3% slower)
Base64 SIMD encode 363.000 ms
Base64 encode ratio (SIMD/native) 0.665x (33.5% faster)
Base64 encode ratio (SIMD/CN1) 0.345x (65.5% faster)
Base64 SIMD decode 356.000 ms
Base64 decode ratio (SIMD/native) 0.932x (6.8% faster)
Base64 decode ratio (SIMD/CN1) 0.427x (57.3% faster)
Image encode benchmark iterations 100
Image createMask (SIMD off) 55.000 ms
Image createMask (SIMD on) 9.000 ms
Image createMask ratio (SIMD on/off) 0.164x (83.6% faster)
Image applyMask (SIMD off) 129.000 ms
Image applyMask (SIMD on) 75.000 ms
Image applyMask ratio (SIMD on/off) 0.581x (41.9% faster)
Image modifyAlpha (SIMD off) 130.000 ms
Image modifyAlpha (SIMD on) 77.000 ms
Image modifyAlpha ratio (SIMD on/off) 0.592x (40.8% faster)
Image modifyAlpha removeColor (SIMD off) 149.000 ms
Image modifyAlpha removeColor (SIMD on) 78.000 ms
Image modifyAlpha removeColor ratio (SIMD on/off) 0.523x (47.7% faster)
Image PNG encode (SIMD off) 868.000 ms
Image PNG encode (SIMD on) 703.000 ms
Image PNG encode ratio (SIMD on/off) 0.810x (19.0% faster)
Image JPEG encode 377.000 ms

@shai-almog
Copy link
Copy Markdown
Collaborator Author

shai-almog commented May 30, 2026

Compared 122 screenshots: 122 matched.
✅ Native iOS screenshot tests passed.

Benchmark Results

  • VM Translation Time: 0 seconds
  • Compilation Time: 264 seconds

Build and Run Timing

Metric Duration
Simulator Boot 93000 ms
Simulator Boot (Run) 1000 ms
App Install 30000 ms
App Launch 69000 ms
Test Execution 375000 ms

Detailed Performance Metrics

Metric Duration
Base64 payload size 8192 bytes
Base64 benchmark iterations 6000
Base64 native encode 781.000 ms
Base64 CN1 encode 2412.000 ms
Base64 encode ratio (CN1/native) 3.088x (208.8% slower)
Base64 native decode 463.000 ms
Base64 CN1 decode 1604.000 ms
Base64 decode ratio (CN1/native) 3.464x (246.4% slower)
Base64 SIMD encode 931.000 ms
Base64 encode ratio (SIMD/native) 1.192x (19.2% slower)
Base64 encode ratio (SIMD/CN1) 0.386x (61.4% faster)
Base64 SIMD decode 616.000 ms
Base64 decode ratio (SIMD/native) 1.330x (33.0% slower)
Base64 decode ratio (SIMD/CN1) 0.384x (61.6% faster)
Image encode benchmark iterations 100
Image createMask (SIMD off) 113.000 ms
Image createMask (SIMD on) 42.000 ms
Image createMask ratio (SIMD on/off) 0.372x (62.8% faster)
Image applyMask (SIMD off) 331.000 ms
Image applyMask (SIMD on) 160.000 ms
Image applyMask ratio (SIMD on/off) 0.483x (51.7% faster)
Image modifyAlpha (SIMD off) 234.000 ms
Image modifyAlpha (SIMD on) 90.000 ms
Image modifyAlpha ratio (SIMD on/off) 0.385x (61.5% faster)
Image modifyAlpha removeColor (SIMD off) 276.000 ms
Image modifyAlpha removeColor (SIMD on) 118.000 ms
Image modifyAlpha removeColor ratio (SIMD on/off) 0.428x (57.2% faster)
Image PNG encode (SIMD off) 1496.000 ms
Image PNG encode (SIMD on) 1394.000 ms
Image PNG encode ratio (SIMD on/off) 0.932x (6.8% faster)
Image JPEG encode 666.000 ms

shai-almog and others added 2 commits May 31, 2026 01:17
…ost ref

Runs 1 and 3 of the screenshot suite still hang at
ButtonThemeScreenshotTest immediately after
ToastBarTopPositionScreenshotTest's ``canvasContextWipe`` force-
timeout. The pattern is identical across both runs: ToastBar's skip
fires (port.js logs ``forcedTimeout:canvasContextWipe:HIT``), the
runner advances, and the next ``CN1SS:INFO:suite starting
test=ButtonThemeScreenshotTest`` line is the last entry before the
suite times out — no further logs, no lambda2RunBridge polls, no
``done()``.

The browser-side stack trace surfaced in our second run shows the
worker is throwing ``RuntimeException: Error: Missing JS member
createElement for host receiver`` repeatedly from inside the EDT
paint chain. Root cause is the documented "NUMBER_FOR_OBJECT
recovery=substituted-null" host-bridge bug: ``window.getDocument()``
intermittently returns the JS Number 667 (viewport height) instead
of a Document; ``createElement('canvas')`` on the wrapped Number
resolves to undefined; the worker's ``__cn1CachedDocWrapper`` then
carries that broken host ref for the rest of the suite.

The existing invalidation at ~line 1146 triggers only when
``!__cn1CachedDocWrapper.__class`` — but the 667-flavour wrapper
*does* have a valid ``__class`` (it was constructed as an
HTMLDocument wrapper), so the check returns the cached object
and the next ``createElement`` lookup hits the broken host ref.

Two narrow defenses:

  1. Sanity-check ``__cn1CachedDocWrapper.__cn1HostRef`` before
     returning the cached wrapper. If the ref isn't an object
     (Number / undefined / null), treat the cache as broken,
     null it, and re-fetch through the host bridge. The
     re-fetch round-trip is dramatically more likely to return
     a real Document than the cached Number.

  2. Invalidate ``win.__cn1CachedDocWrapper`` in the
     ``forcedTimeoutReason``-handler path before calling
     ``finalizeMethod``. Force-timeout reasons like
     ``canvasContextWipe`` are precisely the signal that the
     host-bridge wrappers got corrupted by the prior test's
     canvas activity. Clearing the cache before finalising the
     skipped test ensures the next test starts with a fresh
     ``getDocument()`` lookup instead of inheriting the broken
     cached wrapper.

Neither change touches the host-bridge state itself — the
underlying NUMBER_FOR_OBJECT bug remains for a separate
investigation. These defenses just stop the corrupted wrapper
from cascading across tests and hanging the suite on the next
test that paints a canvas.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant