test(examples-chat): kill aimock-e2e flake (chunkSize + data-streaming wait) by blove · Pull Request #327 · cacheplane/angular-agent-framework

blove · 2026-05-15T18:27:28Z

Summary

Two-part fix for the recurring e2e flake noted in #314 and #322.

What caused the flake

Aggressive default chunking — the mock LLM server's default streaming chunkSize sometimes split a triple-backtick fence mid-token, leaving partial-markdown unable to recover; the final rendered DOM contained inline <code> instead of <pre><code>. Showed up as the "code fence" spec failing.
Asserting on intermediate streaming-state DOM — the markdown specs counted <li> immediately after seeing assistant text, sometimes catching a transient 1-or-2-item state during streaming. Showed up as the "bullet list" spec failing.
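
The first cause can be reproduced in isolation. The sketch below is a hypothetical illustration, not the mock server's actual chunker: `chunkText`, `splitsFence`, and the sample `reply` are names invented here, and fixed-size character chunking is an assumption. It shows how a small chunk size can put a chunk boundary inside a triple-backtick fence, and how a chunk size larger than the response avoids any boundary.

```typescript
// Hypothetical illustration; not the mock server's actual chunker.
const FENCE = "`".repeat(3); // a triple-backtick code fence

// Split text into fixed-size pieces, as a fixed-size streaming chunker would.
function chunkText(text: string, chunkSize: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// True when a chunk boundary lands strictly inside a fence.
function splitsFence(text: string, chunkSize: number): boolean {
  for (
    let at = text.indexOf(FENCE);
    at !== -1;
    at = text.indexOf(FENCE, at + FENCE.length)
  ) {
    for (let offset = 1; offset < FENCE.length; offset++) {
      if ((at + offset) % chunkSize === 0) return true;
    }
  }
  return false;
}

// Sample assistant reply: six characters of prose, then a fenced block.
const reply = "Here:\n" + FENCE + "ts\nconst x = 1;\n" + FENCE + "\n";

console.log(splitsFence(reply, 7)); // true: the first chunk ends after one backtick
console.log(splitsFence(reply, 4096)); // false: the whole reply fits in one chunk
```

With a chunk size of 7 the first chunk ends in a lone backtick, which is the kind of partial input an incremental markdown renderer may read as the start of inline code.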

Fix

Set chunkSize: 4096 on the runner so each response arrives in 1–2 SSE deltas. Streaming-progressive behavior is already covered by Phase 1's unit-variance tables (#305); the e2e harness tests final-state invariants and cross-stack integration, not the streaming partial-render path.
Extract a sendPromptAndWait helper in test-helpers.ts that waits on chat-message[data-role="assistant"][data-streaming="false"] before returning the finalized bubble. The chat composition already exposes this DOM contract — wiring [streaming]="agent.isLoading() && i === lastIndex" to chat-message's host attribute — but the specs weren't using it. Smoke, markdown, and A2UI specs now route through the helper.
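
A minimal sketch of what such a helper could look like follows. The real `sendPromptAndWait` in test-helpers.ts is not shown in this description, so everything beyond the selector quoted above is an assumption: the `PageLike` and `LocatorLike` interfaces are hypothetical stand-ins that mirror a small subset of Playwright's `Page` and `Locator` methods (so the sketch compiles without Playwright installed), the `textarea` composer selector and submitting with Enter are guesses, and counting existing bubbles before sending is one possible way to avoid matching an earlier, already finalized reply.

```typescript
// Hypothetical structural subset of Playwright's Locator and Page.
interface LocatorLike {
  fill(value: string): Promise<void>;
  press(key: string): Promise<void>;
  count(): Promise<number>;
  nth(index: number): LocatorLike;
  waitFor(options?: { state?: "attached" | "visible"; timeout?: number }): Promise<void>;
}

interface PageLike {
  locator(selector: string): LocatorLike;
}

const ASSISTANT = 'chat-message[data-role="assistant"]';
// The DOM contract named in the PR: a finalized bubble has data-streaming="false".
const FINALIZED = ASSISTANT + '[data-streaming="false"]';

// Sends a prompt and resolves with the new assistant bubble once it is finalized.
// Assumes every earlier assistant bubble is already finalized and that each
// prompt produces exactly one assistant bubble.
async function sendPromptAndWait(
  page: PageLike,
  prompt: string,
  composerSelector = "textarea", // assumed selector, not taken from the PR
): Promise<LocatorLike> {
  // Number of assistant bubbles present before this prompt is sent.
  const before = await page.locator(ASSISTANT).count();

  const composer = page.locator(composerSelector);
  await composer.fill(prompt);
  await composer.press("Enter");

  // The new bubble becomes finalized match number `before` (zero-based)
  // only after its data-streaming attribute flips to "false".
  const bubble = page.locator(FINALIZED).nth(before);
  await bubble.waitFor({ state: "visible" });
  return bubble;
}
```

A spec would then assert on the returned bubble only, for example counting `<li>` elements inside it, so the assertion never runs against a partially streamed list.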

Verification

Ran the full Playwright suite 5 times consecutively locally: 5/5 clean (no flakes). Before this PR, runs failed 2/5 to 3/5 on either the code-fence or the bullet-list spec. Runner unit tests still pass (3/3).

Note on the streaming-DOM contract

While investigating I confirmed the data-streaming attribute on <chat-message> already exists at libs/chat/src/lib/primitives/chat-message/chat-message.component.ts:28. No @ngaf/chat change was needed — this was a test-side bug, not a library feature gap.
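
The binding expression quoted in the Fix section, `agent.isLoading() && i === lastIndex`, can be read as a pure function of the loading flag and the message index. The sketch below restates it outside Angular; `streamingFlags` is a hypothetical name used only for illustration and is not part of @ngaf/chat.

```typescript
// Hypothetical restatement of the [streaming] binding as a pure function.
// Returns, for each message index, whether that message counts as streaming.
function streamingFlags(isLoading: boolean, messageCount: number): boolean[] {
  const lastIndex = messageCount - 1;
  return Array.from(
    { length: messageCount },
    (_, i) => isLoading && i === lastIndex,
  );
}

console.log(streamingFlags(true, 3)); // [ false, false, true ]
console.log(streamingFlags(false, 3)); // [ false, false, false ]
```

Only the last message can ever be marked as streaming, and it stops being marked as soon as loading ends, which is the transition the test helper waits for.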

Test plan

Full suite passes 5/5 locally
Runner unit tests still pass (3/3, including the directory-mode test)
No production code touched
CI green

vercel · 2026-05-15T18:27:31Z

Project	Deployment	Actions	Updated (UTC)
cacheplane	Ready	Preview, Comment	May 15, 2026 6:32pm

…g wait) (#327)

* test(examples-chat): set aimock chunkSize=4096 to defeat fence-split flake

  Aggressive default chunking sometimes splits a triple-backtick mid-token, producing inline <code> rendering instead of <pre><code>. The harness tests measure FINAL rendered structure (streaming-progressive behavior is covered by the Phase 1 unit-variance tables), so single-chunk replay is the right tradeoff. Comment in the runner documents the choice.

* test(examples-chat): extract sendPromptAndWait helper, wait for data-streaming=false

  Asserting on intermediate streaming-state DOM is the other source of e2e flake. The chat composition flips chat-message[data-streaming] to 'false' when the agent's isLoading() goes false; helper waits on that DOM contract before returning the finalized bubble. Smoke, markdown, and A2UI specs all route through the helper now.

blove added 2 commits May 15, 2026 11:26

blove merged commit 1c08e1f into main May 15, 2026
16 checks passed

blove merged 2 commits into main from claude/chat-streaming-dom-contract
