📸 TUI snapshots — express-todo#2030

Closed
wizard-ci-bot[bot] wants to merge 18 commits into main from snapshots/express-todo/28002778142

@wizard-ci-bot wizard-ci-bot Bot commented Jun 23, 2026

TUI snapshot review — basic-integration/javascript-node/express-todo

A real --e2e run, rendered to side-by-side screenshots (baseline │ current). 23 of 23 key-moment frames differ. Eyeball them; merge this PR to accept the new baseline.

CI run

flow

🔵 01-intro.txt — added
🔵 02-auth.txt — added
🔵 03-run.txt — added
🔵 04-run.txt — added
🔵 05-run.txt — added
🔵 06-run.txt — added
🔵 07-run.txt — added
🔵 08-run.txt — added
🔵 09-run.txt — added
🔵 10-run.txt — added
🔵 11-run.txt — added
🔵 12-run.txt — added
🔵 13-run.txt — added
🔵 14-run.txt — added
🔵 15-run.txt — added
🔵 16-run.txt — added
🔵 17-run.txt — added
🔵 18-run.txt — added
🔵 19-run.txt — added
🔵 20-outro.txt — added
🔵 21-mcp.txt — added
🔵 22-slack-connect.txt — added
🔵 23-keep-skills.txt — added

gewenyu99 and others added 18 commits June 21, 2026 11:07
Adds two modes to the existing wizard-ci, as an alternative to classic --ci
(LoggingUI: agent-only, stdout-grep). --e2e drives the WHOLE interactive flow
headlessly through the wizard-ci-tools control plane and asserts on structured
state; --replay plays a recorded run back in the terminal.

Core files:
- services/wizard-ci/e2e.ts — runE2e(): /tmp app-copy isolation, env hygiene
  (strips host CLAUDE*/ANTHROPIC* so the spawned agent auths with the phx key
  instead of deferring to the host), scoped --project-id, the happy-path policy
  (skip mcp+slack, delete skills, continue past health issues), spawns the
  wizard repo's headless harness, then asserts the structured result
  (runPhase=completed, posthog dep/.env, reached keep-skills, skillsComplete).
  replayRecording(): shells to the wizard repo's terminal replayer.
- services/wizard-ci/index.ts — wires --e2e (positional app, --project-id,
  --keep-skills) and --replay (--step/--delay) into the CLI + --help.

Engine lives in the wizard repo (store + driver must run in-process); point
WIZARD_PATH at it. See PostHog/wizard PR for src/lib/ci-driver + harness.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
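
The env hygiene that the commit above describes for runE2e() (stripping host CLAUDE*/ANTHROPIC* variables before spawning the agent) could look roughly like the sketch below. The function name `scrubAgentEnv` and its signature are hypothetical and are not taken from e2e.ts; only the two prefixes come from the commit message.

```typescript
// Hypothetical helper: name and signature are illustrative, not from e2e.ts.
type Env = Record<string, string | undefined>;

// Prefixes the commit message says are stripped from the host environment.
const STRIPPED_PREFIXES = ["CLAUDE", "ANTHROPIC"];

// Return a copy of `env` without any variable whose name starts with a
// stripped prefix. The input object is left untouched.
function scrubAgentEnv(env: Env): Env {
  const clean: Env = {};
  for (const key of Object.keys(env)) {
    const stripped = STRIPPED_PREFIXES.some((prefix) => key.indexOf(prefix) === 0);
    if (!stripped) {
      clean[key] = env[key];
    }
  }
  return clean;
}
```

In a Node process this would presumably be called as `scrubAgentEnv(process.env)` and the result passed as the child's `env`, so the spawned agent cannot fall back to host credentials.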
…nitions

Run each CI-e2e test definition (for now: integration on express-todo) as a real
--e2e agent run, render every key-moment frame of the recording to a real-Ink
ANSI snapshot, and diff against a committed baseline. Surfaces run-to-run
differences (e.g. the agent enqueuing tasks differently) side-by-side for a human
to review — same screens every run, deltas flagged. No mocks: real agent, real
recording, real render.

- services/wizard-ci/snapshots.ts — the flow (run → render → diff → report)
- services/wizard-ci/ansi-html.ts — dependency-free ANSI→HTML for the side-by-side
- services/wizard-ci/snapshots/express-todo/ — committed baseline (47 frames)
- pnpm wizard-ci-snapshots (+ mprocs entry); --update to accept a new baseline

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
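
As a rough illustration of the "diff against a committed baseline" step described above, the sketch below classifies each frame by comparing two name-to-text maps. The types and the `diffFrames` name are assumptions made for this example; snapshots.ts may structure this differently.

```typescript
// Hypothetical shapes: a frame set maps a file name (e.g. "03-run.txt")
// to the captured text of that screen.
type Frames = Record<string, string>;
type FrameStatus = "same" | "changed" | "added" | "removed";

function has(frames: Frames, name: string): boolean {
  return Object.prototype.hasOwnProperty.call(frames, name);
}

// Classify every frame that appears in either the baseline or the current run.
function diffFrames(baseline: Frames, current: Frames): Record<string, FrameStatus> {
  const result: Record<string, FrameStatus> = {};
  const names = Object.keys(baseline).concat(Object.keys(current)).sort();
  for (const name of names) {
    if (!has(baseline, name)) {
      result[name] = "added"; // only in the current run
    } else if (!has(current, name)) {
      result[name] = "removed"; // only in the baseline
    } else if (baseline[name] === current[name]) {
      result[name] = "same";
    } else {
      result[name] = "changed";
    }
  }
  return result;
}
```

With an empty baseline every current frame is classified as "added", which is consistent with the 23 "added" entries listed in the review comment on this PR.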
The snapshots.ts header now lists what the flow needs in .env
(POSTHOG_PERSONAL_API_KEY, POSTHOG_WIZARD_PROJECT_ID, POSTHOG_REGION) and that
WIZARD_PATH must point at a checkout containing e2e-harness/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A real agent emits frames a little differently run to run (different number of
status updates → shifted indices), so drift is expected. Print the per-frame
diffs + report.html and exit 0; only a genuine failure (run died, no recording)
exits non-zero. Accept a new baseline with --update.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
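
The exit policy in the commit above (drift is reported but exits 0; only a dead run or a missing recording fails) can be sketched as a small pure function. The `SnapshotRun` fields and the `exitCodeFor` name are hypothetical and chosen only for illustration.

```typescript
// Hypothetical summary of one snapshot run; field names are illustrative.
interface SnapshotRun {
  runCompleted: boolean; // false when the agent run died
  hasRecording: boolean; // false when no recording was produced
  changedFrames: number; // frames that differ from the baseline
}

// Drift never fails the run; only a genuine failure yields a non-zero code.
function exitCodeFor(run: SnapshotRun): number {
  if (!run.runCompleted || !run.hasRecording) {
    return 1;
  }
  return 0; // changedFrames is reported elsewhere but does not affect the code
}
```

Keeping the decision separate from the reporting makes it easy to see that `changedFrames` has no influence on the exit code.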
After the diff, prompt "Replay <name> snapshots in the terminal? [y/N]" and, on
yes, launch the replay stepper directly on the run's recording — no copy/paste.
TTY-only (auto-declines in CI so nothing hangs); the replayer inherits stdio for
its own Enter-to-step loop.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
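
A minimal sketch of the TTY gate and the default-No prompt described above, assuming the decision is made from the two stream flags. Both function names are hypothetical; the real code may check different signals.

```typescript
// Hypothetical gate: only prompt when both ends are an interactive terminal.
// In CI neither stream is a TTY, so the prompt is skipped and nothing hangs.
function shouldOfferReplay(stdinIsTTY: boolean, stdoutIsTTY: boolean): boolean {
  return stdinIsTTY && stdoutIsTTY;
}

// "[y/N]" means No is the default: only an explicit y / yes counts as consent.
function isYes(answer: string): boolean {
  return /^y(es)?$/i.test(answer.trim());
}
```

In Node the flags would presumably come from `process.stdin.isTTY` and `process.stdout.isTTY`, which are undefined for non-TTY streams, so coercing them with `Boolean(...)` before the call is advisable.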
Document handing the Wizard to an agent to run/drive/explore it headlessly,
pointing at the runbook (wizard repo e2e-harness/EXPLORING-AS-AN-AGENT.md) with a
copy-paste example prompt that targets wasp-lang/open-saas — the agent works out
how to build + run the target itself.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… comments

The agentic-exploration section belongs in the wizard repo's README, not here.
Also trim snapshots.ts / index.ts comments to concise current-behavior.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…wright)

services/wizard-ci/screenshot.ts rasterizes the side-by-side report — one PNG per
key-moment frame (baseline │ current) plus a full-flow strip — for attaching to a
review PR. Reuses the report HTML (ansi-html), so no new ANSI logic. Adds
playwright as a dev dep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…shots

snapshot-review.ts runs the e2e, renders the report to side-by-side PNGs, and
opens a review PR whose body embeds them (raw URLs), changed frames first —
instead of running the agent evaluator. --dry-run writes the bundle locally.
wizard-snapshots.yml dispatches it (bot token, setup-wizard-deps, Playwright).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`wizard-ci --e2e` and `wizard-ci-snapshots` run the wizard repo's tui-snapshots:
the real wizard TUI, driven by store state manipulation, captured per screen as
text. --e2e asserts on the result JSON it emits; snapshots diff the captured
screens against a committed baseline; snapshot-review rasterizes them to a
side-by-side image PR.

Drops the recording/replay plumbing (the --replay flag, the render step,
ansi-html) — the captured screens are already clean text. WIZARD_PATH defaults to
a sibling wizard checkout.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…as a comment

Comment `/wizard-ci [app] [wizard_ref]` on a PR to run the real-TUI e2e. The
workflow acks with 👀, checks out the PR, runs snapshot-review, and posts a
comment on the PR with the flow strip and a link to the full side-by-side review
(--comment-pr). Restricted to repo members/owner/collaborators. Manual
workflow_dispatch still works; no auto-run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
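
The `/wizard-ci [app] [wizard_ref]` trigger and its access restriction could be handled along the lines of the sketch below. The parsing function, the result type, and the choice to leave omitted arguments undefined are assumptions; the association strings are the values GitHub reports in a comment's `author_association` field.

```typescript
// Hypothetical parsed request; both arguments are optional in the command.
interface WizardCiRequest {
  app: string | undefined;
  wizardRef: string | undefined;
}

// Return the parsed request, or null when the comment is not a /wizard-ci command.
function parseWizardCiComment(body: string): WizardCiRequest | null {
  const tokens = body.trim().split(/\s+/);
  if (tokens[0] !== "/wizard-ci") {
    return null;
  }
  return { app: tokens[1], wizardRef: tokens[2] };
}

// Associations allowed to trigger a run, per the commit message
// (repo members, owner, collaborators).
const ALLOWED_ASSOCIATIONS = ["OWNER", "MEMBER", "COLLABORATOR"];

function isAllowedAuthor(authorAssociation: string): boolean {
  return ALLOWED_ASSOCIATIONS.indexOf(authorAssociation) !== -1;
}
```

How the workflow fills in defaults when `app` or `wizard_ref` is omitted is not stated in the commit message, so the sketch leaves them undefined.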
…rue)

A standalone snapshots job, independent of the evaluator (the eval still runs as
normal). Dispatch "Wizard CI" with snapshots=true to also open a real-TUI review
PR for the app — same app token + setup-wizard-deps + PostHog key as the
evaluator, project hard-coded to 2 (the bot key's project). Because wizard-ci.yml
is on main, this is dispatchable from a PR branch (pre-merge). The /wizard-ci
comment trigger stays in wizard-snapshots.yml.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…snapshot)

The wizard-ci job's Execute step runs the headless eval by default, or — with the
snapshots=true input — the real-TUI snapshot review for the app. One switch, the
same job; no separate parallel job.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…sterize)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…R step)

git('rev-parse HEAD'), not the array form — the helper runs git ${cmd}. Pass cwd
to getRepoRoot too. tsc clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
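
To see why the array form breaks a helper that interpolates its argument into `git ${cmd}`, consider the side-effect-free sketch below. `buildGitCommand` is a hypothetical stand-in that only builds the command string; it is not the repo's actual helper.

```typescript
// Hypothetical stand-in for the helper: it only builds the command string
// that would be handed to the shell, so the sketch has no side effects.
function buildGitCommand(cmd: string): string {
  return `git ${cmd}`;
}

// String form: what the helper expects.
const goodCommand = buildGitCommand("rev-parse HEAD"); // "git rev-parse HEAD"

// Array form, coerced the way a template literal would coerce it:
// Array.prototype.toString joins the elements with commas, not spaces.
const badCommand = buildGitCommand(String(["rev-parse", "HEAD"])); // "git rev-parse,HEAD"
```

The second string is not a valid git invocation, which matches the fix described in the commit: pass a single string, not an array.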
@wizard-ci-bot wizard-ci-bot Bot added the CI/CD label Jun 23, 2026
@wizard-ci-bot wizard-ci-bot Bot closed this Jun 23, 2026
Comment on lines +116 to +119
- name: Install dependencies
  run: pnpm install --frozen-lockfile

- name: Install Chromium for Playwright
Comment on lines +119 to +122
- name: Install Chromium for Playwright
  run: pnpm exec playwright install --with-deps chromium

- name: Setup wizard dependencies
Comment on lines +122 to +132
- name: Setup wizard dependencies
  # Exports WIZARD_PATH / CONTEXT_MILL_PATH / MCP_PATH.
  uses: ./.github/actions/setup-wizard-deps
  with:
    wizard_ref: ${{ steps.req.outputs.wizard_ref }}
    context_mill_ref: ${{ inputs.context_mill_ref || 'main' }}
    posthog_ref: ${{ inputs.posthog_ref || 'master' }}
    app_token: ${{ steps.app-token.outputs.token }}
    save_cache: 'false'

- name: Render snapshots + report (review PR, and a comment when triggered by /wizard-ci)
Comment on lines +132 to +145
- name: Render snapshots + report (review PR, and a comment when triggered by /wizard-ci)
  env:
    GH_TOKEN: ${{ steps.app-token.outputs.token }}
    POSTHOG_PERSONAL_API_KEY: ${{ secrets.GH_APP_POSTHOG_WIZARD_CI_BOT_POSTHOG_PERSONAL_KEY }}
    POSTHOG_WIZARD_PROJECT_ID: ${{ inputs.project_id || github.event.client_payload.project_id || vars.WIZARD_SNAPSHOTS_PROJECT_ID }}
    POSTHOG_REGION: ${{ inputs.posthog_region || 'us' }}
    APP: ${{ steps.req.outputs.app }}
    COMMENT_PR: ${{ steps.req.outputs.comment_pr }}
  run: |
    if [ -n "$COMMENT_PR" ]; then
      pnpm wizard-ci-snapshot-review "$APP" --comment-pr "$COMMENT_PR"
    else
      pnpm wizard-ci-snapshot-review "$APP"
    fi