test(self-driving): e2e harness + snapshots #767
Stacks the self-driving e2e testing onto the feature. The detect screen is interactive-only, so it's listed no-action; the harness covers the rest of the flow plus the offline flow-trace snapshot. Re-enables the e2e-harness test suite the base PR excludes from its run.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
in-program-phases.ts was removed; the host now advances each composed step the same way run-wizard does (auth → run steps → gating screens).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
run-step composition replaced runProgram/completePhase with run/completeRunStep; the trace test completes the integrate-run step the same way. Regenerate the self-driving golden (detect → integrate → handoff → run).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- tui-host injects the detect pick (detect the framework at the install dir, single-app fixtures) so the live snapshot run advances past the interactive picker the store driver can't actuate.
- e2e.json path now reflects the real flow (detect, integration run, handoff).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The auto-pick detected the framework at the repo root. A Turborepo root reports as generic node, so the run would integrate the workspace root instead of an app. Scan apps/ then packages/ for the first sub-app with a registered framework, falling back to the root for a single-app fixture.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
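The scan order in this commit can be sketched as below. This is a minimal illustration, not the harness code: only the name pickIntegrationTarget appears in this PR, while its signature, the injected `listSubdirs` and `detectRegistered` helpers, the `IntegrationTarget` shape, and the alphabetical ordering of sub-apps are all assumptions made so the example is self-contained.

```typescript
// Hypothetical result shape; the real harness type is not shown in this PR.
interface IntegrationTarget {
  dir: string;
  framework: string | null; // null = no registered framework detected
}

// Assumed helpers, injected so the sketch needs no filesystem access:
//   listSubdirs(dir)      -> names of the immediate sub-directories of dir
//   detectRegistered(dir) -> registered framework id, or null for generic/unknown
function pickIntegrationTarget(
  root: string,
  listSubdirs: (dir: string) => string[],
  detectRegistered: (dir: string) => string | null,
): IntegrationTarget {
  // apps/ is scanned before packages/, as the commit message describes.
  for (const group of ["apps", "packages"]) {
    const groupDir = `${root}/${group}`;
    // Sorting is an assumption, added only to make the pick deterministic.
    for (const name of [...listSubdirs(groupDir)].sort()) {
      const dir = `${groupDir}/${name}`;
      const framework = detectRegistered(dir);
      if (framework !== null) return { dir, framework };
    }
  }
  // Single-app fixture: nothing under apps/ or packages/, so use the root.
  return { dir: root, framework: detectRegistered(root) };
}
```

With a Turborepo-style layout the first framework-bearing directory under apps/ wins; a fixture without apps/ or packages/ resolves to its own root.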
…ving's asks
E2E snapshotting is driven, not CI: the host answers wizard_ask via its driver, so it sets WIZARD_ASK_AUTODRIVE and the runner keeps the ask bridge wired despite ci (which here is only for headless auth). Self-driving's profile declines every ask (ask: cancel) so the agent sets nothing up and walks to the outro.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
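A rough sketch of the gating rule this commit describes: the ask bridge stays wired whenever a driver has announced itself through WIZARD_ASK_AUTODRIVE, even if ci is set. The function name `shouldWireAskBridge`, the options shape, and the reading of any non-empty variable value as "enabled" are assumptions for illustration; the runner's real condition is not shown here.

```typescript
// Hypothetical options shape; only the `ci` flag and the WIZARD_ASK_AUTODRIVE
// variable name come from the commit message.
interface AskBridgeOptions {
  ci: boolean;
  env: Record<string, string | undefined>;
}

function shouldWireAskBridge(opts: AskBridgeOptions): boolean {
  // Assumption: any non-empty value means a driver will answer asks.
  const autodrive = Boolean(opts.env["WIZARD_ASK_AUTODRIVE"]);
  if (autodrive) return true; // driven snapshot run: someone answers
  // Without a driver, CI has no answerer, so the bridge is left out and the
  // tool can fail fast instead of waiting on a prompt nobody resolves.
  return !opts.ci;
}
```

Passing the environment in as a plain record, rather than reading `process.env` inside the function, keeps the rule easy to unit test.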
Cancelling declined the required GitHub step and aborted. Answering with the
first option gives each ask its affirmative 'continue' ("GitHub connected →
done"), so with a scoped key the run completes to the success outro. Reverts
the unused cancel strategy.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
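The first-option strategy might look like the sketch below. The `Ask` and `AskAnswer` shapes and the function name `answerFirstOption` are hypothetical, and falling back to cancel when an ask carries no options is an assumption the commit message does not state.

```typescript
// Hypothetical shapes; the driver's real ask payload is not shown in this PR.
interface Ask {
  question: string;
  options: string[];
}

type AskAnswer =
  | { kind: "select"; option: string }
  | { kind: "cancel" };

// Answer every ask with its first option, which the commit message describes
// as the affirmative "continue" choice.
function answerFirstOption(ask: Ask): AskAnswer {
  const first = ask.options[0];
  // Assumption: an ask with no options can only be cancelled.
  return first === undefined
    ? { kind: "cancel" }
    : { kind: "select", option: first };
}
```

For example, an ask whose options start with "GitHub connected → done" would be answered with that option, so the required GitHub step is not declined.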
/wizard-ci-snapshots self-driving/turbo-monorepo
/wizard-ci-snapshots self-driving/turbo-monorepo
The snapshot pass-check read only ${appDir}/package.json, but a monorepo
installs the SDK into the picked sub-app (e.g. apps/expo), so hasPosthogDep
false-negatived and the 'posthog dependency added' assertion failed even
though the integration ran and the outro was reached. Scan the root plus
apps/* and packages/* (mirrors pickIntegrationTarget).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
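A minimal sketch of the widened pass-check, assuming injected helpers in place of real filesystem reads. hasPosthogDep is named in the commit message, but its signature here, the `PackageJson` shape, the `listSubdirs` and `readPackageJson` helpers, and matching any dependency whose name starts with "posthog" are illustrative assumptions.

```typescript
// Minimal package.json shape needed for the check.
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Assumption: any dependency whose name starts with "posthog" counts.
function declaresPosthog(pkg: PackageJson): boolean {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  return Object.keys(deps).some((name) => name.startsWith("posthog"));
}

// Assumed helpers, injected so the sketch is self-contained:
//   listSubdirs(dir)     -> names of the immediate sub-directories of dir
//   readPackageJson(dir) -> parsed package.json in dir, or null if absent
function hasPosthogDep(
  appDir: string,
  listSubdirs: (dir: string) => string[],
  readPackageJson: (dir: string) => PackageJson | null,
): boolean {
  // Check the root first, then every sub-app under apps/ and packages/.
  const candidates = [appDir];
  for (const group of ["apps", "packages"]) {
    for (const name of listSubdirs(`${appDir}/${group}`)) {
      candidates.push(`${appDir}/${group}/${name}`);
    }
  }
  return candidates.some((dir) => {
    const pkg = readPackageJson(dir);
    return pkg !== null && declaresPosthog(pkg);
  });
}
```

A monorepo whose SDK lands in apps/expo passes this check even though the root package.json lists no PostHog package.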
/wizard-ci-snapshots self-driving/turbo-monorepo
// snapshot/MCP host answers via its driver and sets WIZARD_ASK_AUTODRIVE.
// CI/signup with neither has no answerer, so we omit the bridge and the tool
// returns an actionable error rather than hanging on a never-resolving prompt.
const askDisabled =
We need to keep this enabled when we run snapshot CI (and, in general, for CI on these interactive skills).

Problem
The e2e test harness for the self-driving onboarding flow was mixed into the feature PR. This PR lifts it out so the product code can be reviewed on its own.
Changes
Stacked on #760. Everything e2e for self-driving:
- action-registry (integration-check, handoff, and detect-screen entries), e2e-profile (decideE2eAction cases + the integrate profile field), wizard-ci-driver (exposes integrate in read_state).
- … (tui-host): drives the composed run (detect → integrate → handoff), auto-picks a monorepo sub-app, and keeps wizard_ask live for the driver (WIZARD_ASK_AUTODRIVE) so ask-driven flows run headlessly.
- … (e2e.json) + its profile registration.
- Re-enables the e2e-harness test suite that the base PR excludes from its run.
Test plan
- pnpm test: the e2e-harness suite (driver exhaustiveness + the flow-trace snapshot) runs here.
- … done, sources/scouts configured, report written).
👇 👇 Don't skip this: expand the snapshots below to watch the harness drive self-driving all the way to the ✅ success outro 👇 👇
👉 Testing: the live captured run that lands on the ✅ success outro 👈
Live captured self-driving run: the harness drives the real TUI to the ✅ success outro.
Frames: 01-self-driving-intro, 02-auth, 03-run through 23-run, 24-wizard-ask, 25-run through 32-run, 33-outro.