
awskrug/aiengineering-live-coding


AI QA Automation — RealWorld

English · 한국어

A live-coding MVP in which an LLM, driven by the Claude Agent SDK, writes and runs Playwright E2E tests against demo.realworld.show. Agent activity, the generated TypeScript code, and the pass/fail result stream back to the browser in real time.

Quickstart

cp .env.example .env
# Edit .env to set ANTHROPIC_API_KEY=sk-ant-...

npm install                    # also installs the chromium browser via postinstall
npm run dev                    # http://localhost:3000

Open http://localhost:3000, optionally pick one of the 6 preset scenarios, and click Run.

Architecture (MVP)

  • Frontend: single static page (public/index.html + public/app.js), Tailwind via CDN, EventSource for SSE.
  • Server: Hono on Node 22, run directly with tsx (no build step).
  • Agent: Claude Agent SDK query() with three custom MCP tools (write_test_file, run_playwright, report_result). vendor/realworld-spec/SELECTORS.md is baked into the system prompt.
  • Test runner: Playwright spawned as a subprocess against runs/<runId>/test.spec.ts.
  • Concurrency: one run at a time (runLock). Subsequent requests get HTTP 409.
  • No persistence: events are kept in-memory and disposed 10 minutes after a run completes.
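The last two bullets (one run at a time, in-memory events disposed after a delay) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the class name RunRegistry, its methods, and the RunEvent shape are all hypothetical, and the real runLock may be structured differently. The server would translate a false return from tryStart into the HTTP 409 response.

```typescript
// Hypothetical sketch: none of these names come from the repository.
type RunEvent = { type: string; data: unknown };

class RunRegistry {
  private activeRunId: string | null = null;
  private readonly events = new Map<string, RunEvent[]>();

  // Default TTL mirrors the 10-minute disposal described above.
  constructor(private readonly ttlMs: number = 10 * 60 * 1000) {}

  /** Claims the single run slot. Returns false when another run is in flight. */
  tryStart(runId: string): boolean {
    if (this.activeRunId !== null) return false;
    this.activeRunId = runId;
    this.events.set(runId, []);
    return true;
  }

  /** Buffers an event in memory; events for unknown runs are ignored. */
  append(runId: string, event: RunEvent): void {
    this.events.get(runId)?.push(event);
  }

  /** Returns the buffered events, or undefined once they have been disposed. */
  read(runId: string): RunEvent[] | undefined {
    return this.events.get(runId);
  }

  /** Releases the slot and schedules disposal of the run's events. */
  finish(runId: string): void {
    if (this.activeRunId !== runId) return;
    this.activeRunId = null;
    const timer = setTimeout(() => {
      this.events.delete(runId);
    }, this.ttlMs);
    // On Node, do not let the disposal timer keep the process alive.
    (timer as unknown as { unref?: () => void }).unref?.();
  }
}
```

Keeping the lock and the event buffer in one object means a finished run frees the slot immediately while its events stay readable until the timer fires.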

For the full spec see docs/superpowers/specs/2026-05-07-ai-qa-automation-mvp-design.md. For the implementation walkthrough see docs/superpowers/plans/2026-05-07-ai-qa-automation-mvp.md.
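For readers new to SSE, the frames that an EventSource client consumes are plain text in the standard text/event-stream format. The helper below is a minimal sketch of that wire format only; the function name toSseFrame and the event name used in the usage note are assumptions, not identifiers from this repository.

```typescript
// Hypothetical helper: formats one Server-Sent Events frame.
// The payload is assumed to be JSON-serializable.
function toSseFrame(event: string, payload: unknown, id?: number): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${event}`);
  // Compact JSON.stringify output contains no raw newlines,
  // so a single data line is enough.
  lines.push(`data: ${JSON.stringify(payload)}`);
  // A blank line terminates the frame.
  return lines.join("\n") + "\n\n";
}
```

For example, toSseFrame("result", { status: "PASS" }) yields an "event: result" line followed by a "data:" line carrying the JSON payload. On the browser side, named events like this are received with addEventListener on the EventSource rather than through onmessage.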

Demo notes

  • The target site (demo.realworld.show) is third-party. Run a quick health check before a live demo: curl -I https://demo.realworld.show.
  • Each run signs up a new fake user with a qa-<timestamp> username so repeated runs do not collide.
  • Hard caps: maxTurns=12, single retry, 60s test timeout. Adjust in src/orchestrator.ts and playwright.config.ts.
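The qa-<timestamp> naming from the second note could look like the sketch below. Only the username pattern comes from this README; the function name makeQaUser, the email domain, and the password scheme are illustrative assumptions.

```typescript
// Hypothetical sketch: only the `qa-<timestamp>` username pattern is from the README.
type QaUser = { username: string; email: string; password: string };

function makeQaUser(now: number = Date.now()): QaUser {
  const username = `qa-${now}`;
  return {
    username,
    // Assumed email shape; example.com is reserved for documentation use.
    email: `${username}@example.com`,
    // Assumed throwaway password derived from the same timestamp.
    password: `pw-${now}`,
  };
}
```

Because the timestamp has millisecond resolution and only one run executes at a time, consecutive runs should get distinct usernames.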

Tutorial

Follow the full one-hour journey of building this MVP through conversational AI coding:

Vibe Coding Tutorial — 7 chapters + appendix covering brief → brainstorming → spec pivot → plan → subagent-driven execution → ship.

  • Chapters: 7 + appendix
  • Time: ~55 minutes wall clock
  • Cost (est.): ~$12–14 (API)
  • Key topics: Superpowers skills, AskUserQuestion, RealWorld spec utilization, Claude Agent SDK + MCP tools, parallel subagent pipelining

Acceptance checklist (manual)

  1. README quickstart works end-to-end on a clean clone in under 5 minutes.
  2. Preset signup returns a PASS within 60 seconds against demo.realworld.show.
  3. A purposely broken scenario ("click a button labeled 'Definitely Not Here'") returns a FAIL with a non-empty reason.
  4. Submitting a second POST /api/run while one is in flight returns HTTP 409.

About

AI-powered E2E QA automation MVP using the Claude Agent SDK and Playwright. Built as a live-coding demo for the AWSKRUG AI Engineering meetup.
