QA follow-ups: expand mutation/fuzz/load + browser e2e in CI #14

Description

@code-with-rashid

Tracking the residual QA gaps from the measured hardening pass (#13). These are honest boundaries of what one run proved — each is wired as a recommendation or scheduled job rather than faked. State lives under qa/ (INVENTORY.md, COVERAGE.json, SEEDS.json, corpus/).

1. Expand mutation testing beyond the validator

  • Now: Stryker runs only on packages/core/src/spec/validate-response.ts (85.4% mutation score, gated at ≥80% in the weekly mutation.yml). A full-repo run is too slow to execute inline (~796 mutants for 3 modules).
  • Do: incrementally add to stryker.config.json mutate: runner/run.ts, runner/resolve.ts, mock/engine.ts, mock/server.ts, importers/*, spec/scaffold.ts, spec/drift.ts. Keep ignoreStatic: true; raise the weekly job timeout. Target ≥80 per module.
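A per-module target is stricter than a single repo-wide threshold, so one way to check it is to post-process the JSON report. The sketch below is only an illustration: it assumes the report follows the mutation-testing-report-schema layout (a `files` map whose entries hold `mutants` with a `status`), and the helper names `scoreOf` and `modulesBelowThreshold` are hypothetical, not part of this repo or of Stryker.

```typescript
// Minimal subset of the mutation report JSON; only the fields used below are typed.
type MutantStatus =
  | "Killed" | "Survived" | "NoCoverage" | "Timeout"
  | "CompileError" | "RuntimeError" | "Ignored" | "Pending";

interface MutationReport {
  files: Record<string, { mutants: { status: MutantStatus }[] }>;
}

// Score = detected / (detected + undetected) * 100, where detected is
// Killed + Timeout and undetected is Survived + NoCoverage.
// Other statuses (Ignored, CompileError, ...) do not count either way.
function scoreOf(mutants: { status: MutantStatus }[]): number {
  let detected = 0;
  let undetected = 0;
  for (const m of mutants) {
    if (m.status === "Killed" || m.status === "Timeout") detected++;
    else if (m.status === "Survived" || m.status === "NoCoverage") undetected++;
  }
  const valid = detected + undetected;
  // Assumption: a module with no valid mutants is treated as passing.
  return valid === 0 ? 100 : (detected / valid) * 100;
}

// Returns the sorted paths of modules whose score is below the threshold.
function modulesBelowThreshold(report: MutationReport, threshold = 80): string[] {
  return Object.entries(report.files)
    .filter(([, file]) => scoreOf(file.mutants) < threshold)
    .map(([path]) => path)
    .sort();
}
```

The weekly job could fail when the returned list is non-empty, which keeps the ≥80 target enforced per module as new paths are added to mutate.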

2. Continuous (coverage-guided) fuzzing to the 1M-exec budget

  • Now: packages/core/test/fuzz.test.ts runs 5 seeded, deterministic property-fuzz invariants per CI run (bounded ~11s). Cumulative campaign execs ≈ 505k; corpus in qa/corpus/crashes.jsonl.
  • Do: a scheduled deep-fuzz job (atheris/jazzer-style, or fast-check with a large run budget) seeded from qa/corpus/, targeting validateAgainstSchema, parseOpenApi, importPostman, bruToRequest, pathToRegex. Budget: ≥1M execs OR 30 min/target, persist new corpus entries.
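The "execs OR time" budget can be sketched as a small dependency-free loop. This is not the proposed job itself: it uses a hand-rolled seeded mutator in place of fast-check or a coverage-guided engine, it assumes the budget means "stop at whichever limit is hit first", and it treats any thrown exception as a finding (a real harness would allow expected validation errors). `fuzzTarget`, `mutate`, and `Budget` are hypothetical names.

```typescript
// Small deterministic PRNG (mulberry32) so a run can be replayed from its seed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface Budget { maxExecs: number; maxMillis: number; seed: number }
interface FuzzResult { execs: number; crashes: string[]; stoppedBy: "execs" | "time" }

// Apply one random insert, delete, or replace of a printable ASCII character.
function mutate(input: string, rand: () => number): string {
  const pos = Math.floor(rand() * (input.length + 1));
  const ch = String.fromCharCode(32 + Math.floor(rand() * 95));
  const op = Math.floor(rand() * 3);
  if (op === 0) return input.slice(0, pos) + ch + input.slice(pos);
  if (op === 1) return input.slice(0, pos) + input.slice(pos + 1);
  return input.slice(0, pos) + ch + input.slice(pos + 1);
}

// Run the target on mutated corpus entries until either limit is reached.
// Inputs that make the target throw are collected (deduplicated) so the
// caller can append them to the corpus.
function fuzzTarget(
  target: (input: string) => void,
  corpus: string[],
  budget: Budget,
  now: () => number = Date.now,
): FuzzResult {
  const rand = mulberry32(budget.seed);
  const seeds = corpus.length > 0 ? corpus : [""];
  const crashes = new Set<string>();
  const start = now();
  let execs = 0;
  while (execs < budget.maxExecs) {
    if (now() - start >= budget.maxMillis) {
      return { execs, crashes: [...crashes], stoppedBy: "time" };
    }
    const base = seeds[Math.floor(rand() * seeds.length)];
    const input = mutate(base, rand);
    execs++;
    try {
      target(input);
    } catch {
      crashes.add(input);
    }
  }
  return { execs, crashes: [...crashes], stoppedBy: "execs" };
}
```

A scheduled job would call this once per target with the entries loaded from qa/corpus/ and write any returned crashes back, so the next run starts from a larger corpus.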

3. Throughput-to-SLO load test

  • Now: load/soak/leak verified (campaign 5: mock 30k req heap-stable, runner defeats gzip-bomb/slow-loris). Throughput against a target RPS was not measured because no SLO is defined for a local-first CLI.
  • Do: if/when an SLO is set, add a k6/autocannon gate against truspec serve and truspec mock (p95/p99 + max-RSS-growth assertion).
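Whatever tool produces the samples, the gate itself reduces to percentile and memory-growth checks. The following is a minimal sketch under stated assumptions: latencies arrive as an array of milliseconds, RSS is sampled before and after the run, and the `Slo` shape and `evaluateGate` name are hypothetical. The percentile uses the nearest-rank method, which may differ slightly from what k6 or autocannon report.

```typescript
interface Slo { p95Ms: number; p99Ms: number; maxRssGrowthBytes: number }

// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Returns a list of human-readable failures; an empty list means the gate passes.
function evaluateGate(
  latenciesMs: number[],
  rssBeforeBytes: number,
  rssAfterBytes: number,
  slo: Slo,
): string[] {
  const failures: string[] = [];
  const p95 = percentile(latenciesMs, 95);
  const p99 = percentile(latenciesMs, 99);
  const growth = rssAfterBytes - rssBeforeBytes;
  if (p95 > slo.p95Ms) failures.push(`p95 ${p95}ms > ${slo.p95Ms}ms`);
  if (p99 > slo.p99Ms) failures.push(`p99 ${p99}ms > ${slo.p99Ms}ms`);
  if (growth > slo.maxRssGrowthBytes) {
    failures.push(`RSS grew by ${growth} bytes > ${slo.maxRssGrowthBytes}`);
  }
  return failures;
}
```

Keeping the thresholds in one `Slo` value means the gate stays inert until real numbers are agreed, which matches the "if/when an SLO is set" condition.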

4. Browser/e2e regressions in CI (Playwright + axe-core)

  • Now: the web UI (.tsx) is excluded from node coverage (no DOM env). XSS/CSRF/clickjacking/keyboard/a11y were verified live (campaigns 7–10, BUG-M/N/O/P) but only manually.
  • Do: a Playwright + axe-core e2e job (playwright install --with-deps) covering: editor save/keyboard, XSS-render, cross-origin/cross-port CSRF → 403, X-Frame-Options, and axe violations = 0 (except the documented color-contrast item). This permanently guards the BUG-M/N/O/P regressions.
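The "axe violations = 0 except color-contrast" rule can be expressed as a small filter over the scan result, independent of how the scan is run. This is a sketch only: it assumes each violation carries the axe rule `id` (the contrast rule is `color-contrast`), and `unexpectedViolations` and `ALLOWED_RULE_IDS` are hypothetical names rather than anything in the repo.

```typescript
// Subset of an axe-core violation entry; only the fields used here are typed.
interface AxeViolation {
  id: string;
  impact?: string | null;
  nodes: unknown[];
}

// Rules that are knowingly tolerated until the design follow-up lands.
const ALLOWED_RULE_IDS: ReadonlySet<string> = new Set(["color-contrast"]);

// Keep only violations that are not on the allow list; the e2e job would
// assert that this array is empty.
function unexpectedViolations(
  violations: AxeViolation[],
  allowed: ReadonlySet<string> = ALLOWED_RULE_IDS,
): AxeViolation[] {
  return violations.filter((v) => !allowed.has(v.id));
}
```

Filtering after the scan, rather than disabling the rule, keeps the contrast finding visible in the raw report, so it cannot silently disappear before the palette is fixed.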

5. Documented design follow-ups

  • Color contrast (a11y, axe serious): muted text (--dimmer/--dim) is below WCAG AA 4.5:1. Needs a design pass to raise contrast (left out of #13, "QA hardening: 16 bug fixes + measured coverage/mutation/fuzz gates", to avoid guessing palette values).
  • Behavior changes now live on main — gate behind a flag only if a downstream user is affected:
    • BUG-D: truspec run exits non-zero when zero requests are found (was 0).
    • BUG-L: the runner no longer auto-follows redirects (redirect: manual).
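For the color-contrast item above, candidate values from the design pass can be checked against the 4.5:1 threshold with the WCAG 2.x contrast formula. The sketch below implements that formula for `#rrggbb` colors; the function names are hypothetical and the grays in the usage note are generic sample inputs, not the project's --dim/--dimmer values.

```typescript
// Parse "#rrggbb" into 0..255 channels.
function parseHex(hex: string): [number, number, number] {
  const m = /^#([0-9a-f]{6})$/i.exec(hex);
  if (!m) throw new Error(`expected #rrggbb, got ${hex}`);
  const n = parseInt(m[1], 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

// WCAG 2.x relative luminance of an sRGB color.
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex).map((c) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio (L_lighter + 0.05) / (L_darker + 0.05), from 1 to 21.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG AA for normal-size text requires a ratio of at least 4.5:1.
function meetsAaNormalText(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

As a sanity check on sample inputs, `meetsAaNormalText("#767676", "#ffffff")` passes while `meetsAaNormalText("#777777", "#ffffff")` does not, which shows how narrow the margin can be for muted grays.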

Full per-item detail in QA_LOG.md and qa/QA_LOG.md.
