feat: PII Bouncer program (engine) + robust PII scanner rules #510
joethreepwood wants to merge 7 commits into
Scans frontend forms for sensitive inputs and configures session recording masking. Wizard-side plumbing only: program config, detection (posthog-js presence), CLI subcommand, abort cases for no-posthog-js / no-init-call / no-frontend-templates. The actual form-scanning and edit recipes live in a follow-up context-mill PR — without that skill, the program registers and runs end-to-end but the agent hits a structured skill-not-found outro. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses review feedback: the agent instructions (scan steps, abort signals, report format) were hardcoded in the program's customPrompt, "packaged English" that belongs in a context-mill skill, not the wizard.
- Collapse index.ts to a createSkillProgram() factory call; customPrompt is now a one-liner. The skill (loaded via skillId) drives the run.
- Delete wizard-side detection (detectPiiBouncerPrerequisites + the duplicated package walker a reviewer flagged). The skill detects prerequisites and emits [ABORT] signals; the wizard just routes them.
- Rename detect.ts -> abort-cases.ts, keeping only PII_BOUNCER_ABORT_CASES (terminal UX copy). The match regexes are the contract with the skill.
Wizard = engine + UX; context-mill skill = instructions. No product knowledge left in the program.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Companion to the new warlock rules (posthog_pii_in_person_properties, posthog_pii_value_in_tracking_call). The wizard ships an inline copy of warlock's PII rules until it consumes warlock directly, so the patterns are mirrored here to take effect today:
- pii_in_person_properties: sensitive PII in register/setPersonProperties (mirrors identify's "email/name OK, regulated PII not" split)
- pii_value_in_tracking_call: PII-shaped literal values (email/SSN/card) under any key; catches PII hidden behind innocuous property names
+10 scanner tests; warlock remains the source of truth.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
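For readers unfamiliar with value-shaped detection, the idea behind pii_value_in_tracking_call can be sketched roughly as below. This is an illustration only, not warlock's actual rule: the `looksLikePii` function, the regexes, and the Luhn check for card numbers are all assumptions introduced here.

```typescript
// Hypothetical sketch: classify a string literal as PII-shaped regardless of
// the property key it sits under. Patterns are illustrative assumptions,
// not warlock's actual rules.
type PiiKind = "email" | "ssn" | "card";

const EMAIL = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const SSN = /^\d{3}-\d{2}-\d{4}$/;

// Luhn checksum, used to cut false positives on arbitrary digit runs.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

function looksLikePii(value: string): PiiKind | null {
  const v = value.trim();
  if (EMAIL.test(v)) return "email";
  if (SSN.test(v)) return "ssn";
  // Allow spaces or hyphens as card-number group separators.
  const digits = v.replace(/[ -]/g, "");
  if (/^\d{13,19}$/.test(digits) && passesLuhn(digits)) return "card";
  return null;
}
```

Checking the value rather than the key is what lets a rule of this kind flag something like `{ note: "jane@example.com" }`, where the property name gives no hint.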
sarahxsanders
left a comment
some little things, but this is solid!!!
  ],
};

// Mirror of warlock's `posthog_pii_in_person_properties`. The wizard ships an
the warlock should be set up and wired to the wizard by tomorrow, so this file will be getting dropped/changed to use the warlock! leaving as a note so we rebase
side note: remove tests too
Good call — dropped the inline mirror in 0643776 rather than carry a hand-maintained copy. The two rules live only in warlock now (PostHog/warlock#33), so they'll reach the wizard through your warlock->wizard wiring. Left the pre-existing pii_in_capture_call rule untouched for you to fold into that same change.
Removed — the 10 mirror tests are gone and the rule count is back to 15 (0643776).
 * The `match` regexes are the contract with the skill: the signals
 * documented in the skill's SKILL.md (`no-posthog-js`, `no-init-call`,
 * `no-frontend-templates`) must stay in sync with the patterns here.
might be worth a file-path breadcrumb here because looking at the skill itself it mentions the wizard, but neither side points at the actual file on the other. I worry if there might be some drift here if we aren't explicit enough
something simple like: See: context-mill/transformation-config/skills/pii-bouncer/description.md (Abort statuses section) could suffice
Added breadcrumbs in both directions (6a02b6d). This file's header now points at context-mill/transformation-config/skills/pii-bouncer/description.md (the Abort statuses section), and that section points back here — so the [ABORT] contract is explicit on both ends. Skill side: PostHog/context-mill#170.
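To make the contract in this thread concrete, here is a minimal sketch of how match regexes could route the skill's `[ABORT]` signals to terminal copy. The `AbortCase` shape, the `routeAbort` helper, and the message strings are hypothetical; only the three signal names come from the comment under review, and the real PII_BOUNCER_ABORT_CASES may look different.

```typescript
// Hypothetical shape; the real PII_BOUNCER_ABORT_CASES in abort-cases.ts may differ.
interface AbortCase {
  match: RegExp; // contract with the skill's [ABORT] signal
  message: string; // terminal UX copy (placeholder text, not the shipped copy)
}

const ABORT_CASES: AbortCase[] = [
  {
    match: /\[ABORT\]\s*no-posthog-js\b/,
    message: "posthog-js was not found in this project.",
  },
  {
    match: /\[ABORT\]\s*no-init-call\b/,
    message: "No PostHog initialisation site was found.",
  },
  {
    match: /\[ABORT\]\s*no-frontend-templates\b/,
    message: "No frontend templates with forms were found.",
  },
];

// Return the first abort case whose regex matches the agent's output, if any.
function routeAbort(agentOutput: string): AbortCase | undefined {
  return ABORT_CASES.find((c) => c.match.test(agentOutput));
}
```

Because the regexes are the only coupling between the two repos, renaming a signal on the skill side without touching the pattern here would fall through to `undefined`, which is the drift the breadcrumbs are meant to prevent.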
kind: OutroKind.Success as const,
message: 'PII Bouncer finished',
reportFile: REPORT_FILE,
changes: [],
since we are adding ph-no-capture we should probably populate this with the modified file list the way the other file-editing programs do. we want to be transparent when a run finishes what we've edited and this helps us also enhance our privacy and transparency posture
Done in 6a02b6d. buildOutroData now lists the actual edited files instead of []. A small edit-tracker (src/lib/edit-tracker.ts) records the agent's real Write/Edit/MultiEdit paths via a PostToolUse hook, so the outro reflects what hit disk — not what the agent claims it did, which felt like the right posture for a privacy tool. Paths are relativised to the project and the report file is filtered out; full per-edit detail still lives in posthog-pii-bouncer-report.md.
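A rough sketch of the tracking approach described in this reply follows. It is not the actual src/lib/edit-tracker.ts: the `EditTracker` class, the `ToolEvent` shape, and the prefix-based relativising (which assumes POSIX-style absolute paths) are assumptions made for illustration.

```typescript
// Hypothetical sketch of an edit tracker; names and signatures are assumptions,
// not the actual src/lib/edit-tracker.ts API.
const FILE_EDIT_TOOLS = new Set(["Write", "Edit", "MultiEdit"]);

interface ToolEvent {
  toolName: string;
  filePath?: string; // absolute path the tool wrote to, if any
}

class EditTracker {
  private readonly edited = new Set<string>();

  // Intended to be called from a post-tool-use hook, i.e. after the tool ran.
  record(event: ToolEvent): void {
    if (FILE_EDIT_TOOLS.has(event.toolName) && event.filePath) {
      this.edited.add(event.filePath);
    }
  }

  // Project-relative, de-duplicated, sorted, with the report file filtered out.
  // Simplification: strips a POSIX-style root prefix instead of using path.relative.
  changes(projectRoot: string, reportFile: string): string[] {
    const prefix = projectRoot.endsWith("/") ? projectRoot : projectRoot + "/";
    return Array.from(this.edited)
      .map((p) => (p.startsWith(prefix) ? p.slice(prefix.length) : p))
      .filter((p) => p !== reportFile)
      .sort();
  }
}
```

Recording from the hook rather than parsing the agent's final summary is what lets the outro reflect what was written rather than what was reported.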
Review feedback (PostHog/wizard#510): the [ABORT] signals are a contract with the wizard, but neither side named the other's file. Adds a breadcrumb to the consuming side (src/lib/programs/pii-bouncer/abort-cases.ts); the wizard side now points back here. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…umbs
Review feedback (#510):
- Outro now lists the actual files the run modified, instead of an empty changes array. A small edit-tracker observes the agent's Write/Edit/MultiEdit calls (what hit disk, not what the agent claimed) via a PostToolUse hook registered alongside the YARA hooks. buildOutroData relativises the paths, drops the report file, and lists them: honest transparency for a privacy tool. Per-edit detail still lives in the report.
- abort-cases.ts and the context-mill skill now point at each other by file path, so the [ABORT] signal contract can't drift silently.
+6 edit-tracker tests.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per review (#510): warlock is being wired to the wizard directly, so the inline copy of the rules is throwaway. The two rules live in warlock (PostHog/warlock#33) — the source of truth — and reach the wizard through that wiring, not an inline mirror. Removes the mirrored rules and their tests; rule count back to 15. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Review feedback (PostHog/wizard#510 skill):
- React projects often have no literal posthog.init(...); config goes through <PostHogProvider options={{…}}>. The skill now detects either, writes masking into whichever it finds, and only aborts (no-init-call) when neither exists. Avoids a false abort on valid React apps.
- Word-bound the name/id heuristic so `pass` no longer matches `passenger`, `pin` no longer matches `spinner`, etc., giving less over-masking and less noise.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
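The word-bounding fix can be illustrated with a small sketch. The keyword list and the tokenisation below are assumptions, not the skill's actual heuristic; the sketch also splits names into tokens rather than using a literal `\b` regex, because `\b` treats `_` as a word character and would miss names like `user_pin`.

```typescript
// Hypothetical sketch; the keyword list and tokenisation are assumptions,
// not the skill's actual heuristic.
const SENSITIVE_TOKENS = new Set(["pass", "password", "pin", "ssn", "cvv"]);

// Split a name/id into lowercase words:
// "userPin", "user_pin" and "user-pin" all become ["user", "pin"].
function tokenize(identifier: string): string[] {
  return identifier
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // break camelCase humps
    .split(/[^A-Za-z0-9]+/)
    .filter((t) => t.length > 0)
    .map((t) => t.toLowerCase());
}

// A name is sensitive only if a whole token matches, never a substring.
function isSensitiveName(identifier: string): boolean {
  return tokenize(identifier).some((t) => SENSITIVE_TOKENS.has(t));
}
```

With whole-token matching, `passenger` and `spinner` stay unmasked while `pass`, `userPin`, and `user_pin` are still caught.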
The skill now accepts a React <PostHogProvider> as a valid init site, not just posthog.init(...). Update the abort outro copy so it matches what the skill actually looks for. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Thanks for the review @sarahxsanders — all addressed:
817 tests green, lint clean. Rebased onto the latest
Summary
Adds the PII Bouncer to the wizard. Three-repo effort; this PR is the wizard (engine) side. Companion PRs:
What's here
A new `pii-bouncer` subcommand that scans frontend forms for sensitive inputs, adds the `ph-no-capture` privacy class, and configures session-recording masking so PII never reaches replays.
Per review feedback, all the agent instructions live in the context-mill skill, not the wizard. This PR is pure engine: a `createSkillProgram()` call, abort-case routing, and the outro.
- `src/lib/programs/pii-bouncer/index.ts`: `createSkillProgram()` factory call; one-line `customPrompt`. The outro now lists the actual files the run edited (see below).
- `src/lib/programs/pii-bouncer/abort-cases.ts`: `PII_BOUNCER_ABORT_CASES` (terminal UX copy). The `match` regexes are the contract with the skill's `[ABORT]` signals; both sides now carry a file-path breadcrumb to the other so they can't drift.
- `src/lib/edit-tracker.ts`: a small PostToolUse hook that records the agent's actual `Write`/`Edit`/`MultiEdit` paths, so the outro can be transparent about exactly what was modified (what hit disk, not what the agent claimed, which is the right posture for a privacy tool).
- `detect.ts` (wizard-side detection + a duplicated package walker): removed; the skill detects prerequisites and emits `[ABORT]`.
Architecture
"wizard = engine, context-mill = cartridge, warlock = rules":
Test plan
- `pnpm build` clean
- `pnpm test`: 817 pass (incl. edit-tracker coverage)
- `pnpm lint`: 0 errors
- `node dist/bin.js --help` lists `pii-bouncer`
- /`skill-menu.json` → `pii-bouncer.zip` with `SKILL.md`)
- `pii-bouncer` against a frontend app with forms; confirm `ph-no-capture` added, init mask config set, the outro lists the edited files, `posthog-pii-bouncer-report.md` written, and the `[ABORT] no-posthog-js` path renders a clean outro on a no-PostHog project.
🤖 Generated with Claude Code