feat: PII Bouncer program (engine) + robust PII scanner rules #510

Draft
joethreepwood wants to merge 7 commits into main from pii-bouncer

@joethreepwood joethreepwood commented Jun 4, 2026

Contributor

Summary

Adds the PII Bouncer to the wizard. Three-repo effort; this PR is the wizard (engine) side. Companion PRs: PostHog/context-mill#170 (skill) and PostHog/warlock#33 (PII rules).

What's here

A new pii-bouncer subcommand that scans frontend forms for sensitive inputs, adds the ph-no-capture privacy class, and configures session-recording masking so PII never reaches replays.

Per review feedback, all the agent instructions live in the context-mill skill, not the wizard. This PR is pure engine: a createSkillProgram() call, abort-case routing, and the outro.

  • src/lib/programs/pii-bouncer/index.ts: createSkillProgram() factory call; one-line customPrompt. The outro now lists the actual files the run edited (see below).
  • src/lib/programs/pii-bouncer/abort-cases.ts: PII_BOUNCER_ABORT_CASES (terminal UX copy). The match regexes are the contract with the skill's [ABORT] signals; both sides now carry a file-path breadcrumb to the other so they can't drift.
  • src/lib/edit-tracker.ts — a small PostToolUse hook that records the agent's actual Write/Edit/MultiEdit paths, so the outro can be transparent about exactly what was modified (what hit disk, not what the agent claimed — the right posture for a privacy tool).
  • Deleted detect.ts (wizard-side detection + a duplicated package walker) — the skill detects prerequisites and emits [ABORT].
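For reviewers who want a concrete picture of the edit-tracker idea in the list above, here is a minimal sketch. It assumes the hook payload carries a tool name and an `input.file_path`; the `ToolUseEvent` type and `EditTracker` class are hypothetical names for illustration and are not the contents of src/lib/edit-tracker.ts.

```typescript
// Hypothetical payload shape: the real PostToolUse payload is not shown in this PR.
type ToolUseEvent = {
  toolName: string;
  input: { file_path?: string };
};

// Only these tools write to disk, per the PR description.
const FILE_EDITING_TOOLS = new Set<string>(["Write", "Edit", "MultiEdit"]);

class EditTracker {
  private readonly paths = new Set<string>();

  // Meant to be called after a tool has run, so it records what actually
  // happened rather than what the agent later reports.
  record(event: ToolUseEvent): void {
    const filePath = event.input.file_path;
    if (FILE_EDITING_TOOLS.has(event.toolName) && filePath) {
      this.paths.add(filePath);
    }
  }

  // Unique paths in first-seen order.
  editedPaths(): string[] {
    return Array.from(this.paths);
  }
}
```

Keeping the tracker as a passive observer means the outro never depends on the agent's own summary of its edits.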

PII scanner rules are not in this PR. They live in warlock (PostHog/warlock#33) and reach the wizard through the warlock→wizard wiring, not an inline copy. (An earlier revision mirrored them inline; dropped per review now that the wiring is landing.)

Architecture

"wizard = engine, context-mill = cartridge, warlock = rules":

  • wizard program — the agent + terminal UX. Loads a skill by id, routes abort signals, renders the outro. No product knowledge.
  • context-mill skill — the instructions (what to scan, how to mask, what to report). Loaded at runtime.
  • warlock — the PII detection rules. Source of truth; consumed via the scanner wiring.

Test plan

  • pnpm build clean
  • pnpm test — 817 pass (incl. edit-tracker coverage)
  • pnpm lint — 0 errors
  • node dist/bin.js --help lists pii-bouncer
  • Skill resolves + downloads from the local context-mill dev server (/skill-menu.json → pii-bouncer.zip with SKILL.md)
  • Manual e2e (reviewer): via wizard-workbench, run pii-bouncer against a frontend app with forms; confirm ph-no-capture added, init mask config set, the outro lists the edited files, posthog-pii-bouncer-report.md written, and the [ABORT] no-posthog-js path renders a clean outro on a no-PostHog project.

🤖 Generated with Claude Code

Scans frontend forms for sensitive inputs and configures session
recording masking. Wizard-side plumbing only: program config,
detection (posthog-js presence), CLI subcommand, abort cases for
no-posthog-js / no-init-call / no-frontend-templates. The actual
form-scanning and edit recipes live in a follow-up context-mill
PR — without that skill, the program registers and runs end-to-end
but the agent hits a structured skill-not-found outro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot commented Jun 4, 2026

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

  • /wizard-ci all

Test all apps in a directory:

  • /wizard-ci basic-integration
  • /wizard-ci error-tracking-upload-source-maps
  • /wizard-ci misc
  • /wizard-ci revenue

Test an individual app:

  • /wizard-ci basic-integration/android
  • /wizard-ci basic-integration/angular
  • /wizard-ci basic-integration/astro
  • /wizard-ci basic-integration/django
  • /wizard-ci basic-integration/fastapi
  • /wizard-ci basic-integration/flask
  • /wizard-ci basic-integration/javascript-node
  • /wizard-ci basic-integration/javascript-web
  • /wizard-ci basic-integration/laravel
  • /wizard-ci basic-integration/next-js
  • /wizard-ci basic-integration/nuxt
  • /wizard-ci basic-integration/python
  • /wizard-ci basic-integration/rails
  • /wizard-ci basic-integration/react-native
  • /wizard-ci basic-integration/react-router
  • /wizard-ci basic-integration/sveltekit
  • /wizard-ci basic-integration/swift
  • /wizard-ci basic-integration/tanstack-router
  • /wizard-ci basic-integration/tanstack-start
  • /wizard-ci basic-integration/vue
  • /wizard-ci error-tracking-upload-source-maps/android
  • /wizard-ci error-tracking-upload-source-maps/flutter
  • /wizard-ci error-tracking-upload-source-maps/ios
  • /wizard-ci error-tracking-upload-source-maps/next
  • /wizard-ci error-tracking-upload-source-maps/next-no-posthog
  • /wizard-ci error-tracking-upload-source-maps/node-raw
  • /wizard-ci error-tracking-upload-source-maps/node-rollup
  • /wizard-ci error-tracking-upload-source-maps/node-rollup-typescript-plugin
  • /wizard-ci error-tracking-upload-source-maps/node-webpack
  • /wizard-ci error-tracking-upload-source-maps/nuxt-3-6
  • /wizard-ci error-tracking-upload-source-maps/nuxt-4-3
  • /wizard-ci error-tracking-upload-source-maps/react-native
  • /wizard-ci error-tracking-upload-source-maps/react-vite
  • /wizard-ci error-tracking-upload-source-maps/rust
  • /wizard-ci misc/quack-quack
  • /wizard-ci revenue/stripe

Results will be posted here when complete.

joethreepwood and others added 2 commits June 5, 2026 14:01
Addresses review feedback: the agent instructions (scan steps, abort
signals, report format) were hardcoded in the program's customPrompt —
"packaged English" that belongs in a context-mill skill, not the wizard.

- Collapse index.ts to a createSkillProgram() factory call; customPrompt
  is now a one-liner. The skill (loaded via skillId) drives the run.
- Delete wizard-side detection (detectPiiBouncerPrerequisites + the
  duplicated package walker a reviewer flagged). The skill detects
  prerequisites and emits [ABORT] signals; the wizard just routes them.
- Rename detect.ts -> abort-cases.ts, keeping only PII_BOUNCER_ABORT_CASES
  (terminal UX copy). The match regexes are the contract with the skill.

Wizard = engine + UX; context-mill skill = instructions. No product
knowledge left in the program.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Companion to the new warlock rules (posthog_pii_in_person_properties,
posthog_pii_value_in_tracking_call). The wizard ships an inline copy of
warlock's PII rules until it consumes warlock directly, so the patterns
are mirrored here to take effect today:

- pii_in_person_properties: sensitive PII in register/setPersonProperties
  (mirrors identify's "email/name OK, regulated PII not" split)
- pii_value_in_tracking_call: PII-shaped literal values (email/SSN/card)
  under any key — catches PII hidden behind innocuous property names

+10 scanner tests; warlock remains the source of truth.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
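To illustrate what "PII-shaped literal values under any key" means in the commit message above, here is a rough sketch that classifies a string literal by shape alone. The patterns, the Luhn check, and the `classifyLiteral` name are assumptions made for illustration; warlock's actual rules are not reproduced here.

```typescript
// Illustrative patterns only; these are not warlock's rule definitions.
type PiiKind = "email" | "ssn" | "card";

const EMAIL_SHAPE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const US_SSN_SHAPE = /^\d{3}-\d{2}-\d{4}$/;

// Luhn checksum: double every second digit from the right, subtract 9 when the
// doubled digit exceeds 9, and require the total to be a multiple of 10.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

// Classifies the value itself, ignoring the property name it is stored under.
function classifyLiteral(value: string): PiiKind | null {
  const trimmed = value.trim();
  if (EMAIL_SHAPE.test(trimmed)) return "email";
  if (US_SSN_SHAPE.test(trimmed)) return "ssn";
  const digits = trimmed.replace(/[ -]/g, "");
  if (/^\d{13,19}$/.test(digits) && passesLuhn(digits)) return "card";
  return null;
}
```

Because only the value is inspected, a property such as `plan: "jane@example.com"` would still be flagged even though the key looks innocuous.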
joethreepwood changed the title from "feat: add pii-bouncer wizard program scaffold" to "feat: PII Bouncer program (engine) + robust PII scanner rules" Jun 5, 2026
joethreepwood self-assigned this Jun 5, 2026

@sarahxsanders sarahxsanders left a comment

Collaborator

some little things, but this is solid!!!

Comment thread src/lib/yara-scanner.ts Outdated
],
};

// Mirror of warlock's `posthog_pii_in_person_properties`. The wizard ships an

Collaborator

the warlock should be set up and wired to the wizard by tomorrow, so this file will be getting dropped or changed to use the warlock! leaving as a note so we rebase

Collaborator

side note: remove tests too

Contributor Author

Good call — dropped the inline mirror in 0643776 rather than carry a hand-maintained copy. The two rules live only in warlock now (PostHog/warlock#33), so they'll reach the wizard through your warlock->wizard wiring. Left the pre-existing pii_in_capture_call rule untouched for you to fold into that same change.

Contributor Author

Removed — the 10 mirror tests are gone and the rule count is back to 15 (0643776).

Comment on lines +9 to +11
* The `match` regexes are the contract with the skill: the signals
* documented in the skill's SKILL.md (`no-posthog-js`, `no-init-call`,
* `no-frontend-templates`) must stay in sync with the patterns here.

Collaborator

might be worth a file-path breadcrumb here: the skill itself mentions the wizard, but neither side points at the actual file on the other. I worry there might be some drift here if we aren't explicit enough

something simple like: See: context-mill/transformation-config/skills/pii-bouncer/description.md (Abort statuses section) could suffice

Contributor Author

Added breadcrumbs in both directions (6a02b6d). This file's header now points at context-mill/transformation-config/skills/pii-bouncer/description.md (the Abort statuses section), and that section points back here — so the [ABORT] contract is explicit on both ends. Skill side: PostHog/context-mill#170.
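As a rough picture of the contract discussed in this thread, the sketch below routes an `[ABORT]` signal to terminal copy by regex. The three signal names come from the quoted header comment; the `AbortCase` shape, the exact regex form, and the messages are placeholders assumed for illustration, not the real PII_BOUNCER_ABORT_CASES.

```typescript
// Hypothetical shape; the real abort-case type is not shown in this thread.
type AbortCase = { match: RegExp; message: string };

// Placeholder copy. Only the signal names are taken from the PR.
const ABORT_CASES: AbortCase[] = [
  { match: /\[ABORT\]\s*no-posthog-js\b/, message: "posthog-js was not found in this project." },
  { match: /\[ABORT\]\s*no-init-call\b/, message: "No posthog.init(...) call or <PostHogProvider> was found." },
  { match: /\[ABORT\]\s*no-frontend-templates\b/, message: "No frontend templates were found." },
];

// Returns the first case whose pattern appears in the agent output, if any.
function routeAbort(agentOutput: string): AbortCase | undefined {
  return ABORT_CASES.find((abortCase) => abortCase.match.test(agentOutput));
}
```

If the skill renamed a signal without the wizard side changing, `routeAbort` would return `undefined` for it, which is the kind of silent drift the breadcrumbs are meant to make visible.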

Comment thread src/lib/programs/pii-bouncer/index.ts Outdated
kind: OutroKind.Success as const,
message: 'PII Bouncer finished',
reportFile: REPORT_FILE,
changes: [],

Collaborator

since we are adding ph-no-capture we should probably populate this with the modified file list the way the other file-editing programs do. we want to be transparent when a run finishes what we've edited and this helps us also enhance our privacy and transparency posture

Contributor Author

Done in 6a02b6d. buildOutroData now lists the actual edited files instead of []. A small edit-tracker (src/lib/edit-tracker.ts) records the agent's real Write/Edit/MultiEdit paths via a PostToolUse hook, so the outro reflects what hit disk — not what the agent claims it did, which felt like the right posture for a privacy tool. Paths are relativised to the project and the report file is filtered out; full per-edit detail still lives in posthog-pii-bouncer-report.md.
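The path handling described in this reply (relativise to the project, drop the report file) could look roughly like the sketch below. `listEditedFiles` is a hypothetical helper name and the real `buildOutroData` signature is not shown in this thread; POSIX-style absolute paths are assumed.

```typescript
import * as path from "node:path";

// Report file name as given in the PR's test plan.
const REPORT_FILE = "posthog-pii-bouncer-report.md";

// Hypothetical helper: turns absolute edited paths into a sorted, de-duplicated
// list of project-relative paths, excluding the report the run itself writes.
function listEditedFiles(projectRoot: string, editedPaths: string[]): string[] {
  const relativePaths = editedPaths.map((p) => path.posix.relative(projectRoot, p));
  const withoutReport = relativePaths.filter((p) => p !== REPORT_FILE);
  return Array.from(new Set(withoutReport)).sort();
}
```

For example, with a project root of `/app`, edits to `/app/src/Form.tsx` and `/app/posthog-pii-bouncer-report.md` would be listed as just `src/Form.tsx`.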

joethreepwood added a commit to PostHog/context-mill that referenced this pull request Jun 8, 2026
Review feedback (PostHog/wizard#510): the [ABORT] signals are a contract
with the wizard, but neither side named the other's file. Adds a breadcrumb
to the consuming side (src/lib/programs/pii-bouncer/abort-cases.ts); the
wizard side now points back here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
joethreepwood and others added 2 commits June 8, 2026 11:57
…umbs

Review feedback (#510):

- Outro now lists the actual files the run modified, instead of an empty
  changes array. A small edit-tracker observes the agent's Write/Edit/
  MultiEdit calls (what hit disk, not what the agent claimed) via a
  PostToolUse hook registered alongside the YARA hooks. buildOutroData
  relativises the paths, drops the report file, and lists them — honest
  transparency for a privacy tool. Per-edit detail still lives in the report.

- abort-cases.ts and the context-mill skill now point at each other by
  file path, so the [ABORT] signal contract can't drift silently.

+6 edit-tracker tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Per review (#510): warlock is being wired to the wizard
directly, so the inline copy of the rules is throwaway. The two rules
live in warlock (PostHog/warlock#33) — the source of truth — and reach
the wizard through that wiring, not an inline mirror. Removes the mirrored
rules and their tests; rule count back to 15.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
joethreepwood added a commit to PostHog/context-mill that referenced this pull request Jun 8, 2026
Review feedback (PostHog/wizard#510 skill):

- React projects often have no literal posthog.init(...) — config goes
  through <PostHogProvider options={{…}}>. The skill now detects either,
  writes masking into whichever it finds, and only aborts (no-init-call)
  when neither exists. Avoids a false abort on valid React apps.

- Word-bound the name/id heuristic so `pass` no longer matches `passenger`,
  `pin` no longer matches `spinner`, etc. — less over-masking, less noise.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
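One way to picture the word-bounding change from the commit above is whole-token matching on a field's name or id. The sketch below is an assumption about how such a check could work, not the skill's actual heuristic: the keyword list beyond `pass` and `pin` is invented for illustration, and `looksSensitive` is a hypothetical name.

```typescript
// Hypothetical keyword list; only "pass" and "pin" are mentioned in the commit.
const SENSITIVE_TOKENS = new Set<string>(["pass", "password", "pin", "ssn", "cvv"]);

// Splits a name or id such as "cardPin", "user_pin", or "billing-ssn" into
// lowercase tokens, breaking on camelCase boundaries and non-alphanumerics.
function tokenize(fieldName: string): string[] {
  return fieldName
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// A field is flagged only when a whole token matches, never a substring.
function looksSensitive(fieldName: string): boolean {
  return tokenize(fieldName).some((token) => SENSITIVE_TOKENS.has(token));
}
```

Tokenizing first avoids a pitfall of a plain `\b` regex, which treats `_` as a word character and so would miss `user_pin`.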
The skill now accepts a React <PostHogProvider> as a valid init site, not
just posthog.init(...). Update the abort outro copy so it matches what the
skill actually looks for.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
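The "either init site is valid" decision can be sketched as below. The skill is a set of agent instructions rather than code, so this is only an illustration of the logic under that caveat; the regexes and the `findInitSites` and `shouldAbortNoInitCall` names are assumptions.

```typescript
type InitSite = "init-call" | "provider";

// Looks for a literal posthog.init( call and for a <PostHogProvider element.
function findInitSites(source: string): InitSite[] {
  const sites: InitSite[] = [];
  if (/\bposthog\.init\s*\(/.test(source)) sites.push("init-call");
  if (/<PostHogProvider\b/.test(source)) sites.push("provider");
  return sites;
}

// no-init-call applies only when no scanned file contains either init site.
function shouldAbortNoInitCall(sources: string[]): boolean {
  return sources.every((source) => findInitSites(source).length === 0);
}
```

A React app that configures PostHog only through the provider would therefore not trigger the abort.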
@joethreepwood

Contributor Author

Thanks for the review @sarahxsanders — all addressed:

  • Instructions out of the wizard — the program is now a createSkillProgram() call; the scan/mask/report methodology lives in the context-mill skill (feat: add pii-bouncer skill context-mill#170). (085bb5f)
  • Outro lists edited files — instead of changes: [], a small edit-tracker observes the agent's actual Write/Edit/MultiEdit calls and the outro lists what was touched (what hit disk, not self-reported). (6a02b6d)
  • Cross-repo breadcrumb: abort-cases.ts and the skill's Abort statuses section now point at each other so the [ABORT] contract can't drift. (6a02b6d)
  • Dropped the inline yara mirror + its tests — warlock is the source of truth (feat: add two robust PostHog PII rules warlock#33) and reaches the wizard via the warlock→wizard wiring, not a hand-maintained copy. Rule count back to 15. (0643776)
  • no-init-call copy — broadened to mention <PostHogProvider>, matching the skill now accepting it as a valid init site. (dd17e38)

817 tests green, lint clean. Rebased onto the latest main.
