feat: add two robust PostHog PII rules by joethreepwood · Pull Request #33 · PostHog/warlock

joethreepwood · 2026-06-05T13:20:30Z

Summary

Hardens PII detection beyond the single posthog_pii_in_capture_call rule. Sarah flagged it could be more robust — these fill the two biggest gaps: more call-sites and value-shaped detection.

`posthog_pii_in_person_properties`

Sensitive PII passed to person-property call-sites: register, register_once, setPersonProperties, setPersonPropertiesForFlags. Mirrors the identify() half of the existing rule — email and names are standard, expected person properties (NOT flagged); only regulated PII is (SSN, DOB, financial, government ID, medical).

`posthog_pii_value_in_tracking_call`

PII-shaped literal values, regardless of key name — catches PII hidden under innocuous keys like posthog.capture('signup', { referrer: 'jane@example.com' }), which the key-name rule misses. Matches:

email literal — capture() only (an email value in identify()/person-properties is the standard, correct pattern)
US SSN (123-45-6789) in any tracking call
separator-grouped card number (4111 1111 1111 1111) in any tracking call

Conventions followed

One rule per file, filename === rule name; all six meta fields (description, remediation, severity, category, action, scan_context)
category = posthog_pii, severity = high, action = remediate, scan_context = output (matches the existing PII rule)
No regex lookarounds (YARA-X rejects them); same first-{}-object FP-avoidance scoping as the sibling rule
Deliberate precision over recall to limit false positives: card detection requires separators (YARA-X can't run a Luhn check, so an unseparated 16-digit run is intentionally not matched)

Test plan

19 new tests (positive / negative / metadata per rule) under __tests__/rules/
pnpm test — full suite green (535 tests, 32 files)
Negative coverage locks in the FP guards: email-in-identify, name in person properties, unseparated 16-digit runs, and capture-owned PII not double-firing the person-property rule

Note

Independent of the in-flight roin-orca-rules branch (it doesn't touch PII rules). The PostHog wizard ships an inline mirror of these patterns (PostHog/wizard#510) until it consumes warlock directly.

🤖 Generated with Claude Code

Extends PII coverage beyond the single posthog_pii_in_capture_call rule (which matches PII by key name in capture/identify): - posthog_pii_in_person_properties: sensitive PII passed to person-property call-sites (register / register_once / setPersonProperties / setPersonPropertiesForFlags). Mirrors the identify() split — email and names are standard person properties and are NOT flagged; only regulated PII (SSN, DOB, financial, gov ID, medical) is. - posthog_pii_value_in_tracking_call: PII-shaped literal VALUES (email, US SSN, separator-grouped card number) under any property key — catches PII hidden behind innocuous keys like { referrer: 'jane@example.com' }. Both follow house conventions: one rule per file, all six meta fields, no regex lookarounds, same first-object FP-avoidance scoping as the sibling rule. Email values are flagged in capture() only (email is valid in identify/person-properties). Card detection requires separators since YARA-X can't run a Luhn check. 19 tests; full suite green (535). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

sarahxsanders

one missing test then you're a warlock harry >:)

sarahxsanders · 2026-06-08T02:56:30Z

+//   - email: real address shape, and only in capture() — an email value in
+//     identify()/person-properties is the standard, correct pattern.


tiny nit. this rule only catches emails written as literal strings, not when passed as a variable. it's fine as a limit if that's intended, but might be worth adding a comment here for context:

// Known limit: only catches quoted email values, not variables

triage will catch most of this at runtime but it could potentially be a gap and having the context for robots and humans is helpful!

Added in 7ffc998 — a 'Known limit' note in the rule header: only catches quoted/literal values, not variables (e.g. { email: userEmail }); runtime triage covers most of those, this rule is the static-analysis backstop for hardcoded leaks.

sarahxsanders · 2026-06-08T02:58:02Z

+    });
+  });
+
+  describe('negative cases – should NOT match', () => {


rule 1 has a test that confirms it correctly doesn't fire on { $set: { ssn: '...' } }

rule 2 behaves the same way but there's no test for it. could we add a matching negative test? something like

it('does NOT match an SSN value nested inside a $set object', async () => { await expectRuleDidNotMatch( `posthog.setPersonProperties({ $set: { ssn: '123-45-6789' } })`, RULE, ); });

Added your suggested test verbatim in 7ffc998 — confirms the value rule stays silent on setPersonProperties({ $set: { ssn: '123-45-6789' } }). The nested object isn't matched because the [^{}]* scope stops at the inner brace, same FP-avoidance as rule 1. Suite green (536).

Per review (#510): warlock is being wired to the wizard directly, so the inline copy of the rules is throwaway. The two rules live in warlock (PostHog/warlock#33) — the source of truth — and reach the wizard through that wiring, not an inline mirror. Removes the mirrored rules and their tests; rule count back to 15. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

joethreepwood · 2026-06-08T11:04:46Z

Update: the wizard-side inline mirror referenced in the PR description was dropped (PostHog/wizard#510) at review — the wizard will consume these rules through the warlock→wizard scanner wiring instead of a hand-maintained copy. These two rules remain the source of truth here.

Review feedback (#33): - Document that posthog_pii_value_in_tracking_call only catches quoted/ literal values, not variables (runtime triage covers the rest). - Add a negative test confirming the rule does not fire on an SSN nested inside a $set object — matching the coverage the person-properties rule already has. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

joethreepwood · 2026-06-08T12:10:18Z

Thanks @sarahxsanders — both addressed in 7ffc998:

Literal-only limit documented — added a // Known limit: note to posthog_pii_value_in_tracking_call: it only catches quoted/literal values, not variables (e.g. { email: userEmail }); runtime triage covers most of those, this rule is the static-analysis backstop for hardcoded leaks.
Nested $set negative test — added your suggested test verbatim; confirms the value rule stays silent on setPersonProperties({ $set: { ssn: '…' } }), matching the coverage the person-properties rule already had.

Full suite green (536). Heads up: the wizard-side inline mirror these rules referenced was dropped at review (PostHog/wizard#510) — the wizard will consume them through the warlock→wizard wiring instead.

joethreepwood mentioned this pull request Jun 5, 2026

feat: PII Bouncer program (engine) + robust PII scanner rules PostHog/wizard#510

Draft

6 tasks

joethreepwood self-assigned this Jun 5, 2026

joethreepwood requested a review from sarahxsanders June 5, 2026 13:25

sarahxsanders mentioned this pull request Jun 5, 2026

Remove WIZARD_DOC_BASENAMES PII band-aid after warlock PII rules ship PostHog/wizard#531

Open

sarahxsanders requested changes Jun 8, 2026

View reviewed changes

joethreepwood requested a review from sarahxsanders June 8, 2026 12:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add two robust PostHog PII rules#33

feat: add two robust PostHog PII rules#33
joethreepwood wants to merge 2 commits into
mainfrom
posthog-pii-rules

joethreepwood commented Jun 5, 2026

Uh oh!

sarahxsanders left a comment

Uh oh!

sarahxsanders Jun 8, 2026

Uh oh!

joethreepwood Jun 8, 2026

Uh oh!

sarahxsanders Jun 8, 2026

Uh oh!

joethreepwood Jun 8, 2026

Uh oh!

joethreepwood commented Jun 8, 2026

Uh oh!

joethreepwood commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		// - email: real address shape, and only in capture() — an email value in
		// identify()/person-properties is the standard, correct pattern.

Conversation

joethreepwood commented Jun 5, 2026

Summary

posthog_pii_in_person_properties

posthog_pii_value_in_tracking_call

Conventions followed

Test plan

Note

Uh oh!

sarahxsanders left a comment

Choose a reason for hiding this comment

Uh oh!

sarahxsanders Jun 8, 2026

Choose a reason for hiding this comment

Uh oh!

joethreepwood Jun 8, 2026

Choose a reason for hiding this comment

Uh oh!

sarahxsanders Jun 8, 2026

Choose a reason for hiding this comment

Uh oh!

joethreepwood Jun 8, 2026

Choose a reason for hiding this comment

Uh oh!

joethreepwood commented Jun 8, 2026

Uh oh!

joethreepwood commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`posthog_pii_in_person_properties`

`posthog_pii_value_in_tracking_call`