feat: mask sensitive data inside objects and URLs in code variables #688
Merged
Contributor
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
posthog/exception_utils.py:1083-1096
**`mask_url_credentials` silently inert when `mask_patterns` is empty**
The early return `if not compiled_mask: return value` means URL credential scrubbing is bypassed entirely whenever `compiled_mask` is `None` — which happens when `mask_patterns=[]`. The same guard appears in `_serialize_variable_value` (`elif compiled_mask and mask_url_credentials:`), so a user who explicitly disables name-based masks but still expects URL credentials to be scrubbed gets no protection. The two features are advertised as independent toggles but share a single gate.
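A minimal sketch of one way to decouple the two gates, so that URL scrubbing still runs when no name patterns are configured. The names `serialize_value`, `redact_url_credentials` and `MASK` are hypothetical stand-ins, not the actual functions in `posthog/exception_utils.py`, and the regex is a simplified assumption about what counts as a credentialed URL.

```python
import re
from typing import Any, Optional, Pattern

MASK = "***"  # hypothetical placeholder; the SDK's real marker may differ

# scheme:// followed by userinfo up to the last "@" before a "/" or whitespace.
_USERINFO = re.compile(r"([a-zA-Z][a-zA-Z0-9+.\-]*://)[^/\s]+@")


def redact_url_credentials(value: str) -> str:
    """Replace the whole user:pass part of a URL with MASK."""
    return _USERINFO.sub(lambda m: f"{m.group(1)}{MASK}@", value)


def serialize_value(
    name: str,
    value: Any,
    compiled_mask: Optional[Pattern[str]],
    mask_url_credentials: bool = True,
) -> Any:
    # Gate 1: name-based masking only applies when patterns were supplied.
    if compiled_mask is not None and compiled_mask.search(name):
        return MASK
    # Gate 2: URL scrubbing is checked on its own, not behind compiled_mask.
    if mask_url_credentials and isinstance(value, str):
        return redact_url_credentials(value)
    return value
```

With this shape, `mask_patterns=[]` (so `compiled_mask` is `None`) disables only the name-based check; the URL toggle keeps working independently.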
### Issue 2 of 2
posthog/test/test_exception_capture.py:984-1002
**Prefer `@pytest.mark.parametrize` for multi-case unit tests**
`test_redact_url_credentials` bundles four distinct input/output assertions in a single test body. Per the team convention, these cases should be expressed as separate parametrize entries so each case gets its own pass/fail signal and name. The same applies to `test_mask_url_credentials_can_be_toggled` (two cases: enabled vs disabled) and the inline assertions inside `test_compile_patterns_fast_path_and_regex_fallback`.
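As an illustration of the requested shape, here is a sketch of a parametrized version of `test_redact_url_credentials`. The `redact_url_credentials` helper below is a simplified, hypothetical stand-in for the function under test, and the four cases are invented for the example rather than copied from the PR.

```python
import re

import pytest

_USERINFO = re.compile(r"([a-zA-Z][a-zA-Z0-9+.\-]*://)[^/\s]+@")


def redact_url_credentials(value: str) -> str:
    # Hypothetical stand-in for the helper under test.
    return _USERINFO.sub(lambda m: f"{m.group(1)}***@", value)


@pytest.mark.parametrize(
    ("raw", "expected"),
    [
        pytest.param(
            "postgres://user:secret@db.example.com/app",
            "postgres://***@db.example.com/app",
            id="postgres-dsn",
        ),
        pytest.param(
            "redis://:pw@cache:6379/0",
            "redis://***@cache:6379/0",
            id="redis-password-only",
        ),
        pytest.param(
            "https://example.com/path",
            "https://example.com/path",
            id="url-without-credentials",
        ),
        pytest.param("no url here", "no url here", id="plain-string"),
    ],
)
def test_redact_url_credentials(raw: str, expected: str) -> None:
    assert redact_url_credentials(raw) == expected
```

Each `id` shows up in the pytest report, so a failure points at one named case instead of the whole test body.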
Reviews (1): Last reviewed commit: "feat: mask sensitive data inside objects..."
Contributor
posthog-python Compliance Report
Date: 2026-06-22 07:27:30 UTC
✅ All Tests Passed! 45/45 tests passed
Capture Tests: ✅ 29/29 tests passed
Feature_Flags Tests: ✅ 16/16 tests passed
Contributor
Reviews (2): Last reviewed commit: "fix: comments"
hpouillot
reviewed
Jun 20, 2026
45e9e9e to f8c8aea
f8c8aea to 2de7852
hpouillot
reviewed
Jun 21, 2026
2de7852 to 5ab23b5
hpouillot
approved these changes
Jun 22, 2026
Overview
It's faster. Much faster on nasty inputs (~57× on the worst case, comfortably sub-1ms) and faster than where we started even on ordinary exceptions. We compile the masking patterns once and cache them, fast-path the common value types, and collapse all the secret-name matching into a single regex pass.
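A rough sketch of the "compile once, cache, single regex pass" idea described above. It assumes the mask patterns are plain name fragments that can be escaped and joined into one alternation; `compile_mask` and `is_sensitive_name` are hypothetical names, and the real implementation may use a different fast path.

```python
import re
from functools import lru_cache
from typing import Optional, Pattern, Tuple


@lru_cache(maxsize=32)
def compile_mask(patterns: Tuple[str, ...]) -> Optional[Pattern[str]]:
    """Join all secret-name fragments into one case-insensitive regex.

    The tuple argument is hashable, so lru_cache returns the same compiled
    object for repeated calls with the same patterns.
    """
    if not patterns:
        return None
    alternation = "|".join(re.escape(p) for p in patterns)
    return re.compile(alternation, re.IGNORECASE)


def is_sensitive_name(name: str, patterns: Tuple[str, ...]) -> bool:
    # One regex search per variable name instead of one per pattern.
    compiled = compile_mask(patterns)
    return compiled is not None and compiled.search(name) is not None
```

Usage: `is_sensitive_name("API_TOKEN", ("password", "token"))` is true, while `is_sensitive_name("username", ("password", "token"))` is false.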
Slightly stricter limits: it's best-effort. If some exception has one bazillion triple-nested variables and crazy collections, we just stop very early instead of grinding through all of it. Hard caps on depth, collection width, total nodes, and string length.
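The four caps can be pictured with a small bounded walk. This is only a sketch: the limit values, the `Budget` class, the `summarize` function and the `<truncated>` marker are all assumptions chosen for illustration, not the SDK's actual numbers or names.

```python
from itertools import islice
from typing import Any, Optional

MAX_DEPTH = 3      # assumed cap on nesting depth
MAX_WIDTH = 5      # assumed cap on items taken per collection
MAX_NODES = 50     # assumed cap on values visited in total
MAX_STRING = 20    # assumed cap on string length
TRUNCATED = "<truncated>"


class Budget:
    """Mutable node counter shared across the whole traversal."""

    def __init__(self, nodes: int = MAX_NODES) -> None:
        self.nodes = nodes


def summarize(value: Any, depth: int = 0, budget: Optional[Budget] = None) -> Any:
    if budget is None:
        budget = Budget()
    budget.nodes -= 1
    # Stop early once the node budget or the depth limit is exceeded.
    if budget.nodes < 0 or depth > MAX_DEPTH:
        return TRUNCATED
    if value is None or isinstance(value, (bool, int, float)):
        return value
    if isinstance(value, str):
        return value if len(value) <= MAX_STRING else value[:MAX_STRING] + "..."
    if isinstance(value, dict):
        # islice avoids materializing a huge dict just to drop most of it.
        return {
            str(k): summarize(v, depth + 1, budget)
            for k, v in islice(value.items(), MAX_WIDTH)
        }
    if isinstance(value, (list, tuple, set, frozenset)):
        return [summarize(v, depth + 1, budget) for v in islice(value, MAX_WIDTH)]
    # Anything else is reported by type name only; no repr() is called.
    return f"<{type(value).__name__}>"
```

The shared `Budget` is what makes the node cap global: every branch draws from the same counter, so a pathological structure cannot spend the limit once per subtree.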
Connection URL detection. We scrub `user:pass@host` credentials out of DSNs / connection strings (`postgres://…`, `redis://…`, etc.), so a URL sitting in a local variable can't leak its password.
Much more support for all the crazy ways a secret can end up in the final event. Secrets don't only live in plain attributes, so we now also catch sensitively-named `@property`, `cached_property`, `__slots__`, class-level attributes, descriptors and namedtuple fields, plus weird non-string dict keys. We never call a getter, and never trust a custom `__repr__` that could rename a field out of the mask: when in doubt, fail closed.
Brand new, ultra clean tests. Rewrote the entire suite. One idea per assertion.
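To make the "never call a getter" point concrete, here is a sketch of how sensitively named attributes can be discovered from class and instance namespaces alone. The `SENSITIVE` pattern, `sensitive_attribute_names` and the `Client` demo class are hypothetical; the PR's real detection logic is not shown in this page and may work differently.

```python
import re
from collections import namedtuple
from typing import Any, List

SENSITIVE = re.compile(r"password|secret|token", re.IGNORECASE)  # assumed patterns


def sensitive_attribute_names(obj: Any) -> List[str]:
    """List sensitively named attributes without evaluating any of them."""
    names = set()
    try:
        # Bypass any custom __getattribute__ / __getattr__ on the object.
        names.update(object.__getattribute__(obj, "__dict__"))
    except AttributeError:
        pass  # __slots__-only instances and namedtuples have no __dict__
    for klass in type(obj).__mro__:
        # Class namespaces hold properties, cached_property objects,
        # slot descriptors, namedtuple fields and class-level attributes.
        names.update(vars(klass))
    return sorted(
        n for n in names
        if not (n.startswith("__") and n.endswith("__")) and SENSITIVE.search(n)
    )


class Client:
    __slots__ = ("host", "auth_token")

    def __init__(self) -> None:
        self.host = "db.example.com"
        self.auth_token = "t0ps3cret"

    @property
    def api_secret(self) -> str:
        # Proves the scan never runs the getter.
        raise RuntimeError("getter must not be called")


Creds = namedtuple("Creds", ["user", "password"])
```

Because only names are read, a property that raises (or has side effects) is still reported and can be masked by name, which matches the fail-closed stance described above.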
Benchmark
Examples
Benchmark vs `main` (columns: `main`, current)