Skip to content

markers: replace regexp with manual scanning in Redact()#36

Merged
dhartunian merged 1 commit into
masterfrom
davidh/unify-redact-impl
Mar 13, 2026
Merged

markers: replace regexp with manual scanning in Redact()#36
dhartunian merged 1 commit into
masterfrom
davidh/unify-redact-impl

Conversation

@dhartunian

@dhartunian dhartunian commented Mar 11, 2026

Copy link
Copy Markdown
Contributor

Replace the regexp-based implementation of RedactableString.Redact() and
RedactableBytes.Redact() with manual index-based scanning using
bytes.Index and a shared redactBytes() implementation.

The new implementation:

  • Scans for start markers (‹) using Index, then finds the closing marker (›)
  • Distinguishes hash markers (‹†...›) from regular markers (‹...›) inline
  • Appends hashes directly to the output buffer via appendHash, avoiding
    intermediate string allocations
  • Returns early on the fast path when no markers are present
  • Unifies string and byte Redact() into a single redactBytes() function
  • Adds a string-level early return in RedactableString.Redact() to avoid
    []byte conversion allocations when no markers are present

Also adds an exhaustive table-driven test (TestRedact) covering edge cases
for both RedactableString and RedactableBytes including empty inputs,
unclosed markers, unicode content, and hash marker variations.

Benchmark results (Apple M1 Pro):

name                              old time/op    new time/op    delta
RedactCall_RegularRedaction-10      243ns ± 2%      62ns ± 1%  -74.48%  (p=0.000 n=8+8)
RedactCall_HashEnabled-10           376ns ± 1%     162ns ± 4%  -56.88%  (p=0.000 n=8+8)
RedactCall_HashWithSalt-10          447ns ± 2%     222ns ± 4%  -50.30%  (p=0.000 n=8+8)

name                              old alloc/op   new alloc/op   delta
RedactCall_HashEnabled-10           161B ± 0%       64B ± 0%   -60.25%  (p=0.000 n=8+8)
RedactCall_HashWithSalt-10          161B ± 0%       64B ± 0%   -60.25%  (p=0.000 n=8+8)
RedactCall_RegularRedaction-10      104B ± 0%       64B ± 0%   -38.46%  (p=0.000 n=8+8)

name                              old allocs/op  new allocs/op  delta
RedactCall_HashEnabled-10           8.00 ± 0%      2.00 ± 0%   -75.00%  (p=0.000 n=8+8)
RedactCall_HashWithSalt-10          8.00 ± 0%      2.00 ± 0%   -75.00%  (p=0.000 n=8+8)
RedactCall_RegularRedaction-10      5.00 ± 0%      2.00 ± 0%   -60.00%  (p=0.000 n=8+8)

Co-Authored-By: roachdev-claude roachdev-claude-bot@cockroachlabs.com


This change is Reviewable

@cockroachlabs-cla-agent

cockroachlabs-cla-agent Bot commented Mar 11, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@visheshbardia

Copy link
Copy Markdown
Contributor

Nice! Thanks for working on this.
Left a few comments below.

@visheshbardia

Copy link
Copy Markdown
Contributor

NIT: The ReStripSensitive var in constants.go is not being used anymore. So, it could be removed.

Comment thread internal/markers/markers.go
@dhartunian dhartunian force-pushed the davidh/unify-redact-impl branch from 9a95ff6 to 3f7b73a Compare March 12, 2026 19:35
Replace the regexp-based implementation of RedactableString.Redact() and
RedactableBytes.Redact() with manual index-based scanning using
bytes.Index and a shared redactBytes() implementation.

The new implementation:
- Scans for start markers (‹) using Index, then finds the closing marker (›)
- Distinguishes hash markers (‹†...›) from regular markers (‹...›) inline
- Appends hashes directly to the output buffer via appendHash, avoiding
  intermediate string allocations
- Returns early on the fast path when no markers are present
- Unifies string and byte Redact() into a single redactBytes() function
- Adds a string-level early return in RedactableString.Redact() to avoid
  []byte conversion allocations when no markers are present

Also adds an exhaustive table-driven test (TestRedact) covering edge cases
for both RedactableString and RedactableBytes including empty inputs,
unclosed markers, unicode content, and hash marker variations.

Benchmark results (Apple M1 Pro):

name                              old time/op    new time/op    delta
RedactCall_RegularRedaction-10      243ns ± 2%      62ns ± 1%  -74.48%  (p=0.000 n=8+8)
RedactCall_HashEnabled-10           376ns ± 1%     162ns ± 4%  -56.88%  (p=0.000 n=8+8)
RedactCall_HashWithSalt-10          447ns ± 2%     222ns ± 4%  -50.30%  (p=0.000 n=8+8)

name                              old alloc/op   new alloc/op   delta
RedactCall_HashEnabled-10           161B ± 0%       64B ± 0%   -60.25%  (p=0.000 n=8+8)
RedactCall_HashWithSalt-10          161B ± 0%       64B ± 0%   -60.25%  (p=0.000 n=8+8)
RedactCall_RegularRedaction-10      104B ± 0%       64B ± 0%   -38.46%  (p=0.000 n=8+8)

name                              old allocs/op  new allocs/op  delta
RedactCall_HashEnabled-10           8.00 ± 0%      2.00 ± 0%   -75.00%  (p=0.000 n=8+8)
RedactCall_HashWithSalt-10          8.00 ± 0%      2.00 ± 0%   -75.00%  (p=0.000 n=8+8)
RedactCall_RegularRedaction-10      5.00 ± 0%      2.00 ± 0%   -60.00%  (p=0.000 n=8+8)

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@dhartunian dhartunian force-pushed the davidh/unify-redact-impl branch from 3f7b73a to 8dc18d3 Compare March 12, 2026 19:38

@visheshbardia visheshbardia left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@visheshbardia made 1 comment.
Reviewable status: 0 of 5 files reviewed, 1 unresolved discussion (waiting on aa-joshi).

@dhartunian

Copy link
Copy Markdown
Contributor Author

TFTR!

@dhartunian dhartunian merged commit 45c49b2 into master Mar 13, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants