Optimize speed, part 2 #41

Merged
woct0rdho merged 5 commits into peteromallet:main from woct0rdho:ac-automaton
May 2, 2026

@woct0rdho
Collaborator

@woct0rdho woct0rdho commented May 2, 2026

Before rewriting it in a compiled language, let's see how much speedup we can get within Python. This PR uses the Aho-Corasick automaton to match multiple patterns at once in secret redaction. On my machine this improves the speed of secret redaction by 50%. pyahocorasick is added as a dependency.
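To illustrate the idea, here is a minimal pure-Python sketch of Aho-Corasick matching applied to redaction. It is not the code from this PR: the PR relies on pyahocorasick, while this sketch builds the automaton by hand so that it stays self-contained. The function names (`build_automaton`, `find_matches`, `redact`) and the `[REDACTED]` placeholder are assumptions made for the example.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie transitions, failure links, and per-state output lists."""
    goto = [{}]  # goto[state][char] -> next state
    fail = [0]   # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]   # out[state] -> patterns that end at this state

    for pattern in patterns:
        if not pattern:
            continue  # an empty pattern would match everywhere
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Breadth-first order guarantees shallower states are finished first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit patterns that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_matches(text, automaton):
    """Return (start, end, pattern) for every occurrence, in one scan of text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            matches.append((i - len(pattern) + 1, i + 1, pattern))
    return matches


def redact(text, patterns, placeholder="[REDACTED]"):
    """Replace every occurrence of any pattern with the placeholder."""
    automaton = build_automaton(patterns)
    spans = sorted((start, end) for start, end, _ in find_matches(text, automaton))

    # Merge overlapping or touching spans so each region is replaced once.
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    pieces, cursor = [], 0
    for start, end in merged:
        pieces.append(text[cursor:start])
        pieces.append(placeholder)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

The text is scanned once regardless of how many patterns there are, whereas searching for each pattern separately rescans the text per pattern. In a real redactor the automaton would be built once and reused across inputs rather than rebuilt inside `redact`.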

Anonymization is now done in one pass after parsing, the same way secret redaction is done. This also makes it easier to implement new providers, because they no longer need to call the anonymizer everywhere.
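A rough sketch of what a single post-parse pass could look like, assuming parsers return plain dicts, lists, and strings. Everything here is hypothetical and only illustrates the structure: the names `anonymize_text`, `anonymize_tree`, `parse_session`, and `export_session`, the record layout, and the home-directory rule are not taken from this repository.

```python
import re

# Hypothetical anonymization rule: hide the user name in home-directory paths.
USER_PATH = re.compile(r"/home/[^/\s]+")


def anonymize_text(text):
    """Anonymize a single string (stand-in for the real anonymizer)."""
    return USER_PATH.sub("/home/user", text)


def anonymize_tree(node):
    """Walk parsed data once and anonymize every string value in it."""
    if isinstance(node, str):
        return anonymize_text(node)
    if isinstance(node, dict):
        return {key: anonymize_tree(value) for key, value in node.items()}
    if isinstance(node, list):
        return [anonymize_tree(item) for item in node]
    return node  # numbers, booleans, None pass through unchanged


def parse_session(raw_lines):
    """Hypothetical provider parser: it makes no anonymizer calls at all."""
    return {
        "messages": [
            {"index": i, "content": line} for i, line in enumerate(raw_lines)
        ]
    }


def export_session(raw_lines):
    """Parse first, then anonymize the whole result in one pass."""
    return anonymize_tree(parse_session(raw_lines))
```

With this shape, a new provider only has to produce parsed data, and the shared pass covers every string field it emits.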

@woct0rdho woct0rdho merged commit 8649935 into peteromallet:main May 2, 2026
5 checks passed
@woct0rdho woct0rdho deleted the ac-automaton branch May 2, 2026 06:49

1 participant