fix(hooks): stop marker-directory leaks and harden toolArgs parsing#966
Open
magicpro97 wants to merge 1 commit into
Open
fix(hooks): stop marker-directory leaks and harden toolArgs parsing#966magicpro97 wants to merge 1 commit into
magicpro97 wants to merge 1 commit into
Conversation
Three related hook robustness bugs, all in the hooks/ module:
1. hook_runner.py: hook-dedup-* markers were never pruned, leaking one
file per unique tool call (observed 134k+ files) and slowing every
hook. Add time-gated _prune_stale_dedup_markers() (5s TTL, swept at
most once/60s, fail-open).
2. rules/tentacle.py: _read_edits() re-stamped legacy edit entries with
now() on every read, so the 24h TTL never fired and the legacy bucket
permanently triggered false-positive TENTACLE REQUIRED blocks. Stamp
legacy entries with the marker file mtime instead.
3. rules/error_kb.py: ErrorFixNudgeRule crashed with "'str' object has
no attribute 'get'" (367 logged) when toolArgs was a string. Normalize
toolArgs: parse JSON strings, coerce other non-dicts to {}.
Adds regression tests for all three. Scoped to the Python hook path;
native-Rust hook parity for bugs 1 and 3 is a follow-up.
Closes #963
Closes #964
Closes #965
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens the Python hooks/ subsystem against marker-file accumulation and unexpected toolArgs shapes, based on issues observed on live installs (dedup marker leaks, tentacle legacy TTL never expiring, and ErrorFixNudgeRule crashing on string toolArgs).
Changes:
- Add time-gated pruning of stale
hook-dedup-*markers inhooks/hook_runner.pyto prevent unbounded marker directory growth. - Fix tentacle legacy edit migration to stamp entries with marker file
mtimeso TTL expiry works as intended. - Normalize
toolArgsinErrorFixNudgeRuleto avoid crashes whentoolArgsis a string, with regression tests added.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| hooks/hook_runner.py | Adds coarse, fail-open sweep to prune stale dedup markers and reduce marker-directory growth over time. |
| hooks/rules/tentacle.py | Uses marker mtime (not now()) when migrating legacy edit entries so TTL can expire stale entries. |
| hooks/rules/error_kb.py | Normalizes toolArgs to a dict to avoid .get() crashes and keep learn-marker clearing working. |
| tests/test_hooks.py | Adds regression coverage for legacy tentacle marker timestamping and TTL pruning behavior. |
| tests/test_hook_runner_entrypoints.py | Adds regression coverage asserting dedup markers are pruned while preserving fresh markers. |
| tests/test_hook_rules_more.py | Adds regression coverage for ErrorFixNudgeRule handling string and JSON-string toolArgs. |
Comment on lines
+135
to
+146
| # toolArgs may arrive as a dict, a JSON string, or be absent depending on | ||
| # the runtime. Normalise to a dict so the later .get() calls never crash | ||
| # with "'str' object has no attribute 'get'". | ||
| tool_args = data.get("toolArgs", {}) | ||
| if isinstance(tool_args, str): | ||
| try: | ||
| parsed = json.loads(tool_args) | ||
| tool_args = parsed if isinstance(parsed, dict) else {} | ||
| except Exception: | ||
| tool_args = {} | ||
| elif not isinstance(tool_args, dict): | ||
| tool_args = {} |
Comment on lines
+112
to
+116
| # Legacy: flat set of file paths → migrate with the marker's last-modified | ||
| # time (NOT the current time). Stamping with now() on every read would | ||
| # perpetually refresh the entries so they never satisfy the 24 h TTL, | ||
| # permanently poisoning the "legacy" bucket. Using the file mtime lets | ||
| # migrated entries expire ~24 h after the marker was last written. |
Comment on lines
+1073
to
+1082
| test( | ||
| "dedup-prune: stale hook-dedup marker removed by sweep", | ||
| not _stale_marker.exists(), | ||
| f"stale marker still present; markers={list(_ddp_markers.glob('hook-dedup-*'))}", | ||
| ) | ||
| test( | ||
| "dedup-prune: fresh marker for current call survives sweep", | ||
| any(f.name.startswith("hook-dedup-preToolUse") for f in _ddp_markers.iterdir() if f.is_file()), | ||
| f"markers={list(_ddp_markers.glob('hook-dedup-*'))}", | ||
| ) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What & why
Three related robustness bugs in the
hooks/subsystem, all surfaced from a single investigation of "hook lỗi" on a live install. All are marker-accumulation / defensive-parsing issues.hooks/hook_runner.pyhook-dedup-*markers never pruned → 134,426 leaked files slowing every hook_prune_stale_dedup_markers()(5 s TTL, ≤1 sweep/60 s, fail-open)hooks/rules/tentacle.py_read_edits()re-stamped legacy entries withnow()each read → 24 h TTL never fires → 2032 stale paths caused permanent false-positiveTENTACLE REQUIREDmtimehooks/rules/error_kb.pyErrorFixNudgeRulecrashed'str' object has no attribute 'get'on stringtoolArgs(367 logged)toolArgs: parse JSON string, else coerce non-dict →{}Closes #963. Closes #964. Closes #965.
Verification evidence (commands run)
AST-parsed all three modified source files (exit 0).
Scope / follow-up
hook_runner.py+ non-binary installs). Native-Rust hook parity for bugs 1 and 3 should be verified separately (rubber-duck flagged the native tentacle/legacy reader may share bug 2's pattern) — tracked as follow-up, not claimed fixed here.Minimum footprint
3 source files (+45/+13/+13 lines), 3 test files; no new files, no new dependencies (pure stdlib).