Add regex mode to SearchMemory tool #343

Merged
rockfordlhotka merged 1 commit into main from issue-339/add-grep-regex-mode-to-searchmemory on May 6, 2026

@rockfordlhotka
Member

Summary

  • Adds a mode parameter to the SearchMemory tool: hybrid (the default: BM25 + optional vector, unchanged) or regex (grep-style pattern matching, for when the model knows a literal token to search for).
  • Regex backend matches against each entry's logical memory path name ({category}/{id}) plus the BM25 document text (content + tags + category words). The on-disk file path from FileMemoryStore.GetFilePath is deliberately never in the match surface — the model operates in id-space.
  • Two-layer cost ceiling: per-entry Regex.MatchTimeout of 1s catches catastrophic backtracking on a single input; an overall 10s wall-clock budget across the scan stops a slow-but-not-pathological pattern from dominating as the corpus grows.
  • New MemorySearchException carries human-readable backend errors (invalid pattern, per-entry timeout, overall budget) up to the tool layer, which surfaces the message verbatim so the model can refine its query.
  • Hybrid path, vector cache, and _embeddingCache are skipped entirely in regex mode, so regex search behaves identically whether the deployment runs hybrid or BM25-only.
  • Updated common-directives.md (loaded by both primary and sub agents) with one short paragraph teaching when to reach for mode='regex'.
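
The two-layer cost ceiling described above can be sketched roughly as follows. This is an illustrative sketch, not the code merged in this PR: the `RegexScanSketch` class, the tuple shape of an entry, the newline used to join the path and the document text, and the exception constructor and messages are all assumptions. Only `Regex`, its `matchTimeout` constructor argument, `RegexMatchTimeoutException`, and `Stopwatch` are real .NET APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the PR's MemorySearchException; the real type's shape is not shown here.
public sealed class MemorySearchException : Exception
{
    public MemorySearchException(string message, Exception? inner = null) : base(message, inner) { }
}

public static class RegexScanSketch
{
    // Scans entries and returns the logical paths ({category}/{id}) of those that match.
    // Layer 1: perEntryTimeout bounds each individual match attempt.
    // Layer 2: overallBudget bounds the wall-clock time of the whole scan.
    public static List<string> Scan(
        IEnumerable<(string Category, string Id, string Text)> entries,
        string pattern,
        TimeSpan perEntryTimeout,
        TimeSpan overallBudget)
    {
        Regex regex;
        try
        {
            // Case-insensitive by default, as the test plan describes.
            regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant, perEntryTimeout);
        }
        catch (ArgumentException ex)
        {
            // Invalid patterns surface as ArgumentException (or a subclass of it).
            throw new MemorySearchException($"Invalid regex pattern: {ex.Message}", ex);
        }

        var stopwatch = Stopwatch.StartNew();
        var hits = new List<string>();

        foreach (var (category, id, text) in entries)
        {
            if (stopwatch.Elapsed > overallBudget)
                throw new MemorySearchException(
                    "Regex search exceeded the overall time budget; try a narrower pattern or a category filter.");

            // Match surface: logical path plus document text, never an on-disk file path.
            // Joining with a newline is an assumption made for this sketch.
            var path = $"{category}/{id}";
            var surface = path + "\n" + text;

            try
            {
                if (regex.IsMatch(surface))
                    hits.Add(path);
            }
            catch (RegexMatchTimeoutException ex)
            {
                throw new MemorySearchException(
                    $"Regex timed out on a single entry after {perEntryTimeout.TotalSeconds:0.##}s; simplify the pattern.", ex);
            }
        }

        return hits;
    }
}
```

With the values from this PR, a caller would pass `TimeSpan.FromSeconds(1)` and `TimeSpan.FromSeconds(10)`. The per-entry timeout alone does not bound total cost, since a pattern that takes just under one second on each of many entries would never trip it; that gap is what the overall budget covers.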

Test plan

  • dotnet build RockBot.slnx — clean
  • dotnet test RockBot.slnx — full unit suite green (615 Host tests, 130 Agent tests, all others pass)
  • Targeted Search_RegexMode_* tests cover: literal-token match, id-in-path match, category-in-path match, on-disk-path-must-not-leak, default case-insensitive, case-sensitive flag, category pre-filter, MaxResults cap, invalid pattern → exception, no-match → empty, importance/last-seen ordering, null-query fallback, catastrophic-backtracking timeout, overall-budget overrun
  • MemoryToolsTests cover: default mode is hybrid, regex / Regex parsed, unknown mode returns error string and skips the store, MemorySearchException flows through verbatim
  • Optional manual verification in a live cluster (out of scope — unit coverage is sufficient per issue)
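
The mode-parsing behavior exercised by MemoryToolsTests might look something like the sketch below. The `SearchMode` enum, the `SearchModeSketch.TryParse` helper, and the error wording are hypothetical names introduced for illustration; case-insensitive parsing is inferred from the "regex / Regex parsed" test case rather than confirmed by the PR text.

```csharp
using System;

// Hypothetical enum; the PR does not show how the mode is represented internally.
public enum SearchMode { Hybrid, Regex }

public static class SearchModeSketch
{
    // Returns true and sets mode on success. On failure, returns false and sets an
    // error string the tool can hand back to the model without touching the store.
    public static bool TryParse(string? raw, out SearchMode mode, out string? error)
    {
        error = null;
        mode = SearchMode.Hybrid; // default when the parameter is omitted or blank

        if (string.IsNullOrWhiteSpace(raw))
            return true;

        var value = raw.Trim();
        if (value.Equals("hybrid", StringComparison.OrdinalIgnoreCase))
            return true;

        if (value.Equals("regex", StringComparison.OrdinalIgnoreCase))
        {
            mode = SearchMode.Regex;
            return true;
        }

        error = $"Unknown search mode '{raw}'. Use 'hybrid' or 'regex'.";
        return false;
    }
}
```

Validating the mode before calling the store is what lets an unknown mode return an error string while skipping the search entirely.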

Closes #339

🤖 Generated with Claude Code

Extends the existing SearchMemory tool with a `mode` parameter that
switches the search backend between hybrid (BM25 + optional vector,
default) and regex/grep-style literal pattern matching. The model picks
the mode based on whether it knows a literal token to search for.

The regex backend matches against the memory's logical path name
(`{category}/{id}`) plus the BM25 document text (content + tags +
category words), never the on-disk file path. Bounded by a 1s per-entry
Regex.MatchTimeout (catches catastrophic backtracking) plus a 10s
overall wall-clock budget across the scan (bounds total cost as the
corpus grows).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rockfordlhotka rockfordlhotka merged commit 3ef5fba into main May 6, 2026
2 checks passed
@rockfordlhotka rockfordlhotka deleted the issue-339/add-grep-regex-mode-to-searchmemory branch May 6, 2026 07:16
Development

Successfully merging this pull request may close these issues.

Add grep/regex mode to SearchMemory as complement to hybrid search
