feat(semantic-cache): per-entry hit analytics and cold-entry diagnostics #215
amitkojha05 wants to merge 1 commit into
      await pipeline.exec();
    } catch {
      // best-effort: usage tracking must never fail a cache hit
    }
TTL refresh errors now silently swallowed on hits
Medium Severity
The recordEntryUsage and recordEntryUsageBatch methods wrap the expire call (TTL refresh) in a try/catch that swallows all errors. Previously, the expire call in check() and checkBatch() was standalone and would propagate failures. By bundling TTL refresh with the new usage-tracking pipeline under a blanket catch, a transient Redis error now silently skips the TTL refresh, potentially causing active entries to expire prematurely. The "best-effort" comment is appropriate for hit_count/last_accessed_at tracking, but the pre-existing TTL refresh is a correctness concern that lost its error signal.
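One way to keep the single round trip while restoring the TTL error signal is to inspect the per-command replies instead of discarding them. The sketch below is illustrative only: it assumes `pipeline.exec()` resolves to one `[error, reply]` tuple per queued command (the shape ioredis-style clients use), and `classifyHitReplies`, `HitOutcome`, and `expireIndex` are hypothetical names that do not appear in the PR.

```typescript
// Assumption: exec() resolves to one [error, reply] tuple per queued command.
// All names in this block are hypothetical.
type PipelineReply = [Error | null, unknown];

interface HitOutcome {
  ttlError: Error | null; // failure of the EXPIRE (TTL refresh) command, if any
  usageErrors: Error[];   // failures of the hit_count / last_accessed_at writes
}

// expireIndex is the position of EXPIRE in the pipeline, or -1 when no TTL
// refresh was queued.
function classifyHitReplies(
  replies: PipelineReply[] | null,
  expireIndex: number,
): HitOutcome {
  const outcome: HitOutcome = { ttlError: null, usageErrors: [] };
  if (replies === null) {
    // No replies at all: the TTL refresh cannot be assumed to have run.
    if (expireIndex >= 0) {
      outcome.ttlError = new Error('pipeline returned no replies');
    }
    return outcome;
  }
  replies.forEach(([err], index) => {
    if (err === null) return;
    if (index === expireIndex) outcome.ttlError = err;
    else outcome.usageErrors.push(err);
  });
  return outcome;
}
```

A caller could then rethrow or log `ttlError` and ignore `usageErrors`, which keeps usage tracking best-effort. A rejection of `exec()` itself (for example a dropped connection) would still land in the surrounding catch and would need the same treatment there.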
Reviewed by Cursor Bugbot for commit a0de39f.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 79bdc43.
    const [totalEntries, neverHitCount, coldEntryCount] = await Promise.all([
      countOf('*'),
      countOf('@hit_count:[0 0]'),
      countOf(`@last_accessed_at:[0 ${coldCutoff}]`),
Cold entry count boundary differs between search and scan
Low Severity
The fast path (FT.SEARCH) and slow path (SCAN) use different comparison semantics for the coldCutoff boundary. The search query @last_accessed_at:[0 ${coldCutoff}] uses an inclusive upper bound (<=), while the scan path filter e.lastAccessedAt < coldCutoff uses a strict comparison (<). An entry whose lastAccessedAt equals coldCutoff exactly would be counted as cold by the search path but not by the scan path. RediSearch supports exclusive bounds via [( syntax, so matching the scan path's < semantics is straightforward.
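As a minimal sketch of the alignment suggested here, the search query can put `(` in front of the upper bound so that both paths treat an entry at exactly `coldCutoff` as not cold. The helper names `coldEntryQuery` and `isColdEntry` are hypothetical; only the field name and the query shape come from the PR.

```typescript
// Hypothetical helpers; only the field name and query shape come from the PR.

// Search path: "(" before the upper bound makes it exclusive,
// i.e. last_accessed_at < coldCutoff.
function coldEntryQuery(coldCutoff: number): string {
  return `@last_accessed_at:[0 (${coldCutoff}]`;
}

// Scan path: the same strict comparison, applied to a materialized entry.
function isColdEntry(lastAccessedAt: number, coldCutoff: number): boolean {
  return lastAccessedAt < coldCutoff;
}
```

The lower bound stays inclusive, so never-hit entries that still carry `last_accessed_at` of 0 are counted as cold on both paths.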
Reviewed by Cursor Bugbot for commit 79bdc43.


What this PR does
Monitor can already answer "what is this cache's hit rate?" and "where do similarity scores cluster?" It cannot answer "of 40,000 stored prompts, how many have ever been returned as a hit — and which ones are dead weight?"
Never-hit entries sit silently in the HNSW index, consuming memory and slowing every FT.SEARCH. Without per-entry signals there is no data-driven TTL sizing, no cold-entry dashboard, and no usage-based invalidation.
This PR moves observability from cache-level → entry-level, in the same spirit as the discovery-marker work (v0.3.0), which moved it from nothing → cache-level.
Changes
Per-entry state (store / storeMultipart)
Every new entry gets two additional hash fields: hit_count: '0' and last_accessed_at: '0'. Pre-existing entries automatically start tracking on their first post-upgrade hit via HINCRBY auto-creation; no migration is required.
Hit path (check / checkBatch)
On a genuine returned hit only (not judge-rejected, not stale-evicted), a pipeline atomically increments hit_count and sets last_accessed_at. This is batched with the existing TTL EXPIRE into a single round trip, so there is no extra latency on the hot path. Best-effort: a failure never breaks check().
FT index schema
hit_count NUMERIC SORTABLE and last_accessed_at NUMERIC SORTABLE are added to FT.CREATE. Existing indexes keep working: _hasUsageFields is detected via FT.INFO on initialize(), mirroring the _hasBinaryRefs pattern. Run flush() + initialize() to rebuild the index and enable the fast analytics path.
entryAnalytics(options?): new public method
Returns totalEntries, neverHitCount, hitAtLeastOnceCount, coldEntryCount, topEntries (sorted by hitCount desc, capped at topN), and coldAfterDays.
Two collection paths:
Fast path (_hasUsageFields = true): three parallel FT.SEARCH … LIMIT 0 0 count queries (no document materialization, exact on any index size) plus one SORTBY hit_count DESC LIMIT 0 topN query for the hot list. Counts and top entries are independent queries; counts are never derived from a sampled hot list.
Slow path (_hasUsageFields = false, legacy index): clusterScan + HGETALL, capped at 10,000 entries. hgetall calls stop at the cap; a limitReached flag prevents even callback overhead beyond the limit. totalEntries reflects the sample size, as documented in JSDoc and README.
Discovery capability
'entry_analytics' is added to the capability array. Monitor can gate new endpoints/tools on this the same way it gates threshold_adjust.
New exported types
EntryAnalyticsOptions, EntryAnalyticsResult, and EntrySummary are all exported from the package root.
Testing
Validated with: pnpm --filter @betterdb/semantic-cache test
Integration tests pass using a Redis Stack / RediSearch setup.
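The slow-path aggregation described under entryAnalytics can be sketched as a pure function over already-fetched hashes. This is a hedged illustration, not the PR's implementation: the result field names follow the description above, while the `EntrySummary` shape, the millisecond unit for `last_accessed_at`, and the helpers `toSummary` and `summarizeEntries` are assumptions.

```typescript
// Result field names follow the PR description. The EntrySummary shape, the
// millisecond timestamp unit, and the helper names are assumptions.
interface EntrySummary {
  key: string;
  hitCount: number;
  lastAccessedAt: number;
}

interface EntryAnalyticsResult {
  totalEntries: number;
  neverHitCount: number;
  hitAtLeastOnceCount: number;
  coldEntryCount: number;
  topEntries: EntrySummary[];
  coldAfterDays: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Pre-upgrade entries carry neither field, so both default to 0 (never hit).
function toSummary(key: string, hash: Record<string, string>): EntrySummary {
  return {
    key,
    hitCount: Number(hash['hit_count'] ?? '0') || 0,
    lastAccessedAt: Number(hash['last_accessed_at'] ?? '0') || 0,
  };
}

function summarizeEntries(
  entries: EntrySummary[],
  now: number,
  coldAfterDays: number,
  topN: number,
): EntryAnalyticsResult {
  const coldCutoff = now - coldAfterDays * DAY_MS;
  const neverHitCount = entries.filter((e) => e.hitCount === 0).length;
  return {
    totalEntries: entries.length, // sample size when the scan cap was reached
    neverHitCount,
    hitAtLeastOnceCount: entries.length - neverHitCount,
    coldEntryCount: entries.filter((e) => e.lastAccessedAt < coldCutoff).length,
    // Copy before sorting so the caller's array is left untouched.
    topEntries: [...entries].sort((a, b) => b.hitCount - a.hitCount).slice(0, topN),
    coldAfterDays,
  };
}
```

Keeping the aggregation separate from the scan makes the boundary behaviour (strict `<` against the cutoff) easy to unit-test without a Redis instance.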
Note
Medium Risk
Adds write-side behavior on every cache hit (pipelined HINCRBY/HSET and optional EXPIRE) and extends the FT index schema, which can affect performance and error handling on the hot path and requires an index rebuild to enable fast analytics.
Overview
Adds per-entry usage tracking by storing hit_count and last_accessed_at on each entry, incrementing/updating them on cache hits (and batching with TTL refresh); hit-side updates are now best-effort and occur even when defaultTtl is unset.
Introduces cache.entryAnalytics() plus exported types (EntryAnalyticsOptions, EntryAnalyticsResult, EntrySummary) to report total/never-hit/cold entry counts and top hot entries, using FT.SEARCH LIMIT 0 0 / SORTBY hit_count when the index has the new sortable fields and falling back to a capped SCAN + pipelined HMGET sampling path for legacy indexes.
Updates discovery markers to advertise a new entry_analytics capability, extends the FT.CREATE schema with hit_count/last_accessed_at, and adds tests covering both the search-based and scan-based analytics paths.
Reviewed by Cursor Bugbot for commit 79bdc43. Bugbot is set up for automated code reviews on this repo.
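For readers unfamiliar with the search-based path summarized above, the commands can be sketched as plain argument arrays. This is an illustration under stated assumptions: the index name is a placeholder, the `*` match-all query for the hot list is a guess, the helper names are hypothetical, and `parseCount` assumes a RESP2-style reply whose first element is the total match count.

```typescript
// Hypothetical helpers. The index name passed in is a placeholder and the
// reply shape assumed by parseCount is RESP2 (first element = total matches).

// LIMIT 0 0 asks for the total only, so no documents are returned.
function countCommand(index: string, query: string): string[] {
  return ['FT.SEARCH', index, query, 'LIMIT', '0', '0'];
}

// Hot list: sort by the sortable hit_count field and keep the first topN.
// Using '*' as the query here is an assumption.
function topEntriesCommand(index: string, topN: number): string[] {
  return [
    'FT.SEARCH', index, '*',
    'SORTBY', 'hit_count', 'DESC',
    'LIMIT', '0', String(topN),
  ];
}

function parseCount(reply: unknown): number {
  if (Array.isArray(reply) && typeof reply[0] === 'number') {
    return reply[0];
  }
  throw new Error('unexpected FT.SEARCH reply shape');
}
```

Because the counts come from their own queries, they stay exact regardless of how small `topN` is.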