feat(core): add QMD-inspired hybrid search pipeline #40
Implement a hybrid search system combining vector embeddings with keyword search:

- Vector embeddings using node-llama-cpp with GGUF models (nomic-embed-text)
- RRF (Reciprocal Rank Fusion) for combining multiple rankers
- Query expansion with LLM and synonym fallback
- Position-aware score blending (different weights for top 3, top 10, and the rest)
- SQLite storage with the sqlite-vec extension (falls back to in-memory)
- Auto-download of local GGUF models (~2 GB)

New search module:

- embeddings.ts: vector embedding generation service
- vector-store.ts: SQLite-based vector storage
- rrf.ts: RRF fusion algorithm
- expansion.ts: query expansion with caching
- hybrid.ts: main hybrid search pipeline
- local-models.ts: GGUF model management

CLI integration:

- skillkit recommend --hybrid: enable hybrid search
- skillkit recommend --expand: enable query expansion
- skillkit recommend --rerank: enable LLM re-ranking
- skillkit recommend --build-index: build the embedding index

Tests: 56 new tests for the search module (all passing)
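The description names RRF (Reciprocal Rank Fusion) as the fusion step. As a reference point, standard RRF scores each document as the sum over rankers of 1 / (k + rank), with k conventionally 60. The sketch below is illustrative only; `rrfFuse` and `Ranking` are not the PR's actual API (that lives in rrf.ts).

```typescript
// Illustrative RRF: each ranker contributes 1 / (k + rank) per document;
// k damps the dominance of top ranks so agreement across rankers wins.
type Ranking = string[]; // document ids, best first

function rrfFuse(rankings: Ranking[], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // RRF ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A document ranked highly by both rankers beats one ranked first by only one, which is why RRF is a robust default for mixing vector and keyword result lists.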
📝 Walkthrough

Implements a hybrid search subsystem: local model management, an embedding service, a SQLite-backed vector store (optional sqlite-vec), query expansion (LLM-backed with fallback), RRF fusion and reranking, a HybridSearchPipeline plus helpers, integration into RecommendationEngine and the CLI, and comprehensive tests for the new search components.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as CLI
    participant Engine as RecommendationEngine
    participant Pipeline as HybridSearchPipeline
    participant Embed as EmbeddingService
    participant VecStore as VectorStore
    participant Expand as QueryExpander
    participant RRF as RRF
    CLI->>Engine: recommend --hybrid query
    Engine->>Pipeline: ensure initialized
    Pipeline->>Embed: initialize()
    Pipeline->>VecStore: initialize()
    Pipeline->>Expand: initialize() (if expand enabled)
    Pipeline->>Expand: expand(query) (optional)
    Expand-->>Pipeline: ExpandedQuery
    Pipeline->>Embed: embed(query)
    Embed-->>Pipeline: queryVector
    Pipeline->>VecStore: search(queryVector)
    VecStore-->>Pipeline: vectorResults
    Pipeline->>Pipeline: keywordSearch(query) → keywordResults
    Pipeline->>RRF: fuseWithRRF(vectorResults, keywordResults)
    RRF-->>Pipeline: fusedResults
    Pipeline->>Pipeline: rerank(topResults) (optional)
    Pipeline-->>Engine: HybridSearchResponse
    Engine-->>CLI: print results
```
```mermaid
sequenceDiagram
    participant CLI as CLI
    participant Engine as RecommendationEngine
    participant Pipeline as HybridSearchPipeline
    participant Embed as EmbeddingService
    participant VecStore as VectorStore
    CLI->>Engine: recommend --build-index
    Engine->>Pipeline: ensure initialized
    Pipeline->>Embed: initialize(onProgress)
    Embed-->>Pipeline: ready
    Pipeline->>Engine: load current skill index
    Pipeline->>Embed: embedSkills(allSkills, onProgress)
    Embed-->>Pipeline: SkillEmbeddings
    Pipeline->>VecStore: initialize()
    VecStore-->>Pipeline: ready
    Pipeline->>VecStore: storeBatch(embeddings, onProgress)
    VecStore-->>Pipeline: persisted
    Pipeline-->>Engine: build complete
    Engine-->>CLI: report output location
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (warning)
Reviewed snippet (VectorStore.store(), chunk persistence):

```typescript
if (embedding.chunks) {
  db.prepare(`
    DELETE FROM ${this.config.tableName}_chunks WHERE skill_name = ?
  `).run(embedding.skillName);

  for (const chunk of embedding.chunks) {
    const result = db.prepare(`
      INSERT INTO ${this.config.tableName}_chunks
        (skill_name, content, start_line, end_line)
      VALUES (?, ?, ?, ?)
    `).run(embedding.skillName, chunk.content, chunk.startLine, chunk.endLine);

    const rowId = result.lastInsertRowid ?? 0;
    db.prepare(`
      INSERT INTO ${this.config.tableName}_chunks_vec
        (chunk_id, embedding)
      VALUES (?, ?)
    `).run(rowId, Buffer.from(chunk.vector.buffer));
  }
}
```
🔴 Orphaned chunk vector records when updating/deleting skills in sqlite-vec mode
When using sqlite-vec mode, chunk vector embeddings in the _chunks_vec table are never cleaned up during store, delete, or clear operations, leading to orphaned records and potential database bloat.
Issue Details
The VectorStore class creates a _chunks_vec virtual table (line 109) to store chunk embeddings when using sqlite-vec. However:

- In store() (lines 223-241), when re-storing a skill with chunks, only the _chunks table is deleted before re-inserting:

  db.prepare(`
    DELETE FROM ${this.config.tableName}_chunks WHERE skill_name = ?
  `).run(embedding.skillName);

  The _chunks_vec table entries are not deleted, causing orphaned vector records.
- In delete() (lines 420-427), the _chunks_vec table is never cleaned up.
- In clear() (lines 438-445), the _chunks_vec table is never cleaned up.
Impact
- Database grows unbounded with orphaned chunk vector records
- Potential incorrect search results if orphaned vectors are matched
- Storage waste over time as skills are updated/deleted
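A minimal sketch of the full per-skill cleanup this finding calls for. It only builds the DELETE statements, so no SQLite driver is assumed; the table suffixes follow the `${tableName}_*` convention above, and the column names (skill_name, chunk_id) plus the assumption that _vec is keyed by skill_name are inferred from the review text, not confirmed against the PR.

```typescript
// Builds the ordered cleanup statements for one skill. Ordering matters:
// chunk vectors are keyed by chunk rowid, so they must be removed while
// the _chunks rows (and their rowids) still exist.
function deleteStatements(tableName: string, usingSqliteVec: boolean): string[] {
  const stmts: string[] = [];
  if (usingSqliteVec) {
    stmts.push(
      `DELETE FROM ${tableName}_chunks_vec WHERE chunk_id IN ` +
        `(SELECT rowid FROM ${tableName}_chunks WHERE skill_name = ?)`
    );
    stmts.push(`DELETE FROM ${tableName}_vec WHERE skill_name = ?`);
  }
  stmts.push(`DELETE FROM ${tableName}_chunks WHERE skill_name = ?`);
  stmts.push(`DELETE FROM ${tableName}_meta WHERE skill_name = ?`);
  return stmts;
}
```

The same statement list works for store() (before re-insert), delete(), and, with the WHERE clauses dropped, clear().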
Actionable comments posted: 15
🤖 Fix all issues with AI agents
In `@packages/core/package.json`:
- Around line 60-67: Tighten the node-llama-cpp dependency and align fallback
behavior: change the node-llama-cpp semver in package.json to restrict upgrades
(e.g., ^3.0.0 <3.7.0) unless you intend to raise the engines field to ">=20.0.0"
and document that breaking change; then update expansion.ts and embeddings.ts to
mirror the safe import/fallback pattern used in vector-store.ts (i.e., attempt
to require/import node-llama-cpp inside a try/catch or guarded lazy loader, set
a null/undefined client when unavailable, and throw only when actually calling
functionality that requires it) so missing node-llama-cpp degrades gracefully
like better-sqlite3 and sqlite-vec.
In `@packages/core/src/search/__tests__/embeddings.test.ts`:
- Around line 1-11: The test file uses beforeEach in the EmbeddingService tests
but does not import it from Vitest; update the import statement at the top of
packages/core/src/search/__tests__/embeddings.test.ts to include beforeEach
alongside describe, it, and expect so the test suite's beforeEach hook for
creating a new EmbeddingService instance works without a ReferenceError.
In `@packages/core/src/search/__tests__/expansion.test.ts`:
- Around line 76-82: The test currently asserts result1.original !==
result2.original but should assert they are the same to confirm case-insensitive
caching; update the test in expansion.test.ts to call expander.expand('Auth')
then expander.expand('AUTH') and assert the second call returns the same cached
ExpandedQuery (same object or same .original value) as the first—this verifies
getFromCache (which uses query.toLowerCase()) correctly returns the cached entry
from expansion.ts.
In `@packages/core/src/search/embeddings.ts`:
- Around line 227-229: The calculation of startLine can produce a negative value
when overlapLines.length > i; update the assignment in the overlap handling
(where currentChunk, currentChars and startLine are set) to clamp startLine to a
minimum of 0 (e.g., startLine = Math.max(0, i - overlapLines.length)) and ensure
any downstream logic that uses startLine (references to currentChunk,
currentChars, overlapLines, overlapCharsCount, and i) still behaves correctly
with the clamped value.
In `@packages/core/src/search/expansion.ts`:
- Around line 171-178: The weights array can be longer than the returned
variations because weights is built from the full variations list before
slicing; update the logic in the function that builds the ExpandedQuery so you
first compute a slicedVariations = variations.slice(0, 3) (or reuse a similar
variable) and then construct weights as [2.0, ...slicedVariations.map(() =>
1.0)], and return original: query, variations: slicedVariations, weights so the
weights length always matches the returned variations.
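The slicing fix described above can be sketched as a small pure function. Names are illustrative; the real code builds a fuller ExpandedQuery, but the invariant is the same: weights has exactly one entry (the original's 2.0) plus one 1.0 per returned variation.

```typescript
// Slice first, then derive weights from the sliced list so the two
// arrays can never drift apart in length.
function buildExpandedQuery(query: string, variations: string[]) {
  const sliced = variations.slice(0, 3);
  return {
    original: query,
    variations: sliced,
    weights: [2.0, ...sliced.map(() => 1.0)],
  };
}
```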
- Around line 159-169: The loop that builds variations uses new RegExp(term,
'i') with user-controlled term (variables: term, terms, query, synonyms,
variations), creating a ReDoS risk; fix by escaping regex metacharacters in term
before constructing the RegExp (implement/inline an escapeRegExp function that
replaces characters like . * + ? ^ $ { } ( ) | [ ] \ / with backslash-escaped
versions) and then use new RegExp(escapedTerm, 'i') when creating the variation;
ensure the rest of the logic (slice(0,2), variation !== query,
!variations.includes) remains unchanged.
In `@packages/core/src/search/hybrid.ts`:
- Around line 270-272: The relevance field in the object pushed to hybridResults
can exceed 100 because hybridScore may be >1.0; clamp the computed relevance to
the 0–100 range before pushing. Locate the hybridResults.push call that uses
hybridScore (the object with keys skill and relevance) and replace the raw
Math.round(hybridScore * 100) with a clamped value (e.g., compute const
relevance = Math.min(100, Math.max(0, Math.round(hybridScore * 100))) and use
that) so relevance never falls outside 0–100.
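The clamp described above, as a runnable helper (the function name is illustrative; the PR computes this inline):

```typescript
// hybridScore can exceed 1.0 after blending, so the 0-100 relevance
// must be bounded on both ends, not just rounded.
function toRelevance(hybridScore: number): number {
  return Math.min(100, Math.max(0, Math.round(hybridScore * 100)));
}
```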
In `@packages/core/src/search/rrf.ts`:
- Around line 153-163: The code mutates input arrays by pushing into
existing.ranks (from mergedScores) which may reference original ranking.ranks;
update the merge logic to never mutate caller data: when creating a new
mergedScores entry use ranks: [...ranking.ranks] and when appending to an
existing entry replace its ranks with a new array (e.g., existing.ranks =
[...existing.ranks, ...ranking.ranks]) instead of
existing.ranks.push(...ranking.ranks); update the block handling
secondaryRankings / mergedScores to always copy ranking.ranks into new arrays.
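A sketch of the copy-on-merge pattern this prompt describes. The MergedEntry shape is a guess from the identifiers quoted above (score, ranks), not the module's actual types.

```typescript
interface MergedEntry { score: number; ranks: number[] }

// Merge a ranking contribution without ever mutating caller-owned arrays.
function mergeRanks(
  mergedScores: Map<string, MergedEntry>,
  id: string,
  score: number,
  ranks: number[]
): void {
  const existing = mergedScores.get(id);
  if (existing) {
    mergedScores.set(id, {
      score: existing.score + score,
      // Fresh array: never push into one that may alias ranking.ranks.
      ranks: [...existing.ranks, ...ranks],
    });
  } else {
    mergedScores.set(id, { score, ranks: [...ranks] });
  }
}
```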
In `@packages/core/src/search/types.ts`:
- Around line 129-145: The gemma-2b-it entry in MODEL_REGISTRY points to a gated
Hugging Face URL causing 401s; update the registry and download flow to handle
gated models by (1) replacing 'gemma-2b-it-Q4_K_M.gguf' URL with an accessible
community GGUF mirror or mark the entry as gated (e.g., add a gated: true flag)
in MODEL_REGISTRY, and (2) enhance the downloader in local-models.ts to support
authenticated downloads (read HF token from env/credentials and send
Authorization header), add simple retry logic for transient HTTP errors, and
implement a fallback sequence that tries the primary URL then the community
mirror when gated or unauthorized. Reference MODEL_REGISTRY
(llm['gemma-2b-it-Q4_K_M.gguf']) and the downloader in local-models.ts to locate
where to apply these changes.
In `@packages/core/src/search/vector-store.ts`:
- Around line 145-191: The empty catch in loadEmbeddingsFromDb silently swallows
DB errors; update the catch to surface or log the error (e.g., via
this.logger.error or rethrow) so failures when reading from this.db (used with
usingSqliteVec and the prepared queries on this.config.tableName/_meta) are
visible; ensure the error message includes context (function name, tableName,
skill_name or row context) and preserve existing behavior for successful rows
while not masking exceptions thrown during the db.prepare/.all or vector
conversion.
- Around line 430-446: The clear() method fails to remove entries from the
chunks vector table in sqlite-vec mode; update the logic in clear() (the method
on the vector store class using this.usingSqliteVec and this.config.tableName)
so that when this.usingSqliteVec is true you also execute a DELETE on
`${this.config.tableName}_chunks_vec` (in addition to `_meta`, `_vec`, and
`_chunks`), ensuring all sqlite-vec related tables are cleared; keep the
existing non-sqlite branch unchanged.
- Around line 410-428: The delete method (async delete(skillName: string)) fails
to remove chunk vector rows when this.usingSqliteVec is true, leaving orphaned
entries in the _chunks_vec table; update the sqlite-vec branch inside delete to
also run a DELETE against `${this.config.tableName}_chunks_vec` (similar to the
existing deletes for `_meta` and `_vec`), ensuring you call
db.prepare(...).run(skillName) for that table when this.usingSqliteVec is true
so all related tables (`_meta`, `_vec`, `_chunks`, and `_chunks_vec`) are
cleaned up.
- Around line 75-143: The SQL injection risk comes from interpolating
this.config.tableName into SQL in initializeSqliteVecTables and
initializeFallbackTables; before using it, validate or sanitize the table name
(e.g., enforce an allowlist or a strict identifier regex like
/^[A-Za-z_][A-Za-z0-9_]*$/) and throw an error if it doesn't match, so only safe
SQL identifiers are used; alternatively restrict tableName to configured
internal values or map user input to predetermined safe names rather than
directly interpolating user-provided strings.
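The identifier allowlist suggested above is only a few lines; the regex matches the one proposed in the prompt.

```typescript
// Only plain SQL identifiers pass; anything else (quotes, semicolons,
// spaces) is rejected before it can reach string interpolation.
const SQL_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function assertSafeTableName(tableName: string): string {
  if (!SQL_IDENTIFIER.test(tableName)) {
    throw new Error(`Unsafe SQL identifier: ${JSON.stringify(tableName)}`);
  }
  return tableName;
}
```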
- Around line 164-171: The loop that populates this.embeddings currently sets
each entry's vector to new Float32Array(this.config.dimensions) (all zeros)
instead of reading the actual vector bytes from the sqlite-vec storage; update
the loader in vector-store.ts (the block iterating over rows and setting
this.embeddings) to fetch the corresponding vector data from the _vec table (or
the query result that contains the vector column), decode/convert it into a
Float32Array of length this.config.dimensions, and assign that to the vector
field for each entry (preserve skillName, textContent, generatedAt); ensure you
reference the same key (row.skill_name) when joining or looking up the _vec data
so in-memory searches use the real vectors rather than zeroed arrays.
- Around line 337-341: The similarity calculation currently uses "similarity: 1
- row.distance" (in the rows.map mapping) but the vec0 table was created without
a distance_metric so it defaults to L2 (unbounded), causing negative
similarities; either change the vec0 table creation to use
distance_metric='cosine' so row.distance is a cosine distance in [0,1], or keep
the L2 metric and replace the mapping logic with a proper normalization (e.g.,
convert cosine similarity from dot product or clamp/scale distances into [0,1])
so the result of the rows.map (where similarity is computed) always yields a
0..1 similarity; update the vec0 table creation code or the rows.map similarity
expression accordingly.
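If the table keeps the default L2 metric, a bounded similarity can still be derived when embeddings are unit-normalized, since for unit vectors ||a - b||² = 2(1 - cos(a, b)). This sketch assumes normalized vectors; otherwise switching the table to distance_metric='cosine' as the prompt suggests is the simpler fix.

```typescript
// Converts an L2 distance between unit vectors into a cosine similarity,
// clamped to [0, 1] so downstream scoring never sees negatives.
function l2ToCosineSimilarity(l2Distance: number): number {
  const cos = 1 - (l2Distance * l2Distance) / 2;
  return Math.min(1, Math.max(0, cos));
}
```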
🧹 Nitpick comments (19)
packages/core/src/recommend/types.ts (1)
330-349: Consider aliasing to the search module's hybrid types to avoid drift. If these mirror HybridSearchOptions/Result in packages/core/src/search/types.ts, a type alias (or an explicit doc note) will prevent divergence over time.

packages/core/src/search/__tests__/embeddings.test.ts (1)

73-216: Avoid re-implementing production logic inside tests. buildSkillText and chunkText are recreated locally, so these tests won't catch regressions in EmbeddingService. Consider asserting via the public API (e.g., methods that invoke these helpers) or exporting the helpers for test use.

packages/core/src/search/__tests__/expansion.test.ts (1)

85-92: The test for clearCache does not verify the cache was actually cleared. The test only checks that expand still works after clearing. Consider verifying the cache was cleared by checking that a subsequent call produces a fresh result (e.g., by comparing object identity or confirming the expansion logic ran again).

💡 Suggested improvement

```diff
 describe('clearCache', () => {
   it('should clear the expansion cache', async () => {
-    await expander.expand('auth');
+    const result1 = await expander.expand('auth');
     expander.clearCache();
     const result = await expander.expand('auth');
     expect(result).toBeDefined();
+    // After clearing, we should get a new object, not the cached one
+    expect(result).not.toBe(result1);
   });
 });
```

packages/cli/src/commands/recommend.ts (1)

571-604: Hardcoded storage path may not match the actual implementation. Line 595 displays ~/.skillkit/search.db as the index location, but this should ideally be retrieved from the actual configuration or LocalModelManager to ensure consistency.

packages/core/src/search/embeddings.ts (1)

79-85: Sequential embedding in embedBatch may be slow for large batches. The current implementation processes texts one at a time. If node-llama-cpp supports batch embedding, consider leveraging it for better performance. Otherwise, this is acceptable as a baseline implementation.

packages/core/src/recommend/engine.ts (1)

665-677: Post-filtering may return fewer results than the requested limit. Filters are applied after the hybrid pipeline returns results, which may reduce the count below the requested limit. Consider passing filters to the pipeline or fetching extra candidates to compensate. This is a minor UX concern rather than a bug, as the user may not get the expected number of results when filters are active.
packages/core/src/search/rrf.ts (1)
129-136: getRankFromScore re-sorts on every call. If this function is called frequently in a loop, consider pre-sorting once and passing the sorted array, or caching the sorted result.
packages/core/src/search/hybrid.ts (3)
130-136: Silent catch block swallows embedding errors. If embedding fails, the code silently continues without the vector search component. Consider logging the error or surfacing it in stats for debugging purposes.

💡 Suggested improvement

```diff
 try {
   if (this.embeddingService.isInitialized()) {
     queryVector = await this.embeddingService.embed(query);
   }
-} catch {
+} catch (error) {
+  // Log for debugging - vector search will be skipped
+  console.debug?.('Embedding failed, falling back to keyword-only search:', error);
 }
```

359-384: Helper function creates and disposes a pipeline per call. The hybridSearch convenience function creates a new pipeline for each call, which is inefficient for repeated searches. Consider documenting this or providing guidance to use HybridSearchPipeline directly for batch operations.

295-319: Reranker uses a simple heuristic rather than an LLM. The rerank method uses substring matching with fixed weights (+0.3 base score). This is a reasonable placeholder but differs from the PR description mentioning "LLM re-ranking". Consider adding a comment or updating documentation to clarify this is a heuristic-based approach.

packages/core/src/search/types.ts (1)
13-19: The tokenCount field in SkillChunkSchema is not populated by chunkText. The schema includes tokenCount but the chunking logic in embeddings.ts doesn't calculate or include this field in the returned chunks. Consider either removing it from the schema or adding token counting to chunkText.

packages/core/src/search/local-models.ts (3)

109-139: Consider cleaning up the reader on error. When an error occurs during the download loop, the reader is not released before cleanup. While this likely gets garbage collected, explicitly canceling the reader is a best practice for resource cleanup.

♻️ Proposed fix

```diff
 } catch (error) {
+  reader.cancel().catch(() => {}); // Best-effort cleanup
   if (fs.existsSync(tempPath)) {
     await fs.promises.unlink(tempPath);
   }
   throw error;
 }
```

141-159: Potential stale cache if the model file is deleted externally. If this.embedModelPath is cached but the file is later deleted externally, ensureEmbedModel returns the stale path because line 142 checks fs.existsSync(this.embedModelPath) first. However, line 146 rechecks isEmbedModelAvailable(), which would catch this. The logic is correct but slightly redundant. Consider simplifying to always use isEmbedModelAvailable().

♻️ Simplified approach

```diff
 async ensureEmbedModel(onProgress?: IndexBuildCallback): Promise<string> {
-  if (this.embedModelPath && fs.existsSync(this.embedModelPath)) {
-    return this.embedModelPath;
-  }
-
-  if (!this.isEmbedModelAvailable()) {
+  if (!this.isEmbedModelAvailable()) {
     if (!this.config.autoDownload) {
       throw new Error(
         `Embedding model not found at ${this.getEmbedModelPath()}. ` +
           'Set autoDownload: true or manually download the model.'
       );
     }
     this.embedModelPath = await this.downloadModel('embed', onProgress);
   } else {
     this.embedModelPath = this.getEmbedModelPath();
   }
   return this.embedModelPath;
 }
```

230-236: Edge case: formatBytes misbehaves for very large values. If bytes exceeds the range where Math.log(bytes) / Math.log(k) yields an index within the sizes array (e.g., terabytes), sizes[i] will be undefined. While unlikely for model files, consider adding bounds checking or extending the sizes array.

♻️ Proposed fix with bounds check

```diff
 export function formatBytes(bytes: number): string {
   if (bytes === 0) return '0 Bytes';
   const k = 1024;
-  const sizes = ['Bytes', 'KB', 'MB', 'GB'];
-  const i = Math.floor(Math.log(bytes) / Math.log(k));
+  const sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];
+  const i = Math.min(Math.floor(Math.log(bytes) / Math.log(k)), sizes.length - 1);
   return `${parseFloat((bytes / Math.pow(k, i)).toFixed(2))} ${sizes[i]}`;
 }
```

packages/core/src/search/expansion.ts (3)
1-11: Module-level cache persists across QueryExpander instances. expansionCache is a module-level Map, meaning all QueryExpander instances share the same cache. This could lead to unexpected behavior if different instances use different TTLs or if one instance's clearCache() affects others. Consider making the cache instance-specific.

♻️ Make cache instance-specific

```diff
-const expansionCache = new Map<string, CacheEntry>();
-
 export class QueryExpander {
   private modelManager: LocalModelManager;
   private model: unknown = null;
   private context: unknown = null;
   private initialized = false;
   private cacheTtlMs: number;
+  private cache = new Map<string, CacheEntry>();
```

Then update all references from expansionCache to this.cache.

193-206: Cache eviction removes the oldest inserted entry, not the oldest accessed (FIFO, not LRU). The current strategy removes the first key inserted when the cache is full. If a frequently accessed entry was inserted first, it will be evicted even if recently used. Consider tracking access time or using an LRU structure if access patterns matter.
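An LRU alternative can be sketched with a plain Map by exploiting its insertion-order iteration: re-inserting on access moves a key to the back, so the first key is always the least recently used. Names are illustrative, not the expander's actual fields.

```typescript
class LruCache<V> {
  private map = new Map<string, V>();
  constructor(private maxSize: number) {}

  get(key: string): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      // Refresh recency: delete + re-insert moves the key to the back.
      this.map.delete(key);
      this.map.set(key, value);
    }
    return value;
  }

  set(key: string, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // First key in iteration order is the least recently used.
      const oldest = this.map.keys().next().value as string;
      this.map.delete(oldest);
    }
  }
}
```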
236-239: expandQuerySimple accesses a private method via bracket notation. Using expander['expandWithoutLLM'](query) bypasses TypeScript's access control. This is a code smell that could break if the private method is renamed. Consider making expandWithoutLLM a standalone exported function or exposing it through a public method.

♻️ Extract as a standalone function

```diff
+// Move expandWithoutLLM logic to a standalone function
+function expandWithoutLLMImpl(query: string): ExpandedQuery {
+  // ... existing expandWithoutLLM logic ...
+}
+
 export class QueryExpander {
   // ...
   private expandWithoutLLM(query: string): ExpandedQuery {
-    // ... existing implementation ...
+    return expandWithoutLLMImpl(query);
   }
 }

 export function expandQuerySimple(query: string): ExpandedQuery {
-  const expander = new QueryExpander();
-  return expander['expandWithoutLLM'](query);
+  return expandWithoutLLMImpl(query);
 }
```

packages/core/src/search/vector-store.ts (2)
277-304: storeBatch performs sequential stores without a transaction. Each store call is executed sequentially and independently. If an error occurs mid-batch, partial data remains committed. Consider wrapping in a transaction for atomicity, especially for batch operations.

♻️ Wrap batch operations in a transaction

```diff
 async storeBatch(
   embeddings: SkillEmbedding[],
   onProgress?: IndexBuildCallback
 ): Promise<void> {
   await this.initialize();

+  if (this.db) {
+    const db = this.db as { exec(sql: string): void };
+    db.exec('BEGIN TRANSACTION');
+  }
+
+  try {
     const total = embeddings.length;
     for (let i = 0; i < embeddings.length; i++) {
       // ... existing logic ...
     }
+    if (this.db) {
+      (this.db as { exec(sql: string): void }).exec('COMMIT');
+    }
+  } catch (error) {
+    if (this.db) {
+      (this.db as { exec(sql: string): void }).exec('ROLLBACK');
+    }
+    throw error;
+  }
 }
```

448-472: getStats() hardcodes the model name. Line 469 returns modelName: 'nomic-embed-text-v1.5' regardless of the actual model used. This should ideally come from the EmbeddingService or be configurable.
Reviewed dependency block (packages/core/package.json):

```json
"optionalDependencies": {
  "node-llama-cpp": "^3.0.0",
  "better-sqlite3": "^11.0.0",
  "sqlite-vec": "^0.1.0"
},
"devDependencies": {
  "@types/better-sqlite3": "^7.6.11",
  "@types/node": "^22.10.5",
```
🌐 Web query: better-sqlite3 Node.js version requirements (engines compatibility)

💡 Result:

- Current better-sqlite3 (v12.4.6) declares the npm engines constraint node: "20.x || 22.x || 23.x || 24.x || 25.x"; notably, Node 21.x is not included. [1]
- The project README states a minimum of Node.js v14.21.1 and notes prebuilt binaries for LTS versions, but actual install-time compatibility is governed by the published version's package.json and native ABI. [2]
- Older releases (e.g., v11.9.0) publish no engines field, so npm won't enforce a Node version for that line (though the native ABI may still constrain it). [3]

Citations:
- [1] better-sqlite3@12.4.6 package.json (engines): https://app.unpkg.com/better-sqlite3%4012.4.6/files/package.json
- [2] better-sqlite3 README / install notes: https://github.com/WiseLibs/better-sqlite3
- [3] better-sqlite3@11.9.0 package.json (no engines): https://app.unpkg.com/better-sqlite3%4011.9.0/files/package.json

🌐 Web query: sqlite-vec Node.js version requirements (engines)

💡 Result:

The sqlite-vec npm distribution does not declare a Node.js version requirement via package.json#engines (at least in the published 0.1.7-alpha.2 artifacts); the platform packages' package.json files contain only metadata like os/cpu and entrypoints. Practically, npm won't enforce a minimum Node version for sqlite-vec itself; the effective requirement comes from whichever SQLite driver loads the extension (e.g., better-sqlite3 / sqlite3). The upstream project documents installing via npm install sqlite-vec for Node.js.
Tighten the node-llama-cpp version constraint and ensure consistent fallback behavior.

The node-llama-cpp range ^3.0.0 can resolve to versions 3.7.0+, which require Node >=20.0.0, but the package declares engines Node >=18.0.0. This creates an incompatibility: installation will fail on Node 18.x.

Additionally, fallback behavior is inconsistent: better-sqlite3 and sqlite-vec gracefully degrade when missing, but node-llama-cpp throws an error in expansion.ts and embeddings.ts rather than following the same fallback pattern used in vector-store.ts.

Either:
- Narrow node-llama-cpp to ^3.0.0 <3.7.0 to maintain Node 18 compatibility, or
- Update the engines field to >=20.0.0 and document the breaking change.
Reviewed test (expansion.test.ts):

```typescript
it('should handle case-insensitive caching', async () => {
  const result1 = await expander.expand('Auth');
  const result2 = await expander.expand('AUTH');

  expect(result1.original).toBe('Auth');
  expect(result2.original).toBe('AUTH');
});
```
Case-insensitive caching test doesn't verify caching behavior correctly.
The test checks that result1.original differs from result2.original, but this actually demonstrates that the cache is returning different results for the same normalized key (both "Auth" and "AUTH" normalize to "auth"). Per the implementation in expansion.ts, getFromCache uses query.toLowerCase() as the key, so the second call should return the cached ExpandedQuery from the first call.
If the intent is to verify case-insensitive caching works, the assertion should verify both calls return the same cached object:
💡 Suggested fix
```diff
 it('should handle case-insensitive caching', async () => {
   const result1 = await expander.expand('Auth');
   const result2 = await expander.expand('AUTH');
-  expect(result1.original).toBe('Auth');
-  expect(result2.original).toBe('AUTH');
+  // Cache key is normalized to lowercase, so both should return the same cached result
+  expect(result1).toBe(result2);
+  expect(result1.original).toBe('Auth'); // Original case preserved from first call
 });
```
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
it('should handle case-insensitive caching', async () => {
  const result1 = await expander.expand('Auth');
  const result2 = await expander.expand('AUTH');
  expect(result1.original).toBe('Auth');
  expect(result2.original).toBe('AUTH');
});

it('should handle case-insensitive caching', async () => {
  const result1 = await expander.expand('Auth');
  const result2 = await expander.expand('AUTH');
  // Cache key is normalized to lowercase, so both should return the same cached result
  expect(result1).toBe(result2);
  expect(result1.original).toBe('Auth'); // Original case preserved from first call
});
🤖 Prompt for AI Agents
In `@packages/core/src/search/__tests__/expansion.test.ts` around lines 76 - 82,
The test currently asserts that result1.original and result2.original retain their different casings, but it should verify that both calls hit the same cache entry; update the test in
expansion.test.ts to call expander.expand('Auth') then expander.expand('AUTH')
and assert the second call returns the same cached ExpandedQuery (same object or
same .original value) as the first; this verifies getFromCache (which uses
query.toLowerCase()) correctly returns the cached entry from expansion.ts.
for (const term of terms) {
  const termSynonyms = synonyms[term];
  if (termSynonyms) {
    for (const syn of termSynonyms.slice(0, 2)) {
      const variation = query.replace(new RegExp(term, 'i'), syn);
      if (variation !== query && !variations.includes(variation)) {
        variations.push(variation);
      }
    }
  }
}
Potential ReDoS vulnerability with user-controlled regex pattern.
The term variable comes from user input (query.split()), which is used directly in new RegExp(term, 'i'). Malicious input with special regex characters could cause catastrophic backtracking.
🛡️ Proposed fix to escape regex special characters
+function escapeRegExp(str: string): string {
+ return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+}
+
private expandWithoutLLM(query: string): ExpandedQuery {
// ... existing code ...
for (const term of terms) {
const termSynonyms = synonyms[term];
if (termSynonyms) {
for (const syn of termSynonyms.slice(0, 2)) {
- const variation = query.replace(new RegExp(term, 'i'), syn);
+ const variation = query.replace(new RegExp(escapeRegExp(term), 'i'), syn);
if (variation !== query && !variations.includes(variation)) {
variations.push(variation);
}
}
}
}

📝 Committable suggestion
for (const term of terms) {
  const termSynonyms = synonyms[term];
  if (termSynonyms) {
    for (const syn of termSynonyms.slice(0, 2)) {
      const variation = query.replace(new RegExp(term, 'i'), syn);
      if (variation !== query && !variations.includes(variation)) {
        variations.push(variation);
      }
    }
  }
}

function escapeRegExp(str: string): string {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

for (const term of terms) {
  const termSynonyms = synonyms[term];
  if (termSynonyms) {
    for (const syn of termSynonyms.slice(0, 2)) {
      const variation = query.replace(new RegExp(escapeRegExp(term), 'i'), syn);
      if (variation !== query && !variations.includes(variation)) {
        variations.push(variation);
      }
    }
  }
}
🧰 Tools
🪛 ast-grep (0.40.5)
[warning] 162-162: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(term, 'i')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🤖 Prompt for AI Agents
In `@packages/core/src/search/expansion.ts` around lines 159 - 169, The loop that
builds variations uses new RegExp(term, 'i') with user-controlled term
(variables: term, terms, query, synonyms, variations), creating a ReDoS risk;
fix by escaping regex metacharacters in term before constructing the RegExp
(implement/inline an escapeRegExp function that replaces characters like . * + ?
^ $ { } ( ) | [ ] \ / with backslash-escaped versions) and then use new
RegExp(escapedTerm, 'i') when creating the variation; ensure the rest of the
logic (slice(0,2), variation !== query, !variations.includes) remains unchanged.
private async loadEmbeddingsFromDb(): Promise<void> {
  if (!this.db) return;

  const db = this.db as {
    prepare(sql: string): { all(): Array<{
      skill_name: string;
      text_content: string;
      vector?: Buffer;
      generated_at: string;
    }> };
  };

  try {
    if (this.usingSqliteVec) {
      const rows = db.prepare(`
        SELECT skill_name, text_content, generated_at
        FROM ${this.config.tableName}_meta
      `).all();

      for (const row of rows) {
        this.embeddings.set(row.skill_name, {
          skillName: row.skill_name,
          textContent: row.text_content,
          vector: new Float32Array(this.config.dimensions),
          generatedAt: row.generated_at,
        });
      }
    } else {
      const rows = db.prepare(`
        SELECT skill_name, text_content, vector, generated_at
        FROM ${this.config.tableName}
      `).all();

      for (const row of rows) {
        if (row.vector) {
          this.embeddings.set(row.skill_name, {
            skillName: row.skill_name,
            textContent: row.text_content,
            vector: new Float32Array(row.vector.buffer),
            generatedAt: row.generated_at,
          });
        }
      }
    }
  } catch {
  }
}
Silent catch swallows all errors during DB load.
Line 189's empty catch block silently ignores any errors when loading embeddings from the database. This could mask legitimate issues like corrupted data or permission errors, making debugging difficult.
🐛 Proposed fix: log or surface errors
} catch (error) {
+ // Log the error but don't fail initialization - allows fallback to empty state
+ console.warn('Failed to load embeddings from database:', error);
}

🤖 Prompt for AI Agents
In `@packages/core/src/search/vector-store.ts` around lines 145 - 191, The empty
catch in loadEmbeddingsFromDb silently swallows DB errors; update the catch to
surface or log the error (e.g., via this.logger.error or rethrow) so failures
when reading from this.db (used with usingSqliteVec and the prepared queries on
this.config.tableName/_meta) are visible; ensure the error message includes
context (function name, tableName, skill_name or row context) and preserve
existing behavior for successful rows while not masking exceptions thrown during
the db.prepare/.all or vector conversion.
for (const row of rows) {
  this.embeddings.set(row.skill_name, {
    skillName: row.skill_name,
    textContent: row.text_content,
    vector: new Float32Array(this.config.dimensions),
    generatedAt: row.generated_at,
  });
}
Loaded embeddings have zeroed vectors in sqlite-vec mode.
When loading from the sqlite-vec backend (Lines 164-171), vectors are initialized as new Float32Array(this.config.dimensions) (all zeros) rather than loaded from the _vec table. This means subsequent in-memory searches would use zero vectors, producing incorrect similarity results.
🐛 Proposed fix: load vectors from vec table
if (this.usingSqliteVec) {
+ const vecDb = this.db as {
+ prepare(sql: string): { get(...args: unknown[]): { embedding: Buffer } | undefined };
+ };
const rows = db.prepare(`
SELECT skill_name, text_content, generated_at
FROM ${this.config.tableName}_meta
`).all();
for (const row of rows) {
+ const vecRow = vecDb.prepare(`
+ SELECT embedding FROM ${this.config.tableName}_vec WHERE skill_name = ?
+ `).get(row.skill_name);
+ const vector = vecRow?.embedding
+ ? new Float32Array(vecRow.embedding.buffer)
+ : new Float32Array(this.config.dimensions);
this.embeddings.set(row.skill_name, {
skillName: row.skill_name,
textContent: row.text_content,
- vector: new Float32Array(this.config.dimensions),
+ vector,
generatedAt: row.generated_at,
});
}

📝 Committable suggestion
for (const row of rows) {
  this.embeddings.set(row.skill_name, {
    skillName: row.skill_name,
    textContent: row.text_content,
    vector: new Float32Array(this.config.dimensions),
    generatedAt: row.generated_at,
  });
}

const vecDb = this.db as {
  prepare(sql: string): { get(...args: unknown[]): { embedding: Buffer } | undefined };
};
for (const row of rows) {
  const vecRow = vecDb.prepare(`
    SELECT embedding FROM ${this.config.tableName}_vec WHERE skill_name = ?
  `).get(row.skill_name);
  const vector = vecRow?.embedding
    ? new Float32Array(vecRow.embedding.buffer)
    : new Float32Array(this.config.dimensions);
  this.embeddings.set(row.skill_name, {
    skillName: row.skill_name,
    textContent: row.text_content,
    vector,
    generatedAt: row.generated_at,
  });
}
🤖 Prompt for AI Agents
In `@packages/core/src/search/vector-store.ts` around lines 164 - 171, The loop
that populates this.embeddings currently sets each entry's vector to new
Float32Array(this.config.dimensions) (all zeros) instead of reading the actual
vector bytes from the sqlite-vec storage; update the loader in vector-store.ts
(the block iterating over rows and setting this.embeddings) to fetch the
corresponding vector data from the _vec table (or the query result that contains
the vector column), decode/convert it into a Float32Array of length
this.config.dimensions, and assign that to the vector field for each entry
(preserve skillName, textContent, generatedAt); ensure you reference the same
key (row.skill_name) when joining or looking up the _vec data so in-memory
searches use the real vectors rather than zeroed arrays.
  return rows.map((row) => ({
    skillName: row.skill_name,
    similarity: 1 - row.distance,
  }));
}
🧩 Analysis chain
🌐 Web query:
sqlite-vec vec0 distance metric cosine euclidean
💡 Result:
In sqlite-vec’s vec0 virtual table, the distance metric is chosen per vector column via the column option distance_metric=....
- Default: l2 (Euclidean / L2 distance) (alexgarcia.xyz)
- Cosine: set distance_metric=cosine (alexgarcia.xyz)
- Hamming: distance_metric=hamming (only for bit[...] vectors) (agent-skills.md)
Example (cosine instead of default L2):
create virtual table vec_documents using vec0(
document_id integer primary key,
embedding float[768] distance_metric=cosine
);
select document_id, distance
from vec_documents
where embedding match :query
  and k = 10;

If you’re computing distances outside vec0, sqlite-vec also exposes scalar functions such as vec_distance_l2(a,b) and vec_distance_cosine(a,b). (stephencollins.tech)
Citations:
- 1: https://alexgarcia.xyz/sqlite-vec/features/knn.html?utm_source=openai
- 2: https://alexgarcia.xyz/sqlite-vec/features/knn.html?utm_source=openai
- 3: https://agent-skills.md/skills/existential-birds/beagle/sqlite-vec?utm_source=openai
- 4: https://alexgarcia.xyz/sqlite-vec/features/knn.html?utm_source=openai
- 5: https://stephencollins.tech/posts/how-sqlite-vec-works-for-storing-and-querying-vector-embeddings?utm_source=openai
🏁 Script executed:
rg -i "vec0|distance_metric" --type ts --type js --type sql -A 3 -B 3

Repository: rohitg00/skillkit
Length of output: 1047
🏁 Script executed:
rg -A 10 "\.distance" packages/core/src/search/vector-store.ts

Repository: rohitg00/skillkit
Length of output: 320
🏁 Script executed:
rg -B 15 "similarity: 1 - row.distance" packages/core/src/search/vector-store.ts

Repository: rohitg00/skillkit
Length of output: 515
Use cosine distance instead of default L2, or normalize similarity calculation.
Line 339 computes similarity: 1 - row.distance, which assumes distance is normalized to [0, 1]. However, the vec0 table is created without specifying distance_metric, defaulting to L2 (Euclidean) distance—which is unbounded. This causes similarity to become negative for distant vectors, breaking the expected [0, 1] range. Configure distance_metric=cosine in the vec0 table creation, or adjust the similarity formula to handle unbounded distances.
🤖 Prompt for AI Agents
In `@packages/core/src/search/vector-store.ts` around lines 337 - 341, The
similarity calculation currently uses "similarity: 1 - row.distance" (in the
rows.map mapping) but the vec0 table was created without a distance_metric so it
defaults to L2 (unbounded), causing negative similarities; either change the
vec0 table creation to use distance_metric='cosine' so row.distance is a cosine
distance in [0,1], or keep the L2 metric and replace the mapping logic with a
proper normalization (e.g., convert cosine similarity from dot product or
clamp/scale distances into [0,1]) so the result of the rows.map (where
similarity is computed) always yields a 0..1 similarity; update the vec0 table
creation code or the rows.map similarity expression accordingly.
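The failure mode flagged above can be seen numerically. A minimal sketch, with both helpers as illustrative names rather than the PR's actual code: the current `1 - distance` formula goes negative once an L2 distance exceeds 1, while a bounded transform such as `1 / (1 + distance)` always stays in (0, 1]:

```typescript
// Current approach: assumes distance is normalized to [0, 1].
// Breaks (goes negative) for unbounded L2 distances.
function similarityFromL2(distance: number): number {
  return 1 - distance;
}

// One possible bounded alternative for raw L2 distances:
// maps distance 0 -> 1.0, and large distances asymptotically toward 0.
function boundedSimilarityFromL2(distance: number): number {
  return 1 / (1 + distance);
}
```

Switching the vec0 table to `distance_metric=cosine` (so distances land in a known range) is the cleaner fix; the bounded transform is the fallback if the L2 default is kept.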
- Add missing beforeEach import in embeddings.test.ts
- Update optional dependencies to current versions
- Fix weights/variations length mismatch in expansion.ts
- Avoid mutating input arrays in rrf.ts mergeRankings
- Delete orphaned chunk vectors in vector-store.ts delete() and clear()
- Add comment for empty catch block in vector-store.ts
- Make hybridSearch helper use buildIndex for true hybrid search
- Handle graceful degradation when embeddings unavailable
- Add node-llama-cpp mock in hybrid.test.ts
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/core/src/search/hybrid.ts`:
- Around line 367-384: The hybridSearch helper can leak resources if
pipeline.search throws; wrap the sequence that uses the pipeline in a
try/finally so pipeline.dispose() always runs: call createHybridSearchPipeline()
and await pipeline.buildIndex(...) before entering a try block that performs
await pipeline.search({ query, ...options }) and returns the response, and call
await pipeline.dispose() in the finally block to ensure disposal even on errors
(references: hybridSearch, createHybridSearchPipeline, pipeline.buildIndex,
pipeline.search, pipeline.dispose).
🧹 Nitpick comments (9)
packages/core/src/search/expansion.ts (3)
10-10: Module-level cache is shared across all QueryExpander instances.

The expansionCache is declared at module scope (Line 10), meaning all instances of QueryExpander share the same cache, even if they have different cacheTtlMs configurations. This could lead to unexpected behavior when multiple pipelines are used with different TTL settings.

♻️ Proposed fix: move cache to instance scope

-const expansionCache = new Map<string, CacheEntry>();
-
 export class QueryExpander {
   private modelManager: LocalModelManager;
   private model: unknown = null;
   private context: unknown = null;
   private initialized = false;
   private cacheTtlMs: number;
+  private cache: Map<string, CacheEntry> = new Map();

 // Update all cache references from expansionCache to this.cache
237-240: Accessing private method via bracket notation bypasses encapsulation.
expandQuerySimple accesses the private method expandWithoutLLM using bracket notation (expander['expandWithoutLLM']). This is a code smell that circumvents TypeScript's access modifiers. Consider either making the method public or extracting the synonym-based expansion logic into a standalone exported function.

♻️ Proposed fix: extract standalone function

+// Standalone synonym-based expansion (no LLM required)
+function expandWithSynonyms(query: string): ExpandedQuery {
+  const variations: string[] = [];
+  const queryLower = query.toLowerCase();
+  // ... synonym logic from expandWithoutLLM ...
+  return { original: query, variations: limitedVariations, weights };
+}
+
 export function expandQuerySimple(query: string): ExpandedQuery {
-  const expander = new QueryExpander();
-  return expander['expandWithoutLLM'](query);
+  return expandWithSynonyms(query);
 }
194-207: Cache eviction removes oldest inserted key, not oldest by timestamp.
Map.keys().next().value returns the first key by insertion order, but entries can have vastly different timestamps due to TTL-based expiration checks. This means a recently accessed (and still valid) entry could be evicted while stale entries remain.

♻️ Proposed fix: evict by oldest timestamp

 private addToCache(query: string, expanded: ExpandedQuery): void {
   const maxCacheSize = 100;
   if (expansionCache.size >= maxCacheSize) {
-    const oldestKey = expansionCache.keys().next().value;
-    if (oldestKey) {
-      expansionCache.delete(oldestKey);
+    let oldestKey: string | null = null;
+    let oldestTime = Infinity;
+    for (const [key, entry] of expansionCache) {
+      if (entry.timestamp < oldestTime) {
+        oldestTime = entry.timestamp;
+        oldestKey = key;
+      }
+    }
+    if (oldestKey !== null) {
+      expansionCache.delete(oldestKey);
     }
   }

packages/core/src/search/__tests__/hybrid.test.ts (2)
100-104: afterEach may call dispose() on an undefined pipeline if beforeEach fails.

If the dynamic import or pipeline creation throws in beforeEach, pipeline will be undefined when afterEach runs. The conditional check (if (pipeline)) handles this, but using optional chaining would be more defensive.

♻️ Proposed improvement

 afterEach(async () => {
-  if (pipeline) {
-    await pipeline.dispose();
-  }
+  await pipeline?.dispose();
 });
191-201: Test assertion uses an if guard, making it non-deterministic.

The test at Lines 191-201 wraps assertions in if (response.query.expanded), which means if expansion fails silently, the test passes without verifying the expected behavior. Consider using expect(...).toBeDefined() followed by unconditional assertions, or use a non-null assertion after the check.

♻️ Proposed fix: make assertion unconditional

 it('should include expanded terms in response', async () => {
   const response = await pipeline.search({
     query: 'auth',
     enableExpansion: true,
   });
-  if (response.query.expanded) {
-    expect(response.query.expanded.original).toBe('auth');
-    expect(response.query.expanded.weights[0]).toBe(2.0);
-  }
+  expect(response.query.expanded).toBeDefined();
+  expect(response.query.expanded!.original).toBe('auth');
+  expect(response.query.expanded!.weights[0]).toBe(2.0);
 });

packages/core/src/search/hybrid.ts (2)
142-143: Empty catch block silently ignores embedding errors.

Lines 142-143 catch and ignore all errors during query embedding. While graceful degradation is good, consider logging a debug message to aid troubleshooting when embeddings unexpectedly fail.

♻️ Proposed fix: add debug logging

 try {
   if (this.embeddingService.isInitialized()) {
     queryVector = await this.embeddingService.embed(query);
   }
 } catch {
+  // Embedding failed - will proceed with keyword-only search
 }
303-327: Reranker always adds a 0.3 base score, inflating all scores.

The rerank() method adds an unconditional + 0.3 to baseScore (Line 322), meaning even skills with zero matches get a score of 0.3. Combined with Math.min(1, ...), a skill matching all three criteria (description + tags + name) gets 0.5 + 0.3 + 0.2 + 0.3 = 1.3, clamped to 1.0, while a partial match might get 0.5 + 0.3 = 0.8. This reduces score differentiation.

♻️ Proposed improvement: remove unconditional boost or make it configurable

 const baseScore =
   (input.skillDescription.toLowerCase().includes(input.query.toLowerCase()) ? 0.5 : 0) +
   (input.skillTags.some((t) => t.toLowerCase().includes(input.query.toLowerCase())) ? 0.3 : 0) +
   (input.skillName.toLowerCase().includes(input.query.toLowerCase()) ? 0.2 : 0);

 outputs.push({
   skillName: input.skillName,
-  score: Math.min(1, baseScore + 0.3),
+  score: Math.min(1, baseScore),
 });

packages/core/src/search/vector-store.ts (2)
456-480: getStats() returns the current timestamp as lastIndexedAt, not the actual last index time.

Line 476 always returns new Date().toISOString() as lastIndexedAt, regardless of when indexing actually occurred. This could be misleading for monitoring/debugging purposes.

♻️ Proposed fix: track actual last indexed time

 export class VectorStore {
   // ... existing fields ...
+  private lastIndexedAt: string | null = null;

   async store(embedding: SkillEmbedding): Promise<void> {
     // ... existing code ...
+    this.lastIndexedAt = new Date().toISOString();
   }

   getStats(): EmbeddingServiceStats {
     // ...
     return {
       totalSkillsIndexed: this.embeddings.size,
       totalChunks,
       indexSizeBytes,
-      lastIndexedAt: new Date().toISOString(),
+      lastIndexedAt: this.lastIndexedAt ?? new Date().toISOString(),
       modelName: 'nomic-embed-text-v1.5',
       embeddingDimensions: this.config.dimensions,
     };
   }
477-477: Hardcoded model name in getStats().

modelName is hardcoded to 'nomic-embed-text-v1.5' but the actual model used depends on EmbeddingService configuration. Consider retrieving this from the embedding service or making it configurable.
vectorSimilarity: vectorInput?.results.find((r) => r.skillName === skillName)?.score,
keywordScore: keywordInput?.results.find((r) => r.skillName === skillName)?.score,
rrfScore: rrfRankings.find((r) => r.skillName === skillName)?.rrfScore,
matchedTerms: [],
🟡 matchedTerms always empty in HybridSearchResult
The matchedTerms field is always set to an empty array [] in hybrid search results, meaning users never see which search terms matched their query.
Click to expand
How the bug occurs
In hybrid.ts at line 285, the HybridSearchResult is constructed with:
matchedTerms: [],

The keyword search via RecommendationEngine.search() at lines 171-175 does compute matched terms, but this information is never extracted and passed through to the final results.
Impact
The CLI code at recommend.ts:553-556 checks result.matchedTerms.length > 0 to display matched terms, but this condition will always be false for hybrid search results:
if (result.matchedTerms.length > 0) {
console.log(` ${colors.dim('Matched:')} ${result.matchedTerms.join(', ')}`);
}

Users will never see what terms matched their query, reducing the usefulness of verbose output.
Recommendation: Extract matchedTerms from the keyword search results. For each skill in the final results, look up the corresponding keyword search result and use its matchedTerms array.
Was this helpful? React with 👍 or 👎 to provide feedback.
hybridResults.push({
  skill,
  relevance: Math.round(hybridScore * 100),
🟡 Relevance scores are extremely low (1-3%) without reranking due to raw RRF scores
When reranking is disabled, the relevance percentage shown to users is computed from raw RRF scores, resulting in misleadingly low values like 1-3% even for the best matches.
Click to expand
How the bug occurs
At line 207-210, raw RRF scores are stored in finalScores:
let finalScores = new Map<string, number>();
for (const ranking of rrfRankings) {
finalScores.set(ranking.skillName, ranking.rrfScore);
}

At line 280, this score is multiplied by 100 to get a percentage:
relevance: Math.round(hybridScore * 100),

The RRF formula is sum(1/(k+rank)) with k=60 by default. For a top-ranked result (rank=1) appearing in both vector and keyword rankers:
- RRF score = 2 × (1/61) ≈ 0.0328
- relevance = Math.round(0.0328 × 100) = 3%
Actual vs Expected
- Actual: Top results show 1-3% relevance
- Expected: Top results should show higher relevance scores (70-100%) to indicate good matches
Impact
Users will see confusingly low relevance percentages that don't reflect actual match quality. A perfect match might show as "3%" which suggests poor relevance.
Note: When reranking IS enabled, the scores are normalized (lines 235-248), partially mitigating this issue.
Recommendation: Normalize the RRF scores before converting to percentages. Either normalize to 0-1 range by dividing by the maximum RRF score, or use a different scoring approach that produces intuitive percentages.
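The arithmetic above, and one possible max-based normalization, can be sketched as follows. The helper names (`rrfScore`, `normalizeScores`) are illustrative, not the PR's actual code; `K = 60` matches the default mentioned above:

```typescript
const K = 60;

// RRF score for a document appearing at the given (1-based) ranks
// across multiple rankers: sum(1 / (k + rank)).
function rrfScore(ranks: number[], k: number = K): number {
  return ranks.reduce((sum, rank) => sum + 1 / (k + rank), 0);
}

// Normalize raw RRF scores so the best result maps to 1.0,
// giving intuitive percentages after multiplying by 100.
function normalizeScores(scores: Map<string, number>): Map<string, number> {
  const max = Math.max(...Array.from(scores.values()));
  const out = new Map<string, number>();
  for (const [name, s] of scores) {
    out.set(name, max > 0 ? s / max : 0);
  }
  return out;
}
```

With this, a document ranked first by both rankers scores 2/61 ≈ 0.0328 raw (3% when multiplied by 100), but 100% after normalization, while lower-ranked documents scale proportionally below it.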
Was this helpful? React with 👍 or 👎 to provide feedback.
- Fix gemma GGUF URL to point to valid community repo (bartowski)
- Add SQL injection protection for table names in vector-store.ts
- Fix resource leak with try/finally in hybridSearch helper
- Add backpressure handling for large model downloads
- Clamp startLine to non-negative in chunk calculation
- Add runtime validation warning for --expand/--rerank without --hybrid
- Update node-llama-cpp API usage (createEmbeddingContext, createContext)
- Update pnpm-lock.yaml for new optional dependency versions
CI doesn't install optional dependencies, so @ts-ignore is needed instead of @ts-expect-error to suppress module not found errors.
Summary
New Search Module
- embeddings.ts
- vector-store.ts
- rrf.ts
- expansion.ts
- hybrid.ts
- local-models.ts
Test plan
- skillkit recommend "auth" --hybrid
- skillkit recommend --build-index
New Features
Tests