
Conversation


@rohitg00 rohitg00 commented Feb 2, 2026

Summary

  • Implement hybrid search combining vector embeddings with keyword search
  • Add RRF (Reciprocal Rank Fusion) for combining multiple rankers
  • Add query expansion with LLM and synonym fallback
  • Add position-aware score blending for better ranking accuracy (different weights for the top 3, top 10, and remaining results)
  • Support local GGUF models (nomic-embed-text, gemma-2b) with auto-download (~2 GB)
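For readers new to RRF, the fusion step can be sketched in a few lines (an illustrative stand-alone snippet, not the actual rrf.ts code; `rrfFuse` and the example skill names are hypothetical):

```typescript
// Reciprocal Rank Fusion: each ranker contributes 1 / (k + rank) per item,
// so items ranked highly by several rankers accumulate the largest scores.
// k (commonly 60) damps the influence of any single top rank.
function rrfFuse(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // 1-based rank within this ranker
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Fuse a vector-search ranking with a keyword ranking.
const fused = rrfFuse([
  ["auth-skill", "oauth-skill", "jwt-skill"], // vector results
  ["jwt-skill", "auth-skill", "form-skill"],  // keyword results
]);
const sorted = [...fused.entries()].sort((a, b) => b[1] - a[1]);
```

Here "auth-skill" wins because both rankers place it near the top, even though neither ranks it first everywhere.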

New Search Module

| File | Description |
| --- | --- |
| `embeddings.ts` | Vector embedding generation using node-llama-cpp |
| `vector-store.ts` | SQLite storage with sqlite-vec (in-memory fallback) |
| `rrf.ts` | RRF fusion algorithm implementation |
| `expansion.ts` | Query expansion with LLM and caching |
| `hybrid.ts` | Main hybrid search pipeline |
| `local-models.ts` | GGUF model management and auto-download |
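As an illustration of the similarity primitive a module like embeddings.ts relies on, here is a minimal cosine-similarity sketch (stand-alone and hypothetical; the real service wraps node-llama-cpp for generating the vectors themselves):

```typescript
// Cosine similarity between two embedding vectors: dot product divided by
// the product of magnitudes, yielding a value in [-1, 1] (1 = same direction).
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // guard against zero vectors
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const q = new Float32Array([1, 0, 1]);
const docA = new Float32Array([1, 0, 1]); // identical direction
const docB = new Float32Array([0, 1, 0]); // orthogonal
```

In the hybrid pipeline, scores like these form the vector-side ranking that gets fused with keyword results.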

CLI Integration

```shell
# Enable hybrid search
skillkit recommend "authentication" --hybrid

# With query expansion
skillkit recommend "auth" --hybrid --expand

# With LLM re-ranking
skillkit recommend "form validation" --hybrid --rerank

# Build embedding index
skillkit recommend --build-index
```

Test plan

  • 56 new unit tests for the search module (all passing)
  • Build succeeds for @skillkit/core and @skillkit/cli
  • All 796 existing tests pass
  • Manual test: `skillkit recommend "auth" --hybrid`
  • Manual test: `skillkit recommend --build-index`


Summary by CodeRabbit

  • New Features

    • Hybrid search (vector + keyword) with optional query expansion and LLM reranking.
    • CLI index build command to create/rebuild hybrid search indexes and show progress.
    • Public search APIs and local model management for embeddings and query expansion.
  • Tests

    • Extensive unit and integration tests covering embeddings, expansion, hybrid search pipeline, ranking (RRF), vector store, and utilities.


vercel bot commented Feb 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| skillkit | Ready | Preview, Comment | Feb 2, 2026 11:24pm |
| skillkit-docs | Ready | Preview, Comment | Feb 2, 2026 11:24pm |


coderabbitai bot commented Feb 2, 2026

Warning

Rate limit exceeded

@rohitg00 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 42 minutes and 5 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.


📝 Walkthrough

Implements a hybrid search subsystem: local model management, embedding service, SQLite-backed vector store (optional sqlite-vec), query expansion (LLM-backed with fallback), RRF fusion and reranking, a HybridSearchPipeline plus helpers, integration into RecommendationEngine and CLI, and comprehensive tests for the new search components.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **CLI Integration**<br>`packages/cli/src/commands/recommend.ts` | Added `--hybrid`, `--expand`, `--rerank`, and `--build-index` flags; added `handleHybridSearch()` and `buildHybridIndex()` methods and control flow to route hybrid queries/index builds. |
| **Package exports & manifest**<br>`packages/core/package.json`, `packages/core/src/index.ts` | Added `./search` export; added optionalDependencies (node-llama-cpp, better-sqlite3, sqlite-vec) and dev types for better-sqlite3. Re-exported search from the core index. |
| **Search public barrel**<br>`packages/core/src/search/index.ts` | Created barrel re-exporting types, local-models, embeddings, vector-store, rrf, expansion, and hybrid modules. |
| **Types & config**<br>`packages/core/src/search/types.ts` | New comprehensive type definitions and constants: `LocalModelConfig`, `SkillChunk`, `SkillEmbedding`, `ExpandedQuery`, `RRFRanking`, `HybridSearchOptions`/`Result`/`Response`, `EmbeddingServiceStats`, `MODEL_REGISTRY`, `DEFAULT_CHUNKING_CONFIG`, `DEFAULT_HYBRID_CONFIG`, etc. |
| **Local model manager**<br>`packages/core/src/search/local-models.ts` | Introduced `LocalModelManager` with model path resolution, availability checks, a download workflow, `ensureEmbed`/`ensureLlm`, `getPublicModelInfo`, `clearModels`, and related utilities. |
| **Embedding service**<br>`packages/core/src/search/embeddings.ts` | Added `EmbeddingService` managing local embedding model load, `embed`/`embedBatch`, chunking, `embedSkill(s)`, `cosineSimilarity`, lifecycle, and error handling for the missing optional dependency. |
| **Vector store**<br>`packages/core/src/search/vector-store.ts` | New `VectorStore` with SQLite-backed storage, optional sqlite-vec acceleration, fallback in-memory cosine search, `store`/`storeBatch`, `search`/`searchChunks`, management (`get`, `delete`, `clear`), stats, and lifecycle methods. |
| **Query expansion**<br>`packages/core/src/search/expansion.ts` | Added `QueryExpander` with LLM-backed generation, deterministic fallback, a TTL cache, `initialize`/`expand`/`clearCache`/`dispose` APIs, and `createQueryExpander`/`expandQuerySimple` helpers. |
| **RRF ranking**<br>`packages/core/src/search/rrf.ts` | Implemented RRF utilities: `computeRRFScore`, `fuseWithRRF`, `normalizeScores`, `weightedCombine`, position-aware blending, `getRankFromScore`, `mergeRankings`. |
| **Hybrid pipeline**<br>`packages/core/src/search/hybrid.ts` | New `HybridSearchPipeline` coordinating the embedding service, vector store, query expander, and recommendation engine; supports `initialize`, `buildIndex`, `search` (vector + keyword fusion), optional rerank, stats, and a `hybridSearch()` helper. |
| **Recommendation engine integration**<br>`packages/core/src/recommend/engine.ts`, `packages/core/src/recommend/types.ts` | Extended `RecommendationEngine` with a `hybridPipeline` field, `initHybridSearch()`, `isHybridSearchAvailable()`, `hybridSearch()`, `buildHybridIndex()`; added `RecommendHybridSearchOptions`/`Result` types. |
| **Tests**<br>`packages/core/src/search/__tests__/*.test.ts` | Added extensive tests for embeddings, expansion, hybrid pipeline, and RRF (unit and integration-like tests covering initialization, behavior, edge cases, caching, and scoring). |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as CLI
    participant Engine as RecommendationEngine
    participant Pipeline as HybridSearchPipeline
    participant Embed as EmbeddingService
    participant VecStore as VectorStore
    participant Expand as QueryExpander
    participant RRF as RRF

    CLI->>Engine: recommend --hybrid query
    Engine->>Pipeline: ensure initialized
    Pipeline->>Embed: initialize()
    Pipeline->>VecStore: initialize()
    Pipeline->>Expand: initialize() (if expand enabled)
    Pipeline->>Expand: expand(query) (optional)
    Expand-->>Pipeline: ExpandedQuery
    Pipeline->>Embed: embed(query)
    Embed-->>Pipeline: queryVector
    Pipeline->>VecStore: search(queryVector)
    VecStore-->>Pipeline: vectorResults
    Pipeline->>Pipeline: keywordSearch(query) → keywordResults
    Pipeline->>RRF: fuseWithRRF(vectorResults, keywordResults)
    RRF-->>Pipeline: fusedResults
    Pipeline->>Pipeline: rerank(topResults) (optional)
    Pipeline-->>Engine: HybridSearchResponse
    Engine-->>CLI: print results
```
```mermaid
sequenceDiagram
    participant CLI as CLI
    participant Engine as RecommendationEngine
    participant Pipeline as HybridSearchPipeline
    participant Embed as EmbeddingService
    participant VecStore as VectorStore

    CLI->>Engine: recommend --build-index
    Engine->>Pipeline: ensure initialized
    Pipeline->>Embed: initialize(onProgress)
    Embed-->>Pipeline: ready
    Pipeline->>Engine: load current skill index
    Pipeline->>Embed: embedSkills(allSkills, onProgress)
    Embed-->>Pipeline: SkillEmbeddings
    Pipeline->>VecStore: initialize()
    VecStore-->>Pipeline: ready
    Pipeline->>VecStore: storeBatch(embeddings, onProgress)
    VecStore-->>Pipeline: persisted
    Pipeline-->>Engine: build complete
    Engine-->>CLI: report output location
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 In SQLite meadows vectors nest,
I nibble queries, expand and test,
Fusion stirs where keywords meet,
Rerank hops, results taste sweet,
Hybrid carrots—search complete! 🌿

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(core): add QMD-inspired hybrid search pipeline' clearly summarizes the main change, describing the addition of a QMD-inspired hybrid search feature to the core package. |



@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 2 potential issues.

View issues and 5 additional flags in Devin Review.

Open in Devin Review

Comment on lines +223 to +242
```typescript
if (embedding.chunks) {
  db.prepare(`
    DELETE FROM ${this.config.tableName}_chunks WHERE skill_name = ?
  `).run(embedding.skillName);

  for (const chunk of embedding.chunks) {
    const result = db.prepare(`
      INSERT INTO ${this.config.tableName}_chunks
        (skill_name, content, start_line, end_line)
      VALUES (?, ?, ?, ?)
    `).run(embedding.skillName, chunk.content, chunk.startLine, chunk.endLine);

    const rowId = result.lastInsertRowid ?? 0;
    db.prepare(`
      INSERT INTO ${this.config.tableName}_chunks_vec
        (chunk_id, embedding)
      VALUES (?, ?)
    `).run(rowId, Buffer.from(chunk.vector.buffer));
  }
}
```


🔴 Orphaned chunk vector records when updating/deleting skills in sqlite-vec mode

When using sqlite-vec mode, chunk vector embeddings in the _chunks_vec table are never cleaned up during store, delete, or clear operations, leading to orphaned records and potential database bloat.


Issue Details

The VectorStore class creates a _chunks_vec virtual table (line 109) to store chunk embeddings when using sqlite-vec. However:

  1. In store() (lines 223-241), when re-storing a skill with chunks, only the _chunks table is deleted before re-inserting:

    db.prepare(`
      DELETE FROM ${this.config.tableName}_chunks WHERE skill_name = ?
    `).run(embedding.skillName);

    The _chunks_vec table entries are not deleted, causing orphaned vector records.

  2. In delete() (lines 420-427), the _chunks_vec table is never cleaned up.

  3. In clear() (lines 438-445), the _chunks_vec table is never cleaned up.

Impact

  • Database grows unbounded with orphaned chunk vector records
  • Potential incorrect search results if orphaned vectors are matched
  • Storage waste over time as skills are updated/deleted
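A fix along these lines would delete the companion `_chunks_vec` rows wherever `_chunks` rows are removed, and it must do so before the chunk rows disappear, since the vector rows are keyed by chunk rowid. A sketch of just the statement ordering (the helper name is hypothetical; the table naming follows the snippet under review):

```typescript
// Build the DELETE statements needed to fully remove a skill's chunk data,
// including the sqlite-vec companion table that the current code misses.
// Both statements take the skill name as their single bound parameter.
function chunkCleanupStatements(tableName: string): string[] {
  return [
    // Delete vector rows first, resolving chunk ids via a subquery,
    // because the ids are gone once the _chunks rows are deleted.
    `DELETE FROM ${tableName}_chunks_vec WHERE chunk_id IN
       (SELECT rowid FROM ${tableName}_chunks WHERE skill_name = ?)`,
    `DELETE FROM ${tableName}_chunks WHERE skill_name = ?`,
  ];
}

const stmts = chunkCleanupStatements("skill_embeddings");
```

The same pair (run with `db.prepare(...).run(skillName)` in order) would apply in `store()`, `delete()`, and an unfiltered variant in `clear()`.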


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 15

🤖 Fix all issues with AI agents
In `@packages/core/package.json`:
- Around line 60-67: Tighten the node-llama-cpp dependency and align fallback
behavior: change the node-llama-cpp semver in package.json to restrict upgrades
(e.g., ^3.0.0 <3.7.0) unless you intend to raise the engines field to ">=20.0.0"
and document that breaking change; then update expansion.ts and embeddings.ts to
mirror the safe import/fallback pattern used in vector-store.ts (i.e., attempt
to require/import node-llama-cpp inside a try/catch or guarded lazy loader, set
a null/undefined client when unavailable, and throw only when actually calling
functionality that requires it) so missing node-llama-cpp degrades gracefully
like better-sqlite3 and sqlite-vec.

In `@packages/core/src/search/__tests__/embeddings.test.ts`:
- Around line 1-11: The test file uses beforeEach in the EmbeddingService tests
but does not import it from Vitest; update the import statement at the top of
packages/core/src/search/__tests__/embeddings.test.ts to include beforeEach
alongside describe, it, and expect so the test suite's beforeEach hook for
creating a new EmbeddingService instance works without a ReferenceError.

In `@packages/core/src/search/__tests__/expansion.test.ts`:
- Around line 76-82: The test currently asserts result1.original !==
result2.original but should assert they are the same to confirm case-insensitive
caching; update the test in expansion.test.ts to call expander.expand('Auth')
then expander.expand('AUTH') and assert the second call returns the same cached
ExpandedQuery (same object or same .original value) as the first—this verifies
getFromCache (which uses query.toLowerCase()) correctly returns the cached entry
from expansion.ts.

In `@packages/core/src/search/embeddings.ts`:
- Around line 227-229: The calculation of startLine can produce a negative value
when overlapLines.length > i; update the assignment in the overlap handling
(where currentChunk, currentChars and startLine are set) to clamp startLine to a
minimum of 0 (e.g., startLine = Math.max(0, i - overlapLines.length)) and ensure
any downstream logic that uses startLine (references to currentChunk,
currentChars, overlapLines, overlapCharsCount, and i) still behaves correctly
with the clamped value.

In `@packages/core/src/search/expansion.ts`:
- Around line 171-178: The weights array can be longer than the returned
variations because weights is built from the full variations list before
slicing; update the logic in the function that builds the ExpandedQuery so you
first compute a slicedVariations = variations.slice(0, 3) (or reuse a similar
variable) and then construct weights as [2.0, ...slicedVariations.map(() =>
1.0)], and return original: query, variations: slicedVariations, weights so the
weights length always matches the returned variations.
- Around line 159-169: The loop that builds variations uses new RegExp(term,
'i') with user-controlled term (variables: term, terms, query, synonyms,
variations), creating a ReDoS risk; fix by escaping regex metacharacters in term
before constructing the RegExp (implement/inline an escapeRegExp function that
replaces characters like . * + ? ^ $ { } ( ) | [ ] \ / with backslash-escaped
versions) and then use new RegExp(escapedTerm, 'i') when creating the variation;
ensure the rest of the logic (slice(0,2), variation !== query,
!variations.includes) remains unchanged.

In `@packages/core/src/search/hybrid.ts`:
- Around line 270-272: The relevance field in the object pushed to hybridResults
can exceed 100 because hybridScore may be >1.0; clamp the computed relevance to
the 0–100 range before pushing. Locate the hybridResults.push call that uses
hybridScore (the object with keys skill and relevance) and replace the raw
Math.round(hybridScore * 100) with a clamped value (e.g., compute const
relevance = Math.min(100, Math.max(0, Math.round(hybridScore * 100))) and use
that) so relevance never falls outside 0–100.

In `@packages/core/src/search/rrf.ts`:
- Around line 153-163: The code mutates input arrays by pushing into
existing.ranks (from mergedScores) which may reference original ranking.ranks;
update the merge logic to never mutate caller data: when creating a new
mergedScores entry use ranks: [...ranking.ranks] and when appending to an
existing entry replace its ranks with a new array (e.g., existing.ranks =
[...existing.ranks, ...ranking.ranks]) instead of
existing.ranks.push(...ranking.ranks); update the block handling
secondaryRankings / mergedScores to always copy ranking.ranks into new arrays.

In `@packages/core/src/search/types.ts`:
- Around line 129-145: The gemma-2b-it entry in MODEL_REGISTRY points to a gated
Hugging Face URL causing 401s; update the registry and download flow to handle
gated models by (1) replacing 'gemma-2b-it-Q4_K_M.gguf' URL with an accessible
community GGUF mirror or mark the entry as gated (e.g., add a gated: true flag)
in MODEL_REGISTRY, and (2) enhance the downloader in local-models.ts to support
authenticated downloads (read HF token from env/credentials and send
Authorization header), add simple retry logic for transient HTTP errors, and
implement a fallback sequence that tries the primary URL then the community
mirror when gated or unauthorized. Reference MODEL_REGISTRY
(llm['gemma-2b-it-Q4_K_M.gguf']) and the downloader in local-models.ts to locate
where to apply these changes.

In `@packages/core/src/search/vector-store.ts`:
- Around line 145-191: The empty catch in loadEmbeddingsFromDb silently swallows
DB errors; update the catch to surface or log the error (e.g., via
this.logger.error or rethrow) so failures when reading from this.db (used with
usingSqliteVec and the prepared queries on this.config.tableName/_meta) are
visible; ensure the error message includes context (function name, tableName,
skill_name or row context) and preserve existing behavior for successful rows
while not masking exceptions thrown during the db.prepare/.all or vector
conversion.
- Around line 430-446: The clear() method fails to remove entries from the
chunks vector table in sqlite-vec mode; update the logic in clear() (the method
on the vector store class using this.usingSqliteVec and this.config.tableName)
so that when this.usingSqliteVec is true you also execute a DELETE on
`${this.config.tableName}_chunks_vec` (in addition to `_meta`, `_vec`, and
`_chunks`), ensuring all sqlite-vec related tables are cleared; keep the
existing non-sqlite branch unchanged.
- Around line 410-428: The delete method (async delete(skillName: string)) fails
to remove chunk vector rows when this.usingSqliteVec is true, leaving orphaned
entries in the _chunks_vec table; update the sqlite-vec branch inside delete to
also run a DELETE against `${this.config.tableName}_chunks_vec` (similar to the
existing deletes for `_meta` and `_vec`), ensuring you call
db.prepare(...).run(skillName) for that table when this.usingSqliteVec is true
so all related tables (`_meta`, `_vec`, `_chunks`, and `_chunks_vec`) are
cleaned up.
- Around line 75-143: The SQL injection risk comes from interpolating
this.config.tableName into SQL in initializeSqliteVecTables and
initializeFallbackTables; before using it, validate or sanitize the table name
(e.g., enforce an allowlist or a strict identifier regex like
/^[A-Za-z_][A-Za-z0-9_]*$/) and throw an error if it doesn't match, so only safe
SQL identifiers are used; alternatively restrict tableName to configured
internal values or map user input to predetermined safe names rather than
directly interpolating user-provided strings.
- Around line 164-171: The loop that populates this.embeddings currently sets
each entry's vector to new Float32Array(this.config.dimensions) (all zeros)
instead of reading the actual vector bytes from the sqlite-vec storage; update
the loader in vector-store.ts (the block iterating over rows and setting
this.embeddings) to fetch the corresponding vector data from the _vec table (or
the query result that contains the vector column), decode/convert it into a
Float32Array of length this.config.dimensions, and assign that to the vector
field for each entry (preserve skillName, textContent, generatedAt); ensure you
reference the same key (row.skill_name) when joining or looking up the _vec data
so in-memory searches use the real vectors rather than zeroed arrays.
- Around line 337-341: The similarity calculation currently uses "similarity: 1
- row.distance" (in the rows.map mapping) but the vec0 table was created without
a distance_metric so it defaults to L2 (unbounded), causing negative
similarities; either change the vec0 table creation to use
distance_metric='cosine' so row.distance is a cosine distance in [0,1], or keep
the L2 metric and replace the mapping logic with a proper normalization (e.g.,
convert cosine similarity from dot product or clamp/scale distances into [0,1])
so the result of the rows.map (where similarity is computed) always yields a
0..1 similarity; update the vec0 table creation code or the rows.map similarity
expression accordingly.
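The regex-escaping item for expansion.ts above can be sketched as follows (illustrative; `escapeRegExp` is the helper the review asks to introduce, shown here stand-alone):

```typescript
// Escape regex metacharacters so a user-supplied query term can be used
// safely inside new RegExp(...) without syntax errors or ReDoS-prone patterns.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// A term containing metacharacters now matches itself literally.
const term = "c++ (auth)";
const re = new RegExp(escapeRegExp(term), "i");
```

Without escaping, `new RegExp("c++ (auth)")` would throw ("nothing to repeat"), which is exactly the failure mode the review flags.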
🧹 Nitpick comments (19)
packages/core/src/recommend/types.ts (1)

330-349: Consider aliasing to the search module’s hybrid types to avoid drift.

If these mirror HybridSearchOptions/Result in packages/core/src/search/types.ts, a type alias (or explicit doc note) will prevent divergence over time.

packages/core/src/search/__tests__/embeddings.test.ts (1)

73-216: Avoid re-implementing production logic inside tests.

buildSkillText and chunkText are recreated locally, so these tests won’t catch regressions in EmbeddingService. Consider asserting via the public API (e.g., methods that invoke these helpers) or exporting the helpers for test use.

packages/core/src/search/__tests__/expansion.test.ts (1)

85-92: Test for clearCache does not verify the cache was actually cleared.

The test only checks that expand still works after clearing. Consider verifying the cache was cleared by checking that a subsequent call produces a fresh result (e.g., by comparing object identity or confirming the expansion logic ran again).

💡 Suggested improvement
```diff
   describe('clearCache', () => {
     it('should clear the expansion cache', async () => {
-      await expander.expand('auth');
+      const result1 = await expander.expand('auth');
       expander.clearCache();
 
       const result = await expander.expand('auth');
       expect(result).toBeDefined();
+      // After clearing, we should get a new object, not the cached one
+      expect(result).not.toBe(result1);
     });
   });
```
packages/cli/src/commands/recommend.ts (1)

571-604: Hardcoded storage path may not match actual implementation.

Line 595 displays ~/.skillkit/search.db as the index location, but this should ideally be retrieved from the actual configuration or LocalModelManager to ensure consistency.

packages/core/src/search/embeddings.ts (1)

79-85: Sequential embedding in embedBatch may be slow for large batches.

The current implementation processes texts one at a time. If node-llama-cpp supports batch embedding, consider leveraging it for better performance. Otherwise, this is acceptable as a baseline implementation.

packages/core/src/recommend/engine.ts (1)

665-677: Post-filtering may return fewer results than requested limit.

Filters are applied after the hybrid pipeline returns results, which may reduce the count below the requested limit. Consider passing filters to the pipeline or fetching extra candidates to compensate.

This is a minor UX concern rather than a bug, as the user may not get the expected number of results when filters are active.

packages/core/src/search/rrf.ts (1)

129-136: getRankFromScore re-sorts on every call.

If this function is called frequently in a loop, consider pre-sorting once and passing the sorted array, or caching the sorted result.

packages/core/src/search/hybrid.ts (3)

130-136: Silent catch block swallows embedding errors.

If embedding fails, the code silently continues without the vector search component. Consider logging the error or surfacing it in stats for debugging purposes.

💡 Suggested improvement
```diff
     try {
       if (this.embeddingService.isInitialized()) {
         queryVector = await this.embeddingService.embed(query);
       }
-    } catch {
+    } catch (error) {
+      // Log for debugging - vector search will be skipped
+      console.debug?.('Embedding failed, falling back to keyword-only search:', error);
     }
```

359-384: Helper function creates and disposes pipeline per call.

The hybridSearch convenience function creates a new pipeline for each call, which is inefficient for repeated searches. Consider documenting this or providing guidance to use HybridSearchPipeline directly for batch operations.


295-319: Reranker uses a simple heuristic rather than an LLM.

The rerank method uses substring matching with fixed weights (+0.3 base score). This is a reasonable placeholder but differs from the PR description mentioning "LLM reranking". Consider adding a comment or updating documentation to clarify this is a heuristic-based approach.

packages/core/src/search/types.ts (1)

13-19: tokenCount field in SkillChunkSchema is not populated by chunkText.

The schema includes tokenCount but the chunking logic in embeddings.ts doesn't calculate or include this field in the returned chunks. Consider either removing it from the schema or adding token counting to chunkText.

packages/core/src/search/local-models.ts (3)

109-139: Consider cleaning up the reader on error.

When an error occurs during the download loop, the reader is not released before cleanup. While this likely gets garbage collected, explicitly canceling the reader is a best practice for resource cleanup.

♻️ Proposed fix
```diff
     } catch (error) {
+      reader.cancel().catch(() => {}); // Best-effort cleanup
       if (fs.existsSync(tempPath)) {
         await fs.promises.unlink(tempPath);
       }
       throw error;
     }
```

141-159: Potential stale cache if the model file is deleted externally.

If this.embedModelPath is cached but the file is later deleted externally, ensureEmbedModel returns the stale path because Line 142 checks fs.existsSync(this.embedModelPath) first. However, Line 146 rechecks isEmbedModelAvailable() which would catch this. The logic is correct but slightly redundant. Consider simplifying to always use isEmbedModelAvailable().

♻️ Simplified approach
```diff
   async ensureEmbedModel(onProgress?: IndexBuildCallback): Promise<string> {
-    if (this.embedModelPath && fs.existsSync(this.embedModelPath)) {
-      return this.embedModelPath;
-    }
-
-    if (!this.isEmbedModelAvailable()) {
+    if (!this.isEmbedModelAvailable()) {
       if (!this.config.autoDownload) {
         throw new Error(
           `Embedding model not found at ${this.getEmbedModelPath()}. ` +
           'Set autoDownload: true or manually download the model.'
         );
       }
       this.embedModelPath = await this.downloadModel('embed', onProgress);
     } else {
       this.embedModelPath = this.getEmbedModelPath();
     }
 
     return this.embedModelPath;
   }
```

230-236: Edge case: formatBytes returns Infinity GB for very large values.

If bytes exceeds the range where Math.log(bytes) / Math.log(k) yields an index beyond the sizes array (e.g., terabytes), sizes[i] will be undefined. While unlikely for model files, consider adding bounds checking or extending the sizes array.

♻️ Proposed fix with bounds check
```diff
 export function formatBytes(bytes: number): string {
   if (bytes === 0) return '0 Bytes';
   const k = 1024;
-  const sizes = ['Bytes', 'KB', 'MB', 'GB'];
-  const i = Math.floor(Math.log(bytes) / Math.log(k));
+  const sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];
+  const i = Math.min(Math.floor(Math.log(bytes) / Math.log(k)), sizes.length - 1);
   return `${parseFloat((bytes / Math.pow(k, i)).toFixed(2))} ${sizes[i]}`;
 }
```
packages/core/src/search/expansion.ts (3)

1-11: Module-level cache persists across QueryExpander instances.

expansionCache is a module-level Map, meaning all QueryExpander instances share the same cache. This could lead to unexpected behavior if different instances use different TTLs or if one instance's clearCache() affects others. Consider making the cache instance-specific.

♻️ Make cache instance-specific
```diff
-const expansionCache = new Map<string, CacheEntry>();
-
 export class QueryExpander {
   private modelManager: LocalModelManager;
   private model: unknown = null;
   private context: unknown = null;
   private initialized = false;
   private cacheTtlMs: number;
+  private cache = new Map<string, CacheEntry>();
 
   // Then update all references from expansionCache to this.cache
```

193-206: Cache eviction removes oldest inserted, not oldest accessed (not LRU).

The current strategy removes the first key inserted when the cache is full. This is FIFO, not LRU. If a frequently accessed entry was inserted first, it will be evicted even if recently used. Consider tracking access time or using an LRU structure if access patterns matter.
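If access order matters, a Map-based LRU is a small change, since JavaScript Maps preserve insertion order. A minimal sketch (illustrative; the class name and shape are hypothetical, not the expansion.ts API):

```typescript
// Minimal LRU cache: re-inserting an entry on every get() keeps the
// least-recently-used key at the front of the Map's insertion order.
class LruCache<V> {
  private map = new Map<string, V>();
  constructor(private maxSize: number) {}

  get(key: string): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      this.map.delete(key); // move to most-recently-used position
      this.map.set(key, value);
    }
    return value;
  }

  set(key: string, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size >= this.maxSize) {
      // Evict the least-recently-used entry (first key in insertion order).
      const oldest = this.map.keys().next().value as string;
      this.map.delete(oldest);
    }
    this.map.set(key, value);
  }
}

const cache = new LruCache<number>(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");    // touch "a" so "b" becomes least recently used
cache.set("c", 3); // evicts "b", not "a"
```

Under FIFO eviction, "a" (the first insert) would have been evicted instead, which is the behavior the nitpick describes.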


236-239: expandQuerySimple accesses a private method via bracket notation.

Using expander['expandWithoutLLM'](query) bypasses TypeScript's access control. This is a code smell that could break if the private method is renamed. Consider making expandWithoutLLM a standalone exported function or exposing it through a public method.

♻️ Extract as standalone function
```diff
+// Move expandWithoutLLM logic to a standalone function
+function expandWithoutLLMImpl(query: string): ExpandedQuery {
+  // ... existing expandWithoutLLM logic ...
+}
+
 export class QueryExpander {
   // ...
   private expandWithoutLLM(query: string): ExpandedQuery {
-    // ... existing implementation ...
+    return expandWithoutLLMImpl(query);
   }
 }
 
 export function expandQuerySimple(query: string): ExpandedQuery {
-  const expander = new QueryExpander();
-  return expander['expandWithoutLLM'](query);
+  return expandWithoutLLMImpl(query);
 }
```
packages/core/src/search/vector-store.ts (2)

277-304: storeBatch performs sequential stores without transaction.

Each store call is executed sequentially and independently. If an error occurs mid-batch, partial data remains committed. Consider wrapping in a transaction for atomicity, especially for batch operations.

♻️ Wrap batch operations in a transaction
```diff
   async storeBatch(
     embeddings: SkillEmbedding[],
     onProgress?: IndexBuildCallback
   ): Promise<void> {
     await this.initialize();
 
+    if (this.db) {
+      const db = this.db as { exec(sql: string): void };
+      db.exec('BEGIN TRANSACTION');
+    }
+
+    try {
       const total = embeddings.length;
       for (let i = 0; i < embeddings.length; i++) {
         // ... existing logic ...
       }
+      if (this.db) {
+        (this.db as { exec(sql: string): void }).exec('COMMIT');
+      }
+    } catch (error) {
+      if (this.db) {
+        (this.db as { exec(sql: string): void }).exec('ROLLBACK');
+      }
+      throw error;
+    }
   }
```

448-472: getStats() hardcodes model name.

Line 469 returns modelName: 'nomic-embed-text-v1.5' regardless of the actual model used. This should ideally come from the EmbeddingService or be configurable.

Comment on lines 60 to 67
"optionalDependencies": {
"node-llama-cpp": "^3.0.0",
"better-sqlite3": "^11.0.0",
"sqlite-vec": "^0.1.0"
},
"devDependencies": {
"@types/better-sqlite3": "^7.6.11",
"@types/node": "^22.10.5",

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check metadata for optional native deps
npm view node-llama-cpp@^3.0.0 engines os cpu 2>&1 | head -50
echo "---"
npm view better-sqlite3@^11.0.0 engines os cpu 2>&1 | head -50
echo "---"
npm view sqlite-vec@^0.1.0 engines os cpu 2>&1 | head -50

Repository: rohitg00/skillkit

Length of output: 1553


🏁 Script executed:

# Search for imports and usage of these optional dependencies
rg "node-llama-cpp|better-sqlite3|sqlite-vec" --type js --type ts --type json -B 2 -A 2

Repository: rohitg00/skillkit

Length of output: 5201


🏁 Script executed:

# Check if there are conditional imports or try-catch blocks for these packages
rg "require.*node-llama-cpp|require.*better-sqlite3|require.*sqlite-vec|import.*node-llama-cpp|import.*better-sqlite3|import.*sqlite-vec" --type js --type ts -B 3 -A 3

Repository: rohitg00/skillkit

Length of output: 2141


🏁 Script executed:

# Check the package.json file for context
cat -n packages/core/package.json | sed -n '55,75p'

Repository: rohitg00/skillkit

Length of output: 669


🏁 Script executed:

# Get the full error handling in vector-store.ts
cat -n packages/core/src/search/vector-store.ts | grep -A 30 "sqlite-vec.*optional"

Repository: rohitg00/skillkit

Length of output: 1235


🌐 Web query:

better-sqlite3 Node.js version requirements engines compatibility

💡 Result:

  • Current better-sqlite3 (v12.4.6) Node.js engine constraint (npm engines): node: "20.x || 22.x || 23.x || 24.x || 25.x" — notably Node 21.x is not included. [1]
  • Project docs/README minimum: states “Requires Node.js v14.21.1 or later”, and notes prebuilt binaries are available for LTS versions (but the actual install-time compatibility is governed by the published version’s package.json / native ABI). [2]
  • Older major line (example: v11.9.0): its published package.json on unpkg shows no engines field, so npm won’t enforce a Node version via engines for that release line (though it still may or may not build/run on your Node due to native ABI). [3]

Sources:
[1] better-sqlite3@12.4.6 package.json (engines) (app.unpkg.com)
[2] better-sqlite3 README / install notes (github.com)
[3] better-sqlite3@11.9.0 package.json (no engines) (app.unpkg.com)

Citations:


🌐 Web query:

sqlite-vec Node.js version requirements engines

💡 Result:

The sqlite-vec npm distribution does not declare a Node.js version requirement via package.json#engines (at least in the published 0.1.7-alpha.2 artifacts). For example, the platform packages’ package.json files contain no engines field—only metadata like os/cpu and entrypoints. [1], [2]

Practically, that means npm won’t enforce a minimum Node version for sqlite-vec itself; your effective Node version requirement will come from whichever SQLite driver you use to load the extension (e.g., better-sqlite3 / sqlite3). The upstream project documents installing via npm install sqlite-vec for Node.js. [3]

Sources: [1], [2], [3]


Tighten node-llama-cpp version constraint and ensure consistent fallback behavior.

The node-llama-cpp range ^3.0.0 can resolve to versions 3.7.0+ which require Node >=20.0.0, but the package declares engines Node >=18.0.0. This creates an incompatibility: installation will fail on Node 18.x.

Additionally, fallback behavior is inconsistent—better-sqlite3 and sqlite-vec gracefully degrade when missing, but node-llama-cpp throws an error in expansion.ts and embeddings.ts rather than following the same fallback pattern used in vector-store.ts.

Either:

  • Narrow node-llama-cpp to ^3.0.0 <3.7.0 to maintain Node 18 compatibility, or
  • Update the engines field to >=20.0.0 and document the breaking change.
🤖 Prompt for AI Agents
In `@packages/core/package.json` around lines 60 - 67, Tighten the node-llama-cpp
dependency and align fallback behavior: change the node-llama-cpp semver in
package.json to restrict upgrades (e.g., ^3.0.0 <3.7.0) unless you intend to
raise the engines field to ">=20.0.0" and document that breaking change; then
update expansion.ts and embeddings.ts to mirror the safe import/fallback pattern
used in vector-store.ts (i.e., attempt to require/import node-llama-cpp inside a
try/catch or guarded lazy loader, set a null/undefined client when unavailable,
and throw only when actually calling functionality that requires it) so missing
node-llama-cpp degrades gracefully like better-sqlite3 and sqlite-vec.
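The guarded lazy-loader pattern described above can be sketched as follows. The module name comes from the review; the loader shape itself is an illustration, not the repo's actual implementation, and the runtime specifier keeps TypeScript/bundlers from requiring the optional package at build time:

```typescript
type LlamaModule = Record<string, unknown>;

let llamaPromise: Promise<LlamaModule | null> | null = null;

// Resolve node-llama-cpp at most once; missing optional dep yields null.
function loadLlama(): Promise<LlamaModule | null> {
  if (!llamaPromise) {
    const specifier = 'node-llama-cpp'; // runtime specifier, not statically resolved
    llamaPromise = import(specifier)
      .then((mod) => mod as LlamaModule)
      .catch(() => null); // degrade gracefully, like better-sqlite3/sqlite-vec
  }
  return llamaPromise;
}

// Throw only at the point of use, mirroring vector-store.ts's fallback style.
async function requireLlama(): Promise<LlamaModule> {
  const mod = await loadLlama();
  if (mod === null) {
    throw new Error('node-llama-cpp is not installed; embedding/expansion features are unavailable');
  }
  return mod;
}
```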

Comment on lines +76 to +82
it('should handle case-insensitive caching', async () => {
const result1 = await expander.expand('Auth');
const result2 = await expander.expand('AUTH');

expect(result1.original).toBe('Auth');
expect(result2.original).toBe('AUTH');
});

⚠️ Potential issue | 🟡 Minor

Case-insensitive caching test doesn't verify caching behavior correctly.

The test asserts that each result preserves its own input casing, which actually demonstrates that the two calls produced different ExpandedQuery objects for the same normalized key (both "Auth" and "AUTH" normalize to "auth"). Per the implementation in expansion.ts, getFromCache uses query.toLowerCase() as the key, so the second call should return the cached ExpandedQuery from the first call.

If the intent is to verify case-insensitive caching works, the assertion should verify both calls return the same cached object:

💡 Suggested fix
   it('should handle case-insensitive caching', async () => {
     const result1 = await expander.expand('Auth');
     const result2 = await expander.expand('AUTH');

-    expect(result1.original).toBe('Auth');
-    expect(result2.original).toBe('AUTH');
+    // Cache key is normalized to lowercase, so both should return the same cached result
+    expect(result1).toBe(result2);
+    expect(result1.original).toBe('Auth'); // Original case preserved from first call
   });
🤖 Prompt for AI Agents
In `@packages/core/src/search/__tests__/expansion.test.ts` around lines 76 - 82,
The test currently asserts result1.original !== result2.original but should
assert they are the same to confirm case-insensitive caching; update the test in
expansion.test.ts to call expander.expand('Auth') then expander.expand('AUTH')
and assert the second call returns the same cached ExpandedQuery (same object or
same .original value) as the first—this verifies getFromCache (which uses
query.toLowerCase()) correctly returns the cached entry from expansion.ts.

Comment on lines +159 to +169
for (const term of terms) {
const termSynonyms = synonyms[term];
if (termSynonyms) {
for (const syn of termSynonyms.slice(0, 2)) {
const variation = query.replace(new RegExp(term, 'i'), syn);
if (variation !== query && !variations.includes(variation)) {
variations.push(variation);
}
}
}
}

⚠️ Potential issue | 🟡 Minor

Potential ReDoS vulnerability with user-controlled regex pattern.

The term variable comes from user input (query.split()), which is used directly in new RegExp(term, 'i'). Input containing regex metacharacters can throw a SyntaxError, match unintended text, or in the worst case trigger catastrophic backtracking.

🛡️ Proposed fix to escape regex special characters
+function escapeRegExp(str: string): string {
+  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+}
+
 private expandWithoutLLM(query: string): ExpandedQuery {
   // ... existing code ...
   for (const term of terms) {
     const termSynonyms = synonyms[term];
     if (termSynonyms) {
       for (const syn of termSynonyms.slice(0, 2)) {
-        const variation = query.replace(new RegExp(term, 'i'), syn);
+        const variation = query.replace(new RegExp(escapeRegExp(term), 'i'), syn);
         if (variation !== query && !variations.includes(variation)) {
           variations.push(variation);
         }
       }
     }
   }
🧰 Tools
🪛 ast-grep (0.40.5)

[warning] 162-162: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(term, 'i')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🤖 Prompt for AI Agents
In `@packages/core/src/search/expansion.ts` around lines 159 - 169, The loop that
builds variations uses new RegExp(term, 'i') with user-controlled term
(variables: term, terms, query, synonyms, variations), creating a ReDoS risk;
fix by escaping regex metacharacters in term before constructing the RegExp
(implement/inline an escapeRegExp function that replaces characters like . * + ?
^ $ { } ( ) | [ ] \ / with backslash-escaped versions) and then use new
RegExp(escapedTerm, 'i') when creating the variation; ensure the rest of the
logic (slice(0,2), variation !== query, !variations.includes) remains unchanged.

Comment on lines 145 to 191
private async loadEmbeddingsFromDb(): Promise<void> {
if (!this.db) return;

const db = this.db as {
prepare(sql: string): { all(): Array<{
skill_name: string;
text_content: string;
vector?: Buffer;
generated_at: string;
}> };
};

try {
if (this.usingSqliteVec) {
const rows = db.prepare(`
SELECT skill_name, text_content, generated_at
FROM ${this.config.tableName}_meta
`).all();

for (const row of rows) {
this.embeddings.set(row.skill_name, {
skillName: row.skill_name,
textContent: row.text_content,
vector: new Float32Array(this.config.dimensions),
generatedAt: row.generated_at,
});
}
} else {
const rows = db.prepare(`
SELECT skill_name, text_content, vector, generated_at
FROM ${this.config.tableName}
`).all();

for (const row of rows) {
if (row.vector) {
this.embeddings.set(row.skill_name, {
skillName: row.skill_name,
textContent: row.text_content,
vector: new Float32Array(row.vector.buffer),
generatedAt: row.generated_at,
});
}
}
}
} catch {
}
}

⚠️ Potential issue | 🟡 Minor

Silent catch swallows all errors during DB load.

Line 189's empty catch block silently ignores any errors when loading embeddings from the database. This could mask legitimate issues like corrupted data or permission errors, making debugging difficult.

🐛 Proposed fix: log or surface errors
     } catch (error) {
+      // Log the error but don't fail initialization - allows fallback to empty state
+      console.warn('Failed to load embeddings from database:', error);
     }
🤖 Prompt for AI Agents
In `@packages/core/src/search/vector-store.ts` around lines 145 - 191, The empty
catch in loadEmbeddingsFromDb silently swallows DB errors; update the catch to
surface or log the error (e.g., via this.logger.error or rethrow) so failures
when reading from this.db (used with usingSqliteVec and the prepared queries on
this.config.tableName/_meta) are visible; ensure the error message includes
context (function name, tableName, skill_name or row context) and preserve
existing behavior for successful rows while not masking exceptions thrown during
the db.prepare/.all or vector conversion.

Comment on lines +164 to +171
for (const row of rows) {
this.embeddings.set(row.skill_name, {
skillName: row.skill_name,
textContent: row.text_content,
vector: new Float32Array(this.config.dimensions),
generatedAt: row.generated_at,
});
}

⚠️ Potential issue | 🟠 Major

Loaded embeddings have zeroed vectors in sqlite-vec mode.

When loading from the sqlite-vec backend (Lines 164-171), vectors are initialized as new Float32Array(this.config.dimensions) (all zeros) rather than loaded from the _vec table. This means subsequent in-memory searches would use zero vectors, producing incorrect similarity results.

🐛 Proposed fix: load vectors from vec table
       if (this.usingSqliteVec) {
+        const vecDb = this.db as {
+          prepare(sql: string): { get(...args: unknown[]): { embedding: Buffer } | undefined };
+        };
         const rows = db.prepare(`
           SELECT skill_name, text_content, generated_at
           FROM ${this.config.tableName}_meta
         `).all();

         for (const row of rows) {
+          const vecRow = vecDb.prepare(`
+            SELECT embedding FROM ${this.config.tableName}_vec WHERE skill_name = ?
+          `).get(row.skill_name);
+          const vector = vecRow?.embedding
+            ? new Float32Array(vecRow.embedding.buffer)
+            : new Float32Array(this.config.dimensions);
           this.embeddings.set(row.skill_name, {
             skillName: row.skill_name,
             textContent: row.text_content,
-            vector: new Float32Array(this.config.dimensions),
+            vector,
             generatedAt: row.generated_at,
           });
         }
🤖 Prompt for AI Agents
In `@packages/core/src/search/vector-store.ts` around lines 164 - 171, The loop
that populates this.embeddings currently sets each entry's vector to new
Float32Array(this.config.dimensions) (all zeros) instead of reading the actual
vector bytes from the sqlite-vec storage; update the loader in vector-store.ts
(the block iterating over rows and setting this.embeddings) to fetch the
corresponding vector data from the _vec table (or the query result that contains
the vector column), decode/convert it into a Float32Array of length
this.config.dimensions, and assign that to the vector field for each entry
(preserve skillName, textContent, generatedAt); ensure you reference the same
key (row.skill_name) when joining or looking up the _vec data so in-memory
searches use the real vectors rather than zeroed arrays.

Comment on lines +337 to +341
return rows.map((row) => ({
skillName: row.skill_name,
similarity: 1 - row.distance,
}));
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

sqlite-vec vec0 distance metric cosine euclidean

💡 Result:

In sqlite-vec’s vec0 virtual table, the distance metric is chosen per vector column via the column option distance_metric=....

Example (cosine instead of default L2):

create virtual table vec_documents using vec0(
  document_id integer primary key,
  embedding float[768] distance_metric=cosine
);

select document_id, distance
from vec_documents
where embedding match :query
  and k = 10;

(alexgarcia.xyz)

If you’re computing distances outside vec0, sqlite-vec also exposes scalar functions such as vec_distance_l2(a,b) and vec_distance_cosine(a,b). (stephencollins.tech)

Citations:


🏁 Script executed:

rg -i "vec0|distance_metric" --type ts --type js --type sql -A 3 -B 3

Repository: rohitg00/skillkit

Length of output: 1047


🏁 Script executed:

rg -A 10 "\.distance" packages/core/src/search/vector-store.ts

Repository: rohitg00/skillkit

Length of output: 320


🏁 Script executed:

rg -B 15 "similarity: 1 - row.distance" packages/core/src/search/vector-store.ts

Repository: rohitg00/skillkit

Length of output: 515


Use cosine distance instead of default L2, or normalize similarity calculation.

Line 339 computes similarity: 1 - row.distance, which assumes distance is normalized to [0, 1]. However, the vec0 table is created without specifying distance_metric, defaulting to L2 (Euclidean) distance—which is unbounded. This causes similarity to become negative for distant vectors, breaking the expected [0, 1] range. Configure distance_metric=cosine in the vec0 table creation, or adjust the similarity formula to handle unbounded distances.

🤖 Prompt for AI Agents
In `@packages/core/src/search/vector-store.ts` around lines 337 - 341, The
similarity calculation currently uses "similarity: 1 - row.distance" (in the
rows.map mapping) but the vec0 table was created without a distance_metric so it
defaults to L2 (unbounded), causing negative similarities; either change the
vec0 table creation to use distance_metric='cosine' so row.distance is a cosine
distance in [0,1], or keep the L2 metric and replace the mapping logic with a
proper normalization (e.g., convert cosine similarity from dot product or
clamp/scale distances into [0,1]) so the result of the rows.map (where
similarity is computed) always yields a 0..1 similarity; update the vec0 table
creation code or the rows.map similarity expression accordingly.
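If changing the DDL is undesirable, and the stored embeddings are unit-normalized (nomic embeddings usually are, but that is an assumption worth verifying), the L2 distance vec0 returns can be converted to a bounded cosine similarity, since for unit vectors d² = 2(1 − cos θ):

```typescript
// For unit-length vectors, squared L2 distance relates to cosine similarity
// by d^2 = 2 * (1 - cosSim). Invert that and clamp into [0, 1].
function l2ToCosineSimilarity(distance: number): number {
  const cosSim = 1 - (distance * distance) / 2;
  return Math.max(0, Math.min(1, cosSim));
}
```

The mapping could then use `similarity: l2ToCosineSimilarity(row.distance)` instead of `1 - row.distance`.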

- Add missing beforeEach import in embeddings.test.ts
- Update optional dependencies to current versions
- Fix weights/variations length mismatch in expansion.ts
- Avoid mutating input arrays in rrf.ts mergeRankings
- Delete orphaned chunk vectors in vector-store.ts delete() and clear()
- Add comment for empty catch block in vector-store.ts
- Make hybridSearch helper use buildIndex for true hybrid search
- Handle graceful degradation when embeddings unavailable
- Add node-llama-cpp mock in hybrid.test.ts

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/core/src/search/hybrid.ts`:
- Around line 367-384: The hybridSearch helper can leak resources if
pipeline.search throws; wrap the sequence that uses the pipeline in a
try/finally so pipeline.dispose() always runs: call createHybridSearchPipeline()
and await pipeline.buildIndex(...) before entering a try block that performs
await pipeline.search({ query, ...options }) and returns the response, and call
await pipeline.dispose() in the finally block to ensure disposal even on errors
(references: hybridSearch, createHybridSearchPipeline, pipeline.buildIndex,
pipeline.search, pipeline.dispose).
🧹 Nitpick comments (9)
packages/core/src/search/expansion.ts (3)

10-10: Module-level cache is shared across all QueryExpander instances.

The expansionCache is declared at module scope (Line 10), meaning all instances of QueryExpander share the same cache, even if they have different cacheTtlMs configurations. This could lead to unexpected behavior when multiple pipelines are used with different TTL settings.

♻️ Proposed fix: move cache to instance scope
-const expansionCache = new Map<string, CacheEntry>();
-
 export class QueryExpander {
   private modelManager: LocalModelManager;
   private model: unknown = null;
   private context: unknown = null;
   private initialized = false;
   private cacheTtlMs: number;
+  private cache: Map<string, CacheEntry> = new Map();

   // Update all cache references from expansionCache to this.cache

237-240: Accessing private method via bracket notation bypasses encapsulation.

expandQuerySimple accesses the private method expandWithoutLLM using bracket notation (expander['expandWithoutLLM']). This is a code smell that circumvents TypeScript's access modifiers. Consider either making the method public or extracting the synonym-based expansion logic into a standalone exported function.

♻️ Proposed fix: extract standalone function
+// Standalone synonym-based expansion (no LLM required)
+function expandWithSynonyms(query: string): ExpandedQuery {
+  const variations: string[] = [];
+  const queryLower = query.toLowerCase();
+  // ... synonym logic from expandWithoutLLM ...
+  return { original: query, variations: limitedVariations, weights };
+}
+
 export function expandQuerySimple(query: string): ExpandedQuery {
-  const expander = new QueryExpander();
-  return expander['expandWithoutLLM'](query);
+  return expandWithSynonyms(query);
 }

194-207: Cache eviction removes oldest inserted key, not oldest by timestamp.

Map.keys().next().value returns the first key by insertion order, but entries can have vastly different timestamps due to TTL-based expiration checks. This means a recently accessed (and still valid) entry could be evicted while stale entries remain.

♻️ Proposed fix: evict by oldest timestamp
   private addToCache(query: string, expanded: ExpandedQuery): void {
     const maxCacheSize = 100;
     if (expansionCache.size >= maxCacheSize) {
-      const oldestKey = expansionCache.keys().next().value;
-      if (oldestKey) {
-        expansionCache.delete(oldestKey);
+      let oldestKey: string | null = null;
+      let oldestTime = Infinity;
+      for (const [key, entry] of expansionCache) {
+        if (entry.timestamp < oldestTime) {
+          oldestTime = entry.timestamp;
+          oldestKey = key;
+        }
+      }
+      if (oldestKey !== null) {
+        expansionCache.delete(oldestKey);
       }
     }
packages/core/src/search/__tests__/hybrid.test.ts (2)

100-104: afterEach may call dispose() on undefined pipeline if beforeEach fails.

If the dynamic import or pipeline creation throws in beforeEach, pipeline will be undefined when afterEach runs. The conditional check (if (pipeline)) handles this, but optional chaining is more concise.

♻️ Proposed improvement
   afterEach(async () => {
-    if (pipeline) {
-      await pipeline.dispose();
-    }
+    await pipeline?.dispose();
   });

191-201: Test assertion uses if guard, making it non-deterministic.

The test at Lines 191-201 wraps assertions in if (response.query.expanded), which means if expansion fails silently, the test passes without verifying the expected behavior. Consider using expect(...).toBeDefined() followed by unconditional assertions, or use a non-null assertion after the check.

♻️ Proposed fix: make assertion unconditional
     it('should include expanded terms in response', async () => {
       const response = await pipeline.search({
         query: 'auth',
         enableExpansion: true,
       });

-      if (response.query.expanded) {
-        expect(response.query.expanded.original).toBe('auth');
-        expect(response.query.expanded.weights[0]).toBe(2.0);
-      }
+      expect(response.query.expanded).toBeDefined();
+      expect(response.query.expanded!.original).toBe('auth');
+      expect(response.query.expanded!.weights[0]).toBe(2.0);
     });
packages/core/src/search/hybrid.ts (2)

142-143: Empty catch block silently ignores embedding errors.

Lines 142-143 catch and ignore all errors during query embedding. While graceful degradation is good, consider logging a debug message to aid troubleshooting when embeddings unexpectedly fail.

♻️ Proposed fix: add debug logging
     try {
       if (this.embeddingService.isInitialized()) {
         queryVector = await this.embeddingService.embed(query);
       }
     } catch {
+      // Embedding failed - will proceed with keyword-only search
     }

303-327: Reranker always adds 0.3 base score, inflating all scores.

The rerank() method adds an unconditional + 0.3 to baseScore (Line 322), meaning even skills with zero matches get a score of 0.3. Combined with Math.min(1, ...), a skill matching all three criteria (description + tags + name) gets 0.5 + 0.3 + 0.2 + 0.3 = 1.3 → clamped to 1.0, while a partial match might get 0.5 + 0.3 = 0.8. This reduces score differentiation.

♻️ Proposed improvement: remove unconditional boost or make it configurable
       const baseScore =
         (input.skillDescription.toLowerCase().includes(input.query.toLowerCase())
           ? 0.5
           : 0) +
         (input.skillTags.some((t) =>
           t.toLowerCase().includes(input.query.toLowerCase())
         )
           ? 0.3
           : 0) +
         (input.skillName.toLowerCase().includes(input.query.toLowerCase())
           ? 0.2
           : 0);

       outputs.push({
         skillName: input.skillName,
-        score: Math.min(1, baseScore + 0.3),
+        score: Math.min(1, baseScore),
       });
packages/core/src/search/vector-store.ts (2)

456-480: getStats() returns current timestamp as lastIndexedAt, not actual last index time.

Line 476 always returns new Date().toISOString() as lastIndexedAt, regardless of when indexing actually occurred. This could be misleading for monitoring/debugging purposes.

♻️ Proposed fix: track actual last indexed time
 export class VectorStore {
   // ... existing fields ...
+  private lastIndexedAt: string | null = null;
   
   async store(embedding: SkillEmbedding): Promise<void> {
     // ... existing code ...
+    this.lastIndexedAt = new Date().toISOString();
   }

   getStats(): EmbeddingServiceStats {
     // ...
     return {
       totalSkillsIndexed: this.embeddings.size,
       totalChunks,
       indexSizeBytes,
-      lastIndexedAt: new Date().toISOString(),
+      lastIndexedAt: this.lastIndexedAt ?? new Date().toISOString(),
       modelName: 'nomic-embed-text-v1.5',
       embeddingDimensions: this.config.dimensions,
     };
   }

477-477: Hardcoded model name in getStats().

modelName is hardcoded to 'nomic-embed-text-v1.5' but the actual model used depends on EmbeddingService configuration. Consider retrieving this from the embedding service or making it configurable.


@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 2 new potential issues.

View issues and 8 additional flags in Devin Review.


vectorSimilarity: vectorInput?.results.find((r) => r.skillName === skillName)?.score,
keywordScore: keywordInput?.results.find((r) => r.skillName === skillName)?.score,
rrfScore: rrfRankings.find((r) => r.skillName === skillName)?.rrfScore,
matchedTerms: [],


🟡 matchedTerms always empty in HybridSearchResult

The matchedTerms field is always set to an empty array [] in hybrid search results, meaning users never see which search terms matched their query.

Click to expand

How the bug occurs

In hybrid.ts at line 285, the HybridSearchResult is constructed with:

matchedTerms: [],

The keyword search via RecommendationEngine.search() at line 171-175 does compute matched terms, but this information is never extracted and passed through to the final results.

Impact

The CLI code at recommend.ts:553-556 checks result.matchedTerms.length > 0 to display matched terms, but this condition will always be false for hybrid search results:

if (result.matchedTerms.length > 0) {
  console.log(`      ${colors.dim('Matched:')} ${result.matchedTerms.join(', ')}`);
}

Users will never see what terms matched their query, reducing the usefulness of verbose output.

Recommendation: Extract matchedTerms from the keyword search results. For each skill in the final results, look up the corresponding keyword search result and use its matchedTerms array.
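A sketch of that recommendation, assuming the keyword results expose a matchedTerms array per skill (the KeywordResult shape here is illustrative, not the repo's actual type):

```typescript
interface KeywordResult {
  skillName: string;
  matchedTerms?: string[]; // assumed shape of the keyword search output
}

// Build the lookup once instead of calling .find() per result row.
function buildMatchedTermsLookup(results: KeywordResult[]): Map<string, string[]> {
  const byName = new Map<string, string[]>();
  for (const r of results) {
    byName.set(r.skillName, r.matchedTerms ?? []);
  }
  return byName;
}
```

When assembling each HybridSearchResult, `matchedTerms: lookup.get(skillName) ?? []` would then replace the hardcoded empty array.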



hybridResults.push({
skill,
relevance: Math.round(hybridScore * 100),


🟡 Relevance scores are extremely low (1-3%) without reranking due to raw RRF scores

When reranking is disabled, the relevance percentage shown to users is computed from raw RRF scores, resulting in misleadingly low values like 1-3% even for the best matches.

Click to expand

How the bug occurs

At line 207-210, raw RRF scores are stored in finalScores:

let finalScores = new Map<string, number>();
for (const ranking of rrfRankings) {
  finalScores.set(ranking.skillName, ranking.rrfScore);
}

At line 280, this score is multiplied by 100 to get a percentage:

relevance: Math.round(hybridScore * 100),

The RRF formula is sum(1/(k+rank)) with k=60 by default. For a top-ranked result (rank=1) appearing in both vector and keyword rankers:

  • RRF score = 2 × (1/61) ≈ 0.0328
  • relevance = Math.round(0.0328 × 100) = 3%

Actual vs Expected

  • Actual: Top results show 1-3% relevance
  • Expected: Top results should show higher relevance scores (70-100%) to indicate good matches

Impact

Users will see confusingly low relevance percentages that don't reflect actual match quality. A perfect match might show as "3%" which suggests poor relevance.

Note: When reranking IS enabled, the scores are normalized (lines 235-248), partially mitigating this issue.

Recommendation: Normalize the RRF scores before converting to percentages. Either normalize to 0-1 range by dividing by the maximum RRF score, or use a different scoring approach that produces intuitive percentages.
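The max-normalization option can be sketched in a few lines (RrfRanking mirrors the shape described above; names are illustrative):

```typescript
interface RrfRanking {
  skillName: string;
  rrfScore: number;
}

// Scale raw RRF scores so the best match maps to 1.0; with k = 60 the raw
// scores are tiny (about 1/61 per ranker), which is why unscaled
// percentages read as 1-3%.
function normalizeRrfScores(rankings: RrfRanking[]): Map<string, number> {
  const max = rankings.reduce((m, r) => Math.max(m, r.rrfScore), 0);
  const out = new Map<string, number>();
  for (const r of rankings) {
    out.set(r.skillName, max > 0 ? r.rrfScore / max : 0);
  }
  return out;
}
```

relevance would then become `Math.round((normalized.get(skillName) ?? 0) * 100)`, so a top-ranked skill reads as 100% instead of 3%.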


- Fix gemma GGUF URL to point to valid community repo (bartowski)
- Add SQL injection protection for table names in vector-store.ts
- Fix resource leak with try/finally in hybridSearch helper
- Add backpressure handling for large model downloads
- Clamp startLine to non-negative in chunk calculation
- Add runtime validation warning for --expand/--rerank without --hybrid
- Update node-llama-cpp API usage (createEmbeddingContext, createContext)
- Update pnpm-lock.yaml for new optional dependency versions
CI doesn't install optional dependencies, so @ts-ignore is needed
instead of @ts-expect-error to suppress module not found errors.
@rohitg00 rohitg00 merged commit 109f00c into main Feb 2, 2026
10 checks passed
@rohitg00 rohitg00 deleted the feat/hybrid-search branch February 2, 2026 23:28