Move @huggingface/transformers embedding inference to a child process via child_process.fork() to avoid SIGTRAP crashes from onnxruntime-node conflicting with Electron's V8. Add chunking, vector store, hybrid search, per-paper indexing status indicators, serial embedding queue, auto-indexing on save, and pre-release workflow support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
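The serial embedding queue mentioned above can be pictured as a promise chain in which each request starts only after the previous one has settled. The following is a minimal sketch under that assumption, not the code from this PR; `SerialQueue` and `enqueue` are hypothetical names.

```typescript
// Hypothetical helper (not from the PR): runs async tasks one at a time,
// in the order they were submitted.
class SerialQueue {
  // Tail of the chain. It never rejects, so a failed task cannot block later ones.
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Start this task only after everything queued before it has settled.
    const result = this.tail.then(() => task());
    // Swallow the error on the internal chain only; the caller still sees it via `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

With a single worker process, a queue of this shape keeps at most one embedding request in flight, while each caller still receives its own result or error.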
Greptile Summary
This PR implements semantic search using embeddings generated in a child process to avoid SIGTRAP crashes. The implementation is well-architected with proper separation of concerns (worker process, service layer, vector store, hybrid search). Tests comprehensively cover the event-driven UI state machine per CLAUDE.md requirements.
Major changes:
Issues found:
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User
participant UI as LibraryList
participant Store as paperStore
participant IPC as ipcHandlers
participant IndexSvc as indexing-service
participant EmbedSvc as embedding-service
participant Worker as embedding-worker<br/>(child process)
participant VectorDB as vector-store
participant Search as hybrid-search
Note over User,Search: Paper Save Flow
User->>Store: savePaper()
Store->>IPC: savePaper()
IPC->>IPC: save-paper.ts
IPC-->>Store: success
IPC->>IndexSvc: indexAllPapers() (fire-and-forget)
Note over IndexSvc,Worker: Indexing Flow
IndexSvc->>VectorDB: getPapersNeedingEmbedding()
VectorDB-->>IndexSvc: [paper1, paper2, ...]
loop For each paper
IndexSvc->>UI: emit INDEXING_PROGRESS (status: indexing)
UI->>UI: setActivelyIndexingPaperId
IndexSvc->>EmbedSvc: embedDocumentTexts()
EmbedSvc->>Worker: send embedDocuments message
Worker->>Worker: ONNX inference (batched)
Worker-->>EmbedSvc: embeddings result
EmbedSvc-->>IndexSvc: Float32Array[]
IndexSvc->>VectorDB: insertChunkWithEmbedding()
IndexSvc->>VectorDB: setEmbeddingStatus(complete)
IndexSvc->>UI: emit INDEXING_PROGRESS (status: indexed)
UI->>UI: loadPapers() (refresh DB status)
end
IndexSvc->>UI: emit INDEXING_PROGRESS (status: complete)
UI->>UI: clear indexing state
Note over User,Search: Semantic Search Flow
User->>UI: search query
UI->>IPC: searchLibrary(query)
IPC->>EmbedSvc: embedQuery(query)
EmbedSvc->>Worker: send embedQuery message
Worker-->>EmbedSvc: query embedding
IPC->>Search: hybridSearch(query, embedding)
Search->>VectorDB: vectorSearch() (cosine similarity)
Search->>Search: FTS5 keyword search
Search->>Search: RRF score fusion
Search-->>IPC: ranked results
IPC-->>UI: papers with match type
UI->>User: display results
Last reviewed commit: 5569932
| function getNodeBinaryPath(): string { | ||
| const candidates = ['/opt/homebrew/bin/node', '/usr/local/bin/node', '/usr/bin/node']; | ||
| try { | ||
| const nodePath = execSync('which node', { encoding: 'utf-8' }).trim(); | ||
| if (nodePath) return nodePath; | ||
| } catch { | ||
| // which failed, try known paths | ||
| } | ||
| for (const candidate of candidates) { | ||
| try { | ||
| execSync(`${candidate} --version`, { encoding: 'utf-8' }); | ||
| return candidate; | ||
| } catch { | ||
| // not found, try next | ||
| } | ||
| } | ||
| throw new Error('Could not find system Node.js binary. Install Node.js to enable semantic search.'); | ||
| } |
which command doesn't exist on Windows
| function getNodeBinaryPath(): string { | |
| const candidates = ['/opt/homebrew/bin/node', '/usr/local/bin/node', '/usr/bin/node']; | |
| try { | |
| const nodePath = execSync('which node', { encoding: 'utf-8' }).trim(); | |
| if (nodePath) return nodePath; | |
| } catch { | |
| // which failed, try known paths | |
| } | |
| for (const candidate of candidates) { | |
| try { | |
| execSync(`${candidate} --version`, { encoding: 'utf-8' }); | |
| return candidate; | |
| } catch { | |
| // not found, try next | |
| } | |
| } | |
| throw new Error('Could not find system Node.js binary. Install Node.js to enable semantic search.'); | |
| } | |
| function getNodeBinaryPath(): string { | |
| const candidates = process.platform === 'win32' | |
| ? ['C:\\Program Files\\nodejs\\node.exe', 'C:\\Program Files (x86)\\nodejs\\node.exe'] | |
| : ['/opt/homebrew/bin/node', '/usr/local/bin/node', '/usr/bin/node']; | |
| try { | |
| const whichCmd = process.platform === 'win32' ? 'where node' : 'which node'; | |
| const nodePath = execSync(whichCmd, { encoding: 'utf-8' }).trim().split(/\r?\n/)[0]; | |
| if (nodePath) return nodePath; | |
| } catch { | |
| // which/where failed, try known paths | |
| } | |
| for (const candidate of candidates) { | |
| try { | |
| execSync(`"${candidate}" --version`, { encoding: 'utf-8' }); | |
| return candidate; | |
| } catch { | |
| // not found, try next | |
| } | |
| } | |
| throw new Error('Could not find system Node.js binary. Install Node.js to enable semantic search.'); | |
| } |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile Summary
This PR adds semantic search capabilities to PaperShelf by moving ONNX embedding inference to a child process to avoid SIGTRAP crashes in Electron. The implementation includes text chunking, vector storage with cosine similarity search, and hybrid search combining FTS5 with vector search using Reciprocal Rank Fusion.
Key Changes
Issues Found
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User
participant Renderer
participant Main
participant IndexingService
participant EmbeddingService
participant Worker as embedding-worker.ts
participant VectorStore
participant HybridSearch
User->>Renderer: Save paper
Renderer->>Main: IPC: papers:save
Main->>VectorStore: insertPaper()
Main->>IndexingService: indexAllPapers()
IndexingService->>VectorStore: getPapersNeedingEmbedding()
VectorStore-->>IndexingService: [paper1, paper2, ...]
loop For each paper
IndexingService->>Main: emit INDEXING_PROGRESS (indexing)
Main->>Renderer: onIndexingProgress
Renderer->>Renderer: Update badge to blue pulsing
IndexingService->>EmbeddingService: embedDocumentTexts([chunks])
EmbeddingService->>EmbeddingService: enqueue (serial)
EmbeddingService->>Worker: fork child process
Worker->>Worker: Load ONNX model
Worker-->>EmbeddingService: model loaded
EmbeddingService->>Worker: embedDocuments
Worker->>Worker: Generate embeddings
Worker-->>EmbeddingService: Float32Array[]
IndexingService->>VectorStore: insertChunkWithEmbedding()
IndexingService->>VectorStore: setEmbeddingStatus(complete)
IndexingService->>Main: emit INDEXING_PROGRESS (indexed)
Main->>Renderer: onIndexingProgress
Renderer->>Renderer: Refresh paper list
end
IndexingService->>Main: emit INDEXING_PROGRESS (complete)
Main->>Renderer: onIndexingProgress
Renderer->>Renderer: Clear active state
User->>Renderer: Search library
Renderer->>Main: IPC: search:semantic
Main->>EmbeddingService: embedQuery(query)
EmbeddingService->>Worker: embedQuery
Worker-->>EmbeddingService: Float32Array
Main->>HybridSearch: hybridSearch(query, embedding)
HybridSearch->>VectorStore: vectorSearch(embedding)
HybridSearch->>VectorStore: FTS5 search
HybridSearch->>HybridSearch: Reciprocal Rank Fusion
HybridSearch-->>Main: Ranked results
Main-->>Renderer: SemanticSearchResult[]
Renderer->>User: Display results
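
The Reciprocal Rank Fusion step at the end of the search flow can be sketched as follows. This illustrates the general technique and is not the PR's hybrid-search code: `reciprocalRankFusion` is a hypothetical name, the inputs are assumed to be paper IDs already ranked best-first by each search, and `k = 60` is the constant commonly used with RRF rather than a value taken from this repository.

```typescript
// Reciprocal Rank Fusion: each list contributes 1 / (k + rank) to an item's score,
// where rank starts at 1 for the top result of that list.
// Hypothetical helper; names and the default k are assumptions, not taken from the PR.
function reciprocalRankFusion(
  rankings: string[][],
  k: number = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Example: fuse a keyword ranking with a vector ranking.
const fused = reciprocalRankFusion([
  ['p1', 'p2', 'p3'], // e.g. keyword (FTS) order
  ['p3', 'p1', 'p4'], // e.g. vector-similarity order
]);
```

Because only ranks are used, the keyword and vector scores never need to be put on a common scale, which is the usual reason for choosing RRF in hybrid search.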
Last reviewed commit: a336734
| const candidates = ['/opt/homebrew/bin/node', '/usr/local/bin/node', '/usr/bin/node']; | ||
| try { | ||
| const nodePath = execSync('which node', { encoding: 'utf-8' }).trim(); |
which doesn't exist on Windows - use where instead or check process.platform
| const nodePath = execSync('which node', { encoding: 'utf-8' }).trim(); | |
| const whichCmd = process.platform === 'win32' ? 'where' : 'which'; | |
| const nodePath = execSync(`${whichCmd} node`, { encoding: 'utf-8' }).trim(); |
- Stub sharp via Module._resolveFilename in embedding worker (sharp is unused for text embeddings but imported at top level by transformers)
- Use require() instead of import() for transformers to get CJS build where the stub can intercept
- Follow-up indexing only picks up pending papers, not failed ones
- Don't flash loading spinner during paper list refreshes
- Only reload paper list on meaningful status changes (indexed/error/complete)
- Add worker integration tests (spawn, sharp stub, IPC protocol)
- Add TDD instruction to CLAUDE.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile Summary
Implements semantic search for papers using ONNX embeddings in a child process to avoid Electron/V8 crashes. Adds hybrid search (FTS + vector with RRF ranking), auto-indexing on save, UI status badges, and pre-release workflow support.
Key Changes
Issues Found
Architecture
Confidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant UI as Renderer (UI)
participant Main as Main Process
participant Indexer as indexing-service
participant EmbedSvc as embedding-service
participant Worker as embedding-worker (child)
participant DB as vector-store
UI->>Main: savePaperFromArxiv()
Main->>DB: insertPaper()
Main->>Indexer: indexAllPapers()
Indexer->>DB: getPapersNeedingEmbedding()
DB-->>Indexer: [paper1, paper2, ...]
loop For each paper
Indexer->>Main: emit INDEXING_PROGRESS (indexing)
Main->>UI: indexing-progress event
Indexer->>DB: chunkPaper()
Indexer->>EmbedSvc: embedDocumentTexts(chunks)
EmbedSvc->>Worker: fork() if not running
EmbedSvc->>Worker: init (load model)
Worker-->>EmbedSvc: loaded
EmbedSvc->>Worker: embedDocuments
Worker->>Worker: ONNX inference (batched)
Worker-->>EmbedSvc: embeddings[]
EmbedSvc-->>Indexer: Float32Array[]
Indexer->>DB: insertChunkWithEmbedding() (transaction)
Indexer->>DB: setEmbeddingStatus('complete')
Indexer->>Main: emit INDEXING_PROGRESS (indexed)
Main->>UI: indexing-progress event
end
Indexer->>Main: emit INDEXING_PROGRESS (complete)
Main->>UI: indexing-progress event
UI->>UI: refresh badge states
Note over UI,Worker: Search Flow
UI->>Main: semanticSearch(query)
Main->>EmbedSvc: embedQuery(query)
EmbedSvc->>Worker: embedQuery
Worker-->>EmbedSvc: embedding
EmbedSvc-->>Main: Float32Array
Main->>DB: hybridSearch(query, embedding)
DB->>DB: FTS + vector search + RRF merge
DB-->>Main: SemanticSearchResult[]
Main-->>UI: results
Last reviewed commit: b65a1ec
| } | ||
| // Advance with overlap | ||
| offset += breakPoint - OVERLAP_CHARS; |
The advance breakPoint - OVERLAP_CHARS is zero or negative whenever breakPoint <= OVERLAP_CHARS, so offset never moves forward and the loop never terminates.
| offset += breakPoint - OVERLAP_CHARS; | |
| offset += Math.max(breakPoint - OVERLAP_CHARS, 1); |
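
To make the failure mode concrete, here is a small self-contained sketch of an overlapping chunker that applies the same Math.max guard. It is not the PR's chunker.ts: `chunkText`, its default sizes, and the break-at-last-space rule are all assumptions made for illustration.

```typescript
// Hypothetical chunker (not the PR's chunker.ts). Splits text into chunks of at most
// `chunkChars` characters, preferring to break at a space, with `overlapChars` of overlap.
function chunkText(text: string, chunkChars: number = 40, overlapChars: number = 10): string[] {
  const chunks: string[] = [];
  let offset = 0;
  while (offset < text.length) {
    const windowText = text.slice(offset, offset + chunkChars);
    if (offset + chunkChars >= text.length) {
      chunks.push(windowText); // final chunk: take whatever is left
      break;
    }
    // Break at the last space in the window, or at the window end if there is none.
    const lastSpace = windowText.lastIndexOf(' ');
    const breakPoint = lastSpace > 0 ? lastSpace : windowText.length;
    chunks.push(windowText.slice(0, breakPoint));
    // Without Math.max, an early space (breakPoint <= overlapChars) would move
    // offset backwards or not at all, and the loop would never end.
    offset += Math.max(breakPoint - overlapChars, 1);
  }
  return chunks;
}

// A space at index 2 gives breakPoint = 2, well below the overlap of 10.
const sampleChunks = chunkText('ab ' + 'x'.repeat(100));
```

The guard trades chunk quality for termination: when it kicks in, the chunker advances one character at a time and can emit very short chunks, so a real implementation may also want a minimum chunk length.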
| const remaining = db | ||
| .prepare( | ||
| `SELECT p.id FROM papers p | ||
| LEFT JOIN embedding_status es ON p.id = es.paper_id | ||
| WHERE es.paper_id IS NULL OR es.status = 'pending'`, | ||
| ) | ||
| .all(); | ||
| if (remaining.length > 0) { | ||
| indexAllPapers().catch((err) => { | ||
| console.warn('Follow-up indexing failed:', err); | ||
| }); | ||
| } |
recursive call not awaited — can spawn infinite chain if papers keep arriving
| const remaining = db | |
| .prepare( | |
| `SELECT p.id FROM papers p | |
| LEFT JOIN embedding_status es ON p.id = es.paper_id | |
| WHERE es.paper_id IS NULL OR es.status = 'pending'`, | |
| ) | |
| .all(); | |
| if (remaining.length > 0) { | |
| indexAllPapers().catch((err) => { | |
| console.warn('Follow-up indexing failed:', err); | |
| }); | |
| } | |
| const remaining = db | |
| .prepare( | |
| `SELECT p.id FROM papers p | |
| LEFT JOIN embedding_status es ON p.id = es.paper_id | |
| WHERE es.paper_id IS NULL OR es.status = 'pending'`, | |
| ) | |
| .all(); | |
| if (remaining.length > 0) { | |
| await indexAllPapers(); | |
| } |
| export function shutdownEmbeddingService(): void { | ||
| if (child?.connected) { | ||
| child.send({ type: 'shutdown' }); |
child may not exist if first send fails — guard before sending shutdown
| export function vectorSearch(db: Database.Database, queryEmbedding: Float32Array, limit: number): VectorSearchResult[] { | ||
| const rows = db | ||
| .prepare('SELECT id, paper_id, chunk_text, chunk_type, embedding FROM paper_chunks WHERE embedding IS NOT NULL') | ||
| .all() as { | ||
| id: string; | ||
| paper_id: string; | ||
| chunk_text: string; | ||
| chunk_type: string; | ||
| embedding: Buffer; | ||
| }[]; | ||
| const scored = rows.map((row) => { | ||
| const embeddingArray = new Float32Array(row.embedding.buffer, row.embedding.byteOffset, EMBEDDING_DIMS); | ||
| const similarity = cosineSimilarity(queryEmbedding, embeddingArray); | ||
| // Convert similarity to distance (lower = more similar) for compatibility | ||
| const distance = 1 - similarity; | ||
| return { | ||
| chunkId: row.id, | ||
| paperId: row.paper_id, | ||
| distance, | ||
| chunkText: row.chunk_text, | ||
| chunkType: row.chunk_type, | ||
| }; | ||
| }); | ||
| scored.sort((a, b) => a.distance - b.distance); | ||
| return scored.slice(0, limit); | ||
| } |
full table scan computing cosine similarity in-memory — will be slow with thousands of chunks (>100 papers with full text)
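
Short of adding a real vector index, one incremental mitigation is to stop sorting every scored row and keep only the best `limit` candidates during the scan. The sketch below assumes embeddings are already decoded into Float32Array values; `topKByCosine` and the row shape are hypothetical, and the scan itself is still linear in the number of chunks.

```typescript
// Cosine similarity of two equal-length vectors; returns 0 if either has zero norm.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

interface ScoredChunk {
  id: string;
  distance: number; // 1 - cosine similarity, lower is closer
}

// Hypothetical helper (not from the PR): keeps a small array of the `limit` closest rows,
// sorted by distance, instead of scoring everything and sorting the full result.
function topKByCosine(
  query: Float32Array,
  rows: Array<{ id: string; embedding: Float32Array }>,
  limit: number,
): ScoredChunk[] {
  if (limit <= 0) return [];
  const best: ScoredChunk[] = [];
  for (const row of rows) {
    const distance = 1 - cosineSimilarity(query, row.embedding);
    // Skip rows that cannot enter a full top-k list.
    if (best.length === limit && distance >= best[best.length - 1].distance) continue;
    // Insert in sorted position, then drop the worst entry if over capacity.
    let i = best.length;
    while (i > 0 && best[i - 1].distance > distance) i--;
    best.splice(i, 0, { id: row.id, distance });
    if (best.length > limit) best.pop();
  }
  return best;
}
```

This removes the O(n log n) sort and the large intermediate array, but every embedding is still read and scored, so it postpones rather than solves the scaling concern raised in the comment.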
| }); | ||
| ipcMain.handle('indexing:reindexAll', () => { | ||
| indexAllPapers().catch((err) => { |
fire-and-forget pattern intentionally used here for background indexing — errors caught in .catch()
| require.cache['sharp-stub'] = { | ||
| id: 'sharp-stub', | ||
| filename: 'sharp-stub', | ||
| loaded: true, | ||
| exports: {}, | ||
| children: [], | ||
| paths: [], | ||
| path: '', | ||
| parent: null, | ||
| isPreloading: false, | ||
| require: require, | ||
| } as unknown as NodeModule; |
stubbed module object missing some NodeModule properties — may break in future Node versions
| const handleClick = useCallback(async () => { | ||
| setIsIndexing(true); | ||
| await window.electronAPI.reindexAllPapers(); |
fire-and-forget pattern used here — indexing progress tracked via onIndexingProgress events
| insertPaper(makePaper({ arxivId: '31', title: 'Needs Index', abstract: 'About something.' })); | ||
| const db = getDb(); | ||
| setEmbeddingStatus(db, paper1.id, 'complete', undefined, 1); |
test sets status to complete with chunk_count: 1 but doesn't insert actual chunks — inconsistent with real behavior
| // Trigger batch indexer (handles progress events and sequential processing) | ||
| indexAllPapers().catch((err) => { | ||
| console.warn('Background indexing failed:', err); | ||
| }); |
triggers on every save — indexAllPapers() guards against concurrent execution with indexingInProgress flag
CI runs vitest without building first, and lacks native onnxruntime binaries. Use describe.skipIf to gracefully skip these tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile Summary
Implements semantic search by moving
Key changes:
Performance consideration: Vector search currently does full table scan with in-memory cosine similarity (line 96 in
Test coverage: Added comprehensive tests for indexing state machine, event-driven UI updates, and worker lifecycle per CLAUDE.md requirements.
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant UI as Renderer
participant Main as Main Process
participant IndexSvc as IndexingService
participant EmbedSvc as EmbeddingService
participant Worker as Child Process
participant DB as SQLite
UI->>Main: savePaper()
Main->>DB: insertPaper()
Main->>IndexSvc: indexAllPapers() [fire-and-forget]
IndexSvc->>DB: getPapersNeedingEmbedding()
DB-->>IndexSvc: [paper1, paper2, ...]
loop For each paper
IndexSvc->>Main: emit INDEXING_PROGRESS (indexing)
Main->>UI: send indexing-progress event
IndexSvc->>EmbedSvc: embedDocumentTexts([chunks])
EmbedSvc->>Worker: send embedDocuments
alt First time
Worker->>Worker: Load ONNX model
Worker->>Main: progress events
Main->>UI: send embedding-progress
end
Worker->>Worker: Run inference (batched)
Worker-->>EmbedSvc: return embeddings
EmbedSvc-->>IndexSvc: Float32Array[]
IndexSvc->>DB: insertChunkWithEmbedding() [transaction]
IndexSvc->>DB: setEmbeddingStatus('complete')
IndexSvc->>Main: emit INDEXING_PROGRESS (indexed)
Main->>UI: send indexing-progress event
end
IndexSvc->>Main: emit INDEXING_PROGRESS (complete)
Main->>UI: send indexing-progress event
UI->>Main: semanticSearch(query)
Main->>EmbedSvc: embedQuery(query)
EmbedSvc->>Worker: send embedQuery
Worker-->>EmbedSvc: embedding
Main->>DB: hybridSearch(query, embedding)
DB->>DB: FTS + vectorSearch + RRF
DB-->>Main: ranked results
Main-->>UI: papers with scores
Last reviewed commit: 107ddbc
| export function shutdownEmbeddingService(): void { | ||
| if (child?.connected) { | ||
| child.send({ type: 'shutdown' }); |
check child?.connected before sending shutdown message
| export function shutdownEmbeddingService(): void { | |
| if (child?.connected) { | |
| child.send({ type: 'shutdown' }); | |
| export function shutdownEmbeddingService(): void { | |
| if (child?.connected) { | |
| child.send({ type: 'shutdown' }); |
| run: | | ||
| if [ "${{ github.event.inputs.prerelease }}" = "true" ]; then | ||
| npx electron-builder --mac --arm64 --publish always -c.publish.releaseType=prerelease | ||
| else | ||
| npx electron-builder --mac --arm64 --publish always | ||
| fi |
verify -c.publish.releaseType=prerelease syntax with electron-builder docs - might need --config.publish.releaseType=prerelease instead
Papers stuck in 'indexing' state (from worker crash mid-indexing) are now reset to 'pending' on app startup so they get retried. Added logging throughout the indexing and embedding pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| const candidates = ['/opt/homebrew/bin/node', '/usr/local/bin/node', '/usr/bin/node']; | ||
| try { | ||
| const nodePath = execSync('which node', { encoding: 'utf-8' }).trim(); |
which doesn't exist on Windows — use where on Windows or check process.platform first
| const nodePath = execSync('which node', { encoding: 'utf-8' }).trim(); | |
| const command = process.platform === 'win32' ? 'where node' : 'which node'; | |
| const nodePath = execSync(command, { encoding: 'utf-8' }).trim(); |
| indexAllPapers().catch((err) => { | ||
| console.warn('[indexing-service] Follow-up indexing failed:', err); | ||
| }); |
recursive call not awaited — can spawn multiple concurrent indexing chains if papers keep arriving, bypassing the indexingInProgress guard
| indexAllPapers().catch((err) => { | |
| console.warn('[indexing-service] Follow-up indexing failed:', err); | |
| }); | |
| await indexAllPapers(); |
| export function vectorSearch(db: Database.Database, queryEmbedding: Float32Array, limit: number): VectorSearchResult[] { | ||
| const rows = db | ||
| .prepare('SELECT id, paper_id, chunk_text, chunk_type, embedding FROM paper_chunks WHERE embedding IS NOT NULL') | ||
| .all() as { | ||
| id: string; | ||
| paper_id: string; | ||
| chunk_text: string; | ||
| chunk_type: string; | ||
| embedding: Buffer; | ||
| }[]; | ||
| const scored = rows.map((row) => { | ||
| const embeddingArray = new Float32Array(row.embedding.buffer, row.embedding.byteOffset, EMBEDDING_DIMS); | ||
| const similarity = cosineSimilarity(queryEmbedding, embeddingArray); | ||
| // Convert similarity to distance (lower = more similar) for compatibility | ||
| const distance = 1 - similarity; | ||
| return { | ||
| chunkId: row.id, | ||
| paperId: row.paper_id, | ||
| distance, | ||
| chunkText: row.chunk_text, | ||
| chunkType: row.chunk_type, | ||
| }; | ||
| }); | ||
| scored.sort((a, b) => a.distance - b.distance); | ||
| return scored.slice(0, limit); |
loads all chunks into memory and computes cosine similarity in JS — will be slow with >100 papers with full text (thousands of chunks)
| run: npx electron-builder --mac --arm64 --publish always | ||
| run: | | ||
| if [ "${{ github.event.inputs.prerelease }}" = "true" ]; then | ||
| npx electron-builder --mac --arm64 --publish always -c.publish.releaseType=prerelease |
verify this flag syntax works — electron-builder docs should confirm if -c.publish.releaseType=prerelease or --config.publish.releaseType=prerelease is correct
* origin/main:
  Bump version to 1.2.1
  Replace modal dialog with notification-based MCP tool approval
  Bump version to 1.2.0
  Add drag-and-drop import, redesign toolbar, SF Symbol icons (#96)
  Bump version to 1.1.5
  Dynamic MCP tool settings without server restart (#95)
  Bump version to 1.1.4
  Use RELEASE_PAT in release workflow (#94)
  Fix broken sticky note icons in PDF viewer (#93)
  Bump version to 1.1.3
  Fix ReleaseNoteInfo import from correct package (#92)

# Conflicts:
#   package-lock.json
#   package.json
#   src/main/mcp/tools/search-tools.ts
…o feat-semantic-search

* 'feat-semantic-search' of github.com:dakl/papershelf:
  Bump version to 2.0.0-beta.3
| } | ||
| // Advance with overlap | ||
| offset += breakPoint - OVERLAP_CHARS; |
If breakPoint <= OVERLAP_CHARS, the advance is zero or negative, so offset never moves forward and the loop never terminates.
| offset += breakPoint - OVERLAP_CHARS; | |
| offset += Math.max(breakPoint - OVERLAP_CHARS, 1); |
| const candidates = ['/opt/homebrew/bin/node', '/usr/local/bin/node', '/usr/bin/node']; | ||
| try { | ||
| const nodePath = execSync('which node', { encoding: 'utf-8' }).trim(); |
which doesn't exist on Windows
| const nodePath = execSync('which node', { encoding: 'utf-8' }).trim(); | |
| const nodePath = execSync(process.platform === 'win32' ? 'where node' : 'which node', { encoding: 'utf-8' }).trim(); |
Summary
- Move @huggingface/transformers embedding inference to a child process (child_process.fork()) to avoid SIGTRAP crashes from onnxruntime-node conflicting with Electron's V8
- chunker.ts), vector store with cosine similarity search, and hybrid search (FTS + semantic)
- prerelease boolean input on the Release workflow, existing clients won't auto-update to beta versions

New files
- src/main/services/embedding-worker.ts: standalone Node.js child process for ONNX inference
- src/main/services/embedding-service.ts: child process manager with serial queue
- src/main/services/indexing-service.ts: batch indexer with progress events
- src/main/services/chunker.ts: paper text chunking (title/abstract + body)
- src/main/db/vector-store.ts: embedding storage, status tracking, cosine similarity search
- src/main/db/hybrid-search.ts: combined FTS + vector search
- src/renderer/components/IndexNewButton.tsx: button to trigger indexing
- src/renderer/__tests__/indexing-status-updates.test.ts: UI state machine tests
- src/main/__tests__/indexing-service.test.ts: backend indexing flow tests
- src/main/__tests__/vector-store.test.ts: vector store + status tests

Test plan
- npm run test)
- PAPERSHELF_DATA_DIR=/tmp/papershelf-dev npm run dev)

🤖 Generated with Claude Code