SqliteStore backend + annotation, audit, and query result cache tools#169
SqliteStore backend + annotation, audit, and query result cache tools#169data-douser wants to merge 2 commits intomainfrom
Conversation
Replace lowdb with sql.js (asm.js build) for zero-dependency SQLite persistence. Bundle inline with esbuild — no native modules, no external deps at runtime. SqliteStore provides three tables: - sessions: session tracking (migrated from lowdb) - annotations: key-value annotation store with categories and metadata - query_result_cache: BQRS/SARIF result caching with subset retrieval New tools (gated by ENABLE_ANNOTATION_TOOLS env var): - annotation_create, annotation_list, annotation_search, annotation_delete - audit_store_findings, audit_list_findings, audit_add_notes, audit_clear_repo - query_results_cache_lookup, query_results_cache_retrieve, query_results_cache_clear, query_results_cache_compare Code refactoring for maintainability: - Extract database-resolver.ts from cli-tool-registry.ts - Extract query-resolver.ts from cli-tool-registry.ts - Extract result-processor.ts from cli-tool-registry.ts - Extract codeql-version.ts from cli-executor.ts Bug fixes: - Fix params.output not propagated to proce- Fix params.output not propagated to proce- Fix params.output not propagated txternal predicate conditions for direct query paths Closes #165
Dependency ReviewThe following issues were found:
Snapshot WarningsEnsure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice. License Issuesserver/package.json
OpenSSF Scorecard
Scanned Files
|
There was a problem hiding this comment.
Pull request overview
This PR migrates the server’s persistence layer from lowdb to a unified sql.js-backed SqliteStore, and introduces opt-in MCP tools for annotations, audit workflows, and query result caching while refactoring CLI-related logic into smaller modules.
Changes:
- Replace lowdb session persistence with
SqliteStore(sessions + annotations + query result cache tables). - Add new opt-in MCP tools:
annotation_*,audit_*, andquery_results_cache_*. - Refactor query/database resolution and query-result processing (interpretation + auto-caching) into dedicated modules.
Reviewed changes
Copilot reviewed 18 out of 20 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| server/test/src/tools/monitoring-tools.test.ts | Updates monitoring-tool tests to account for async store initialization and new config flag. |
| server/test/src/lib/sqlite-store.test.ts | Adds unit tests for SqliteStore sessions, annotations, persistence, and cache behaviors. |
| server/test/src/lib/session-data-manager.test.ts | Updates tests to await async initialization after migrating persistence backend. |
| server/src/types/sql-js.d.ts | Adds minimal typings for sql.js asm.js import + core DB surface. |
| server/src/types/monitoring.ts | Adds enableAnnotationTools config flag (default false). |
| server/src/tools/cache-tools.ts | Introduces query_results_cache_* tools (lookup/retrieve/clear/compare). |
| server/src/tools/audit-tools.ts | Introduces audit_* tools layered on annotations. |
| server/src/tools/annotation-tools.ts | Introduces annotation_* tools for CRUD + search. |
| server/src/lib/sqlite-store.ts | Implements the new unified SQLite persistence backend + cache subset retrieval. |
| server/src/lib/session-data-manager.ts | Migrates session persistence from lowdb to SqliteStore; adds getStore(). |
| server/src/lib/result-processor.ts | Extracts query result interpretation/evaluation and auto-cache pipeline. |
| server/src/lib/query-results-evaluator.ts | Adds query metadata caching with mtime-based invalidation. |
| server/src/lib/query-resolver.ts | Extracts query-path resolution via codeql resolve queries. |
| server/src/lib/database-resolver.ts | Extracts database-path resolution and caches results in-memory. |
| server/src/lib/codeql-version.ts | Adds target/actual CodeQL version tracking and mismatch warning. |
| server/src/lib/cli-tool-registry.ts | Integrates extracted resolvers/result processor and fixes output propagation + predicate handling. |
| server/src/lib/cli-executor.ts | Wires version detection into startup validation; re-exports version helpers. |
| server/src/codeql-development-mcp-server.ts | Registers new annotation/audit/cache tools at startup and initializes session manager. |
| server/package.json | Removes lowdb, adds sql.js. |
| package-lock.json | Updates lockfile to reflect dependency swap (remove lowdb/steno, add sql.js). |
Comments suppressed due to low confidence (2)
server/src/lib/sqlite-store.ts:425
updateAnnotationsetsupdated_at = datetime('now'), which produces a different string format than the ISO timestamps written increateAnnotation. This can cause incorrect ordering when sorting byupdated_at. Consider updating this to use the same format as inserts.
setClauses.push("updated_at = datetime('now')");
const db = this.ensureDb();
db.run(
`UPDATE annotations SET ${setClauses.join(', ')} WHERE id = $id`,
params as Record<string, string | number | null>,
);
server/src/lib/session-data-manager.ts:49
- Docstring says “sql.js WASM init is async”, but this store uses the asm.js build. Please adjust the wording so it’s accurate (async init is still true, but it’s not specifically WASM).
/**
* Initialize the database and ensure it's properly set up.
* Must be awaited before any session operations (sql.js WASM init is async).
*/
| import { describe, it, expect, beforeEach, afterEach } from 'vitest'; | ||
| import { SqliteStore } from '../../../src/lib/sqlite-store'; | ||
| import { existsSync, rmSync } from 'fs'; | ||
|
|
||
| describe('SqliteStore', () => { | ||
| let store: SqliteStore; | ||
| const testDir = '.ql-mcp-sqlite-test'; | ||
|
|
||
| beforeEach(async () => { | ||
| if (existsSync(testDir)) { | ||
| rmSync(testDir, { recursive: true, force: true }); | ||
| } |
There was a problem hiding this comment.
This file uses a fixed directory name (.ql-mcp-sqlite-test) in the repo root for test storage. In this repo’s tests, temporary data should go under the project-local .tmp/ directory via getProjectTmpBase() / createProjectTempDir() helpers to avoid polluting the workspace and reduce the risk of collisions. Consider generating a unique temp dir per test run and cleaning it up.
| import { describe, it, expect, beforeEach, afterEach } from 'vitest'; | |
| import { SqliteStore } from '../../../src/lib/sqlite-store'; | |
| import { existsSync, rmSync } from 'fs'; | |
| describe('SqliteStore', () => { | |
| let store: SqliteStore; | |
| const testDir = '.ql-mcp-sqlite-test'; | |
| beforeEach(async () => { | |
| if (existsSync(testDir)) { | |
| rmSync(testDir, { recursive: true, force: true }); | |
| } | |
| import { existsSync, rmSync } from 'fs'; | |
| import { SqliteStore } from '../../../src/lib/sqlite-store'; | |
| import { createProjectTempDir } from '../../../src/utils/temp-dir'; | |
| import { afterEach, beforeEach, describe, expect, it } from 'vitest'; | |
| describe('SqliteStore', () => { | |
| let store: SqliteStore; | |
| let testDir: string; | |
| beforeEach(async () => { | |
| testDir = createProjectTempDir(); |
| const now = new Date().toISOString(); | ||
| db.run( | ||
| `INSERT INTO annotations (category, entity_key, label, content, metadata, created_at, updated_at) | ||
| VALUES ($category, $entity_key, $label, $content, $metadata, $created_at, $updated_at)`, | ||
| { | ||
| $category: category, | ||
| $entity_key: entityKey, | ||
| $label: label ?? null, | ||
| $content: content ?? null, | ||
| $metadata: metadata ?? null, | ||
| $created_at: now, | ||
| $updated_at: now, |
There was a problem hiding this comment.
createAnnotation writes created_at/updated_at using new Date().toISOString(), but updateAnnotation later sets updated_at = datetime('now'). Mixing timestamp formats can break ORDER BY updated_at sorting because the strings aren’t comparable across formats. Use a single consistent timestamp format for both create and update (either SQLite datetime('now') everywhere or ISO strings everywhere).
This issue also appears on line 419 of the same file.
| const now = new Date().toISOString(); | |
| db.run( | |
| `INSERT INTO annotations (category, entity_key, label, content, metadata, created_at, updated_at) | |
| VALUES ($category, $entity_key, $label, $content, $metadata, $created_at, $updated_at)`, | |
| { | |
| $category: category, | |
| $entity_key: entityKey, | |
| $label: label ?? null, | |
| $content: content ?? null, | |
| $metadata: metadata ?? null, | |
| $created_at: now, | |
| $updated_at: now, | |
| db.run( | |
| `INSERT INTO annotations (category, entity_key, label, content, metadata, created_at, updated_at) | |
| VALUES ($category, $entity_key, $label, $content, $metadata, datetime('now'), datetime('now'))`, | |
| { | |
| $category: category, | |
| $entity_key: entityKey, | |
| $label: label ?? null, | |
| $content: content ?? null, | |
| $metadata: metadata ?? null, |
| flush(): void { | ||
| if (!this.db) return; | ||
| const data = this.db.export(); | ||
| const buffer = Buffer.from(data); | ||
| writeFileSync(this.dbPath, buffer); | ||
| this.dirty = false; | ||
| } |
There was a problem hiding this comment.
flush() exports the entire in-memory sql.js database and writes it to disk synchronously. Since many write operations call flush() immediately, this can become a significant performance bottleneck (and block the event loop) as the DB grows (especially with cached query results). Consider batching/debouncing flushes, flushing only on close, or making flush-on-write configurable.
| * Session Data Management | ||
| * Provides unified JSON storage and session lifecycle management using lowdb | ||
| * Provides session lifecycle management backed by SqliteStore (sql.js WASM) | ||
| */ |
There was a problem hiding this comment.
Header comment says sessions are backed by “sql.js WASM”, but SqliteStore imports the asm.js build (sql.js/dist/sql-asm.js). Update the comment to avoid implying a WASM dependency.
This issue also appears on line 46 of the same file.
| beforeEach(async () => { | ||
| // Initialize the sqlite store (required for sql.js WASM backend) | ||
| await sessionDataManager.initialize(); | ||
|
|
There was a problem hiding this comment.
sessionDataManager.initialize() is called before mocking getConfig() to use testStorageDir, but the underlying SqliteStore’s storage directory is fixed at construction time from the real config. As a result, these tests may initialize/create a DB in the default location and never clean it up. Consider resetting modules after setting MONITORING_STORAGE_LOCATION, or using a fresh SessionDataManager instance in the tests.
| // Check cache with mtime validation | ||
| const stat = statSync(queryPath); | ||
| const mtime = stat.mtimeMs; | ||
| const cached = metadataCache.get(queryPath); | ||
| if (cached && cached.mtime === mtime) { | ||
| return cached.metadata; | ||
| } | ||
|
|
||
| const queryContent = readFileSync(queryPath, 'utf-8'); |
There was a problem hiding this comment.
extractQueryMetadata calls statSync(queryPath) and then readFileSync(queryPath). This introduces a TOCTOU window (file can change between the two operations) and violates the repo’s guidance to avoid stat-then-read patterns. Prefer reading via an opened file descriptor and fstatSync, or read first and then stat for cache invalidation.
| const [start, end] = options.resultIndices; | ||
| selected = selected.slice(start, end); |
There was a problem hiding this comment.
getCacheSarifSubset treats resultIndices as [start, end] but uses slice(start, end), which excludes end. This is inconsistent with lineRange handling (inclusive end) and with the tool description that reads like an inclusive range. Either document end as exclusive, or change the implementation to include the end index (and add clamping/validation).
| const [start, end] = options.resultIndices; | |
| selected = selected.slice(start, end); | |
| const [rawStart, rawEnd] = options.resultIndices; | |
| const total = selected.length; | |
| if (Number.isFinite(rawStart) && Number.isFinite(rawEnd) && total > 0) { | |
| const start = Math.max(0, Math.min(total - 1, Math.floor(rawStart))); | |
| const end = Math.max(0, Math.min(total - 1, Math.floor(rawEnd))); | |
| if (end >= start) { | |
| // resultIndices is treated as an inclusive [start, end] range | |
| selected = selected.slice(start, end + 1); | |
| } else { | |
| selected = []; | |
| } | |
| } else { | |
| selected = []; | |
| } |
| * Opt-in via ENABLE_AUDIT_TOOLS=true (disabled by default). | ||
| * Requires ENABLE_ANNOTATION_TOOLS=true (audit tools are layered on annotations). |
There was a problem hiding this comment.
The header comment says “Opt-in via ENABLE_AUDIT_TOOLS=true”, but registerAuditTools only gates on config.enableAnnotationTools / ENABLE_ANNOTATION_TOOLS. Either add and wire up an enableAuditTools config/env flag as documented, or update the comment to reflect the actual gating behavior.
| * Opt-in via ENABLE_AUDIT_TOOLS=true (disabled by default). | |
| * Requires ENABLE_ANNOTATION_TOOLS=true (audit tools are layered on annotations). | |
| * Enabled when ENABLE_ANNOTATION_TOOLS=true (disabled by default when annotation tools are off). | |
| * Audit tools are layered on annotations; there is no separate ENABLE_AUDIT_TOOLS flag. |
| cacheKey: z.string().describe('The cache key of the result to retrieve.'), | ||
| lineRange: z.tuple([z.number(), z.number()]).optional().describe('Line range [start, end] (1-indexed). For graphtext/CSV output.'), | ||
| resultIndices: z.tuple([z.number(), z.number()]).optional().describe('SARIF result index range [start, end] (0-indexed).'), | ||
| fileFilter: z.string().optional().describe('For SARIF: only include results whose file path contains this string.'), | ||
| grep: z.string().optional().describe('Text search filter: only include lines/results containing this term.'), | ||
| maxLines: z.number().optional().describe('Maximum number of lines to return (default: 500).'), | ||
| }, | ||
| async ({ cacheKey, lineRange, resultIndices, fileFilter, grep, maxLines }) => { | ||
| const store = sessionDataManager.getStore(); | ||
| const meta = store.getCacheEntryMeta(cacheKey); | ||
|
|
||
| if (!meta) { | ||
| return { content: [{ type: 'text' as const, text: `No cached result found for key: ${cacheKey}` }] }; | ||
| } | ||
|
|
||
| // For SARIF format with SARIF-specific filters, use SARIF subset retrieval | ||
| const isSarif = meta.outputFormat.includes('sarif'); | ||
| if (isSarif && (resultIndices || fileFilter)) { | ||
| const subset = store.getCacheSarifSubset(cacheKey, { | ||
| resultIndices, | ||
| fileFilter, | ||
| maxResults: maxLines, | ||
| }); |
There was a problem hiding this comment.
query_results_cache_retrieve uses a maxLines parameter but passes it as maxResults for SARIF subset retrieval. This mixes “lines” and “results” concepts in the API and can be confusing for callers. Consider adding a separate maxResults parameter for SARIF (or renaming to a generic maxItems) and keep maxLines for line-oriented formats only.
Summary
Replace lowdb with sql.js (asm.js build) as a zero-dependency SQLite persistence backend. Add 14 new MCP tools for annotation management, audit workflows, and query result caching. Refactor server code into smaller, focused modules.
Changes
SqliteStore persistence backend
server/src/lib/sqlite-store.ts— unified SQLite persistence with 3 tables:sessions: session tracking (migrated from lowdb)annotations: key-value annotation store with categories and full-text searchquery_result_cache: BQRS/SARIF result caching with subset retrieval (line ranges, grep, SARIF filters)server/src/types/sql-js.d.ts— type declarations for sql.jsserver/package.json— added sql.js, removed lowdbNew tools (gated by
ENABLE_ANNOTATION_TOOLSenv var)annotation_create,annotation_list,annotation_search,annotation_deleteaudit_store_findings,audit_list_findings,audit_add_notes,audit_clear_repoquery_results_cache_lookup,query_results_cache_retrieve,query_results_cache_clear,query_results_cache_compareCode refactoring
database-resolver.tsfromcli-tool-registry.tsquery-resolver.tsfromcli-tool-registry.tsresult-processor.tsfromcli-tool-registry.tscodeql-version.tsfromcli-executor.tscli-tool-registry.tsreduced from ~1060 to ~693 linesBug fixes
params.outputnot propagated toprocessQueryRunResults(caused cache writes to silently skip)Testing
sqlite-store.test.ts(31 tests)session-data-manager.test.ts,monitoring-tools.test.tsReview order
This PR is independent and should be reviewed first (other PRs depend on it).
Closes #165
Part of #163