Skip to content

Plugin: AI chat over the vault — keyword RAG V1 (#70)#145

Merged
thetechjon merged 1 commit into
devfrom
feat/ai-chat-plugin-70
Jun 7, 2026
Merged

Plugin: AI chat over the vault — keyword RAG V1 (#70)#145
thetechjon merged 1 commit into
devfrom
feat/ai-chat-plugin-70

Conversation

@thetechjon
Copy link
Copy Markdown
Collaborator

Closes #70.

Summary

Adds noteser-ai-chat as a reference plugin under public/plugins/,
built on the v1.2 plugin API (fullscreen view + vault.read.all +
vault.write + VNode events). Brings noteser its first conversational
RAG surface without touching core.

Provider choice

Bring-your-own-key for both OpenAI (gpt-4o-mini default, gpt-4o
opt-in) and Anthropic (claude-sonnet-4-6 default,
claude-haiku-4-5 opt-in). The plugin posts directly to
api.openai.com/v1/chat/completions and api.anthropic.com/v1/messages
from the Worker via the ambient fetch global. No noteser-hosted
inference; nothing routes through any noteser server. Aligns with the
positioning note on #70 ("private, opt-in, bring-your-own-key" — the
issue's defining constraint).

Streaming is Server-Sent Events for both providers. The Worker
consumes the ReadableStream from response.body, decodes the
data: …\n\n events, and pushes one setFullscreenContent per token
delta so the user watches the response render in real time.

RAG approach

Keyword BM25-lite for V1. The flow on each Send:

  1. Extract a bag-of-words from the prompt (extractKeywords).
    Lowercase, strip punctuation, drop ~50 inline stopwords, dedupe.
  2. Snapshot the vault via ctx.vault.read.getAllNotes(), cached
    per-session for 30 seconds.
  3. Score each note with Σ idf(term) × tf / (tf + 1). Title hits
    weigh 2× body hits.
  4. Top 5 notes' bodies, truncated to ~500 chars at a word boundary,
    get stitched into a system prompt asking the model to cite by
    title in [brackets].

Toggle in Settings to disable RAG; off means a plain LLM call with
no vault content sent.

V2 roadmap (in README): swap keyword scoring for embeddings via the
existing src/utils/embeddings.ts + src/utils/aiClient.ts once a
plugin-side embedding capability lands.

Key storage caveat

API key lives unencrypted in localStorage via the plugin's
per-plugin settings namespace (ctx.setSetting('apiKey', …)). Display
masks all but the last 4 chars. The audit trail (Settings → Plugins
→ Audit log) records vault.write calls only — it never sees the
key, the prompt, or the response. README ships an explicit warning:
"Do not paste a key on a shared machine."

V2 roadmap

  • Embeddings-based retrieval (using the shipped embeddings.ts /
    aiClient.ts).
  • Citations as clickable VNode link nodes back to source notes.
  • Per-conversation system-prompt overrides.
  • network.fetch permission once v1.3 surfaces it, so the install
    modal can list api.openai.com / api.anthropic.com explicitly.

Test plan

  • extractKeywords unit tests (stopwords, dedupe, unicode, length floor)
  • scoreNotes BM25-lite ranking on a fixture vault
  • buildOpenAIRequest URL + headers + body shape
  • buildAnthropicRequest URL + headers (x-api-key, anthropic-version) + top-level system field
  • parseSseDeltas for both providers (including malformed JSON tolerance)
  • maskKey never reveals more than last 4 chars
  • conversationToMarkdown "save chat as note" shape
  • Stopword filter pinned to the documented size range
  • npm run lint clean, npx tsc --noEmit zero errors
  • npm test -- --ci green (208 / 208 suites)
  • Manifest served by npm run dev at /plugins/noteser-ai-chat/manifest.json

Ships a v1.2 plugin that opens a fullscreen chat panel, runs a keyword
BM25-lite retrieval pass over the vault, and streams an OpenAI or
Anthropic completion back into the modal. Bring-your-own-key only;
nothing leaves the browser until the user enters a key and hits Send.
Key lives in the per-plugin settings namespace, displayed masked, and
is never recorded in the audit trail (which logs vault writes only).

V1 RAG is keyword-only — extract bag-of-words from the prompt, score
each note with `Σ idf × tf/(tf+1)` (title hits weigh 2×), inject the
top 5 bodies (truncated to ~500 chars) as a system message. Embeddings
are a V2 roadmap item documented in the plugin README.

Tests cover the pure surface (keyword extraction, scoring, request
shape per provider, SSE parsing for both shapes, key masking,
markdown serialisation). No real API calls in tests.
@thetechjon thetechjon merged commit ddc140c into dev Jun 7, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant