
docs: add local embeddings documentation #131

Merged
nvandessel merged 2 commits into feature/yzma-integration from feature/yzma-phase-6-docs
Feb 20, 2026

Conversation

@nvandessel
Owner

Summary

  • New docs/EMBEDDINGS.md: setup guide covering interactive/non-interactive setup, configuration, troubleshooting, and technical details
  • Update docs/CLI_REFERENCE.md: add --embeddings / --no-embeddings flags to init command
  • Update docs/FLOOP_USAGE.md: add Local Embeddings section explaining the feature and how it works
  • Update docs/integrations/mcp-server.md: add note about vector pre-filtering in floop_active
  • Update docs/SCIENCE.md: add Embedding-Based Retrieval section covering architecture, model, and lifecycle

Test plan

  • All docs are valid markdown
  • Cross-references between docs are correct
  • No code changes — docs only

🤖 Generated with Claude Code


greptile-apps bot commented Feb 19, 2026

Greptile Summary

Adds comprehensive documentation for the local embeddings feature across five documentation files. The PR introduces docs/EMBEDDINGS.md as a dedicated setup guide and updates existing docs to explain how vector-based semantic retrieval complements the existing predicate matching and spreading activation pipeline.

Key additions:

  • New standalone embeddings guide covering interactive/non-interactive setup, configuration via config file and environment variables, troubleshooting, and technical implementation details
  • CLI reference updated with --embeddings and --no-embeddings flags for the init command
  • Usage guide explains how embeddings work at learn-time and retrieval-time with fallback behavior
  • Science doc expanded with embedding-based retrieval architecture, model details, and lifecycle
  • MCP server doc notes vector pre-filtering in floop_active tool

Issues found:

  • Critical dimension mismatch: documentation states 384 dimensions in multiple locations, but nomic-embed-text-v1.5 produces 768-dimensional embeddings according to the official Nomic AI specification

Confidence Score: 4/5

  • Safe to merge after correcting the dimension inconsistency
  • Documentation-only PR with clear, comprehensive content and good cross-referencing. Score reduced by 1 point due to a critical technical inaccuracy (384 vs 768 dimensions) that appears in multiple files and affects storage calculations. Once corrected, this will be excellent documentation.
  • Pay close attention to docs/EMBEDDINGS.md lines 84, 88, 128 and docs/SCIENCE.md line 139; all contain the dimension mismatch that needs correction

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/CLI_REFERENCE.md | Added `--embeddings` and `--no-embeddings` flags to the `init` command with clear examples |
| docs/EMBEDDINGS.md | New comprehensive guide covering setup, configuration, troubleshooting, and technical details |
| docs/FLOOP_USAGE.md | Added Local Embeddings section with setup and usage instructions |
| docs/SCIENCE.md | Added Embedding-Based Retrieval section explaining architecture, model, and lifecycle |
| docs/integrations/mcp-server.md | Added note about vector pre-filtering in the `floop_active` tool |

Last reviewed commit: f0ffff8

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 4 comments

Base automatically changed from feature/yzma-phase-5-cli to feature/yzma-integration on February 20, 2026 at 02:56
nvandessel and others added 2 commits February 19, 2026

docs: add local embeddings documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(docs): correct embedding dimension from 384 to 768

nomic-embed-text-v1.5 produces 768-dimensional embeddings.
The model spec was already correct but storage/performance
calculations incorrectly used 384. Fixes all 4 instances
across EMBEDDINGS.md and SCIENCE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nvandessel force-pushed the feature/yzma-phase-6-docs branch from 0f929b7 to 10a6ab1 on February 20, 2026 at 03:06
@nvandessel merged commit 9ad76f2 into feature/yzma-integration on Feb 20, 2026
@nvandessel deleted the feature/yzma-phase-6-docs branch on February 20, 2026 at 03:07
nvandessel added a commit that referenced this pull request Feb 20, 2026
* docs: add local embeddings documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(docs): correct embedding dimension from 384 to 768

nomic-embed-text-v1.5 produces 768-dimensional embeddings.
The model spec was already correct but storage/performance
calculations incorrectly used 384. Fixes all 4 instances
across EMBEDDINGS.md and SCIENCE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
nvandessel added a commit that referenced this pull request Feb 20, 2026
* feat: implement local LLM client with yzma (purego)

Replace llama-go CGo bindings with hybridgroup/yzma, which uses purego
to load llama.cpp shared libraries at runtime. No CGo, no build tags,
no C++ compiler needed — go build always works.

- Single local.go file (no stub), package-level sync.Once for lib init
- Per-call context creation in Embed() for simplicity
- Available() checks lib dir + model file via os.Stat
- Integration tests gated on FLOOP_TEST_LIB_PATH + FLOOP_TEST_MODEL_PATH

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address review feedback on yzma local client

- Increase token context padding from +16 to +64 for special tokens
- Check llama.Decode() error return instead of ignoring it
- Reset model/vocab/nEmbd handles and sync.Once in Close() so
  post-close usage fails cleanly or reloads safely

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve lint errors in local LLM client

- Fix gofmt alignment in LocalClient struct fields
- Check llama.Free and llama.ModelFree error returns (errcheck)
- Suppress gosec G115 for safe int->int32/uint32 conversions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: replace nolint comments with proper bounds checks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve gosec G115 lint errors

- Change GPULayers field type to int32 (matches llama API)
- Exclude gosec G115 (integer overflow) — no dataflow analysis,
  false positives on bounded values like len()
- Remove unnecessary math import and bounds check wrappers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use MergeCandidate threshold in integration test

The IntentMatch threshold (>0.8) is too strict for small embedding
models like all-MiniLM-L6-v2. Use MergeCandidate (>0.7) which is
the actionable threshold for dedup and works across model sizes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: restore similarity.go removed in code quality audit

The cherry-picked yzma commits depend on CosineSimilarity and normalize
functions that were in similarity.go before the deep code quality audit
(d5bb413) removed the file. Restore it for the local LLM client.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(store): add embedding vector persistence via schema V3→V4 migration (#123)

Add EmbeddingStore interface and schema V4 migration for vector retrieval
support (yzma Phase 1). This enables storing embedding vectors alongside
behaviors for future semantic search.

Changes:
- Add BehaviorEmbedding struct and EmbeddingStore interface to store.go
- Schema V3→V4 migration: add embedding BLOB and embedding_model TEXT
  columns to behaviors table
- Implement StoreEmbedding, GetAllEmbeddings, GetBehaviorIDsWithoutEmbeddings
  on SQLiteGraphStore with binary encode/decode (little-endian float32)
- Add EmbeddingStore support to InMemoryGraphStore for testing
- Add EmbeddingStore delegation to MultiGraphStore (local/global merge)
- Add tests: migration, encode/decode round-trip, CRUD, NULL handling

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(vectorsearch): add brute-force vector search and context composition (#126)

* feat(vectorsearch): add brute-force vector search and context query composition

Implements Phase 2 of the yzma vector retrieval feature with an isolated,
well-tested vectorsearch package:

- BruteForceSearch: finds topK most similar behaviors by cosine similarity
- cosineSimilarity: local copy to avoid import cycle with internal/llm
- ComposeContextQuery: builds embeddable text from ContextSnapshot fields

All edge cases covered: empty/nil inputs, topK clamping, ordering, and
ProjectTypeUnknown exclusion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(vectorsearch): add embedder service (#125)

* feat(vectorsearch): add embedder service bridging LLM embeddings and store

Implements the Embedder type that orchestrates embedding generation and
persistence. Handles nomic-embed-text task prefixes (search_document for
behaviors, search_query for retrieval), provides BackfillMissing for
batch embedding of existing behaviors, and extractCanonical helper for
behavior content extraction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(mcp): wire vector embedding retrieval into pipeline (#127)

* feat(mcp): wire vector embedding retrieval into floop_active and floop_learn

Adds vector pre-filtering to handleFloopActive: when a local embedder is
available, behaviors are retrieved by semantic similarity to the current
context before entering the existing activation/spreading pipeline. Falls
back to full table scan when embedder is unavailable or vector search
returns no results.

Also embeds newly learned behaviors in the background after floop_learn,
and schedules a background backfill of unembedded behaviors on server
startup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(cli): add local embeddings setup to floop init (#130)

* feat(cli): add local embeddings setup to floop init

Adds embedding setup flow to `floop init`:
- Interactive prompt to download llama.cpp libs + nomic-embed-text model
- `--embeddings` / `--no-embeddings` flags for non-interactive use
- Auto-detects installed dependencies in ~/.floop/ on server startup
- Creates internal/setup package with DetectInstalled, DownloadLibraries,
  and DownloadEmbeddingModel functions
- Updates config.yaml with local provider settings after download

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(cli): improve embedding setup error handling and use stdlib

- Fail with error when --embeddings is explicitly passed and setup fails
  (previously silently continued with success status)
- Include embeddings_error in JSON output for interactive mode failures
- Replace custom contains() with strings.Contains() in tests

Addresses review feedback from Greptile on PR #130.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add local embeddings documentation (#131)

* docs: add local embeddings documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(docs): correct embedding dimension from 384 to 768

nomic-embed-text-v1.5 produces 768-dimensional embeddings.
The model spec was already correct but storage/performance
calculations incorrectly used 384. Fixes all 4 instances
across EMBEDDINGS.md and SCIENCE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: extract vecmath package, fix vector fallback logic

Deduplicate CosineSimilarity/Normalize into internal/vecmath so both
llm and vectorsearch import from a single source. Fix handleFloopActive
to distinguish "vector retrieval succeeded with empty results" from
"not attempted / failed" by checking nodes == nil instead of len == 0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve golangci-lint issues (gofmt, unused type)

Fix trailing newline in search.go, struct field alignment in
embedder_test.go, and remove unused testNodeGetter interface.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: bump embedding migration to V5→V6, address review feedback

Renumber embedding migration from V4→V5 to V5→V6 since main's V4→V5
now holds the co_activations table (PR #137). Also fix int32 bounds
check for GPU layers config (clamp negative values to 0) and correct
schema comment from V4 to V6.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use ParseInt with bitSize 32 to satisfy CodeQL narrowing check

strconv.ParseInt(v, 10, 32) guarantees the value fits in int32,
eliminating the CodeQL integer-narrowing warning on the Atoi→int32 cast.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>