# Beacon: semantic code search for Claude Code
Find code by meaning, not just string matching.
Quick Start · Usage · Models · Commands · Config · Examples
98.3% accuracy · 5x faster than grep · 20-query benchmark on a real codebase
## Quick Start

```bash
# Install Ollama (local embeddings, free)
brew install ollama
ollama serve &
ollama pull nomic-embed-text

# Install Beacon plugin
claude plugin marketplace add sagarmk/Claude-Code-Beacon-Plugin
claude plugin install beacon@claude-code-beacon-plugin

# Restart Claude Code — Beacon indexes automatically
```

After installing, Beacon indexes automatically on session start. Here are the essentials:
```
> /reindex
```

Deletes existing embeddings and rebuilds from scratch — useful after switching models or if the index gets stale.
```
> /index

Beacon Index

  ● ● ● ● ●   nomic-embed-text · Ollama (local)
  ● ● ● ● ●   768 dims · 3.8 MB
  ● ● ● ● ●
  ● ● ● ● ●   Coverage: 100% (38/38 files)

Indexed by extension
  ● .js   25 files
  ● .md   13 files

Statistics
  Indexed files    38
  Total chunks     109
  Avg chunks/file  2.9
  Last sync        2 minutes ago
```
For a quick numeric summary:
```
> /index-status

{
  "files_indexed": 38,
  "total_chunks": 114,
  "last_sync": "2026-03-01T04:30:21.453Z",
  "embedding_model": "nomic-embed-text",
  "embedding_endpoint": "http://localhost:11434/v1"
}
```

```
> /search-code "authentication flow"

[
  {
    "file": "src/middleware/auth.ts",
    "lines": "12-45",
    "similarity": "0.82",
    "score": "0.74",
    "preview": "export async function verifyAuth(req, res, next) {\n  const token = req.headers.authorization?.split(' ')[1];\n  ..."
  },
  {
    "file": "src/routes/login.ts",
    "lines": "8-32",
    "similarity": "0.78",
    "score": "0.65",
    "preview": "router.post('/login', async (req, res) => {\n  const { email, password } = req.body;\n  ..."
  }
]
```

Hybrid search combines semantic similarity (understands meaning), BM25 keyword matching, and identifier boosting — so searching "auth flow" finds code about authentication even if it never uses the word "auth".
Options: `--top-k N` (result count), `--threshold F` (minimum score), `--path <dir>` (scope to a directory), `--no-hybrid` (pure vector search).
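To make the combination concrete, here is a minimal sketch of how the three signals could be blended, using the default weights from the configuration section (`weight_vector` 0.4, `weight_bm25` 0.3, `weight_rrf` 0.3, `identifier_boost` 1.5). The types and function are illustrative, not Beacon's actual scoring code:

```typescript
// Illustrative hybrid scoring — not Beacon's internals.
// Blends cosine similarity, normalized BM25, and reciprocal-rank
// fusion (RRF), then boosts exact identifier matches.
interface Candidate {
  vectorSim: number;          // cosine similarity, in [0, 1]
  bm25Norm: number;           // BM25 score normalized to [0, 1]
  vectorRank: number;         // 1-based rank in the vector result list
  bm25Rank: number;           // 1-based rank in the BM25 result list
  matchesIdentifier: boolean; // query matched a code identifier
}

function hybridScore(c: Candidate): number {
  const k = 60; // conventional RRF constant
  const rrf = 1 / (k + c.vectorRank) + 1 / (k + c.bm25Rank);
  const rrfNorm = (rrf * (k + 1)) / 2; // 1.0 when top-ranked in both lists
  let score = 0.4 * c.vectorSim + 0.3 * c.bm25Norm + 0.3 * rrfNorm;
  if (c.matchesIdentifier) score *= 1.5; // identifier_boost
  return score;
}
```

Under this sketch, a result ranked first in both lists with perfect similarity scores 1.0, and an exact identifier hit lifts it further — which is why `--no-hybrid` (pure vector) can rank results differently.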
## Models

Beacon runs on open-source models by default — no API keys, no cloud costs, fully local via Ollama.
| Model | Dims | Context | Speed | Best for |
|---|---|---|---|---|
| nomic-embed-text (default) | 768 | 8192 | Fast | General-purpose, great code search |
| mxbai-embed-large | 1024 | 512 | Fast | Higher accuracy, larger vectors |
| snowflake-arctic-embed:l | 1024 | 512 | Medium | Strong retrieval benchmarks |
| all-minilm | 384 | 512 | Very fast | Lightweight, low resource usage |
To switch models, pull with Ollama and update your config:

```bash
ollama pull mxbai-embed-large
```

```jsonc
// .claude/beacon.json
{
  "embedding": {
    "model": "mxbai-embed-large",
    "dimensions": 1024,
    "query_prefix": ""
  }
}
```

Then run `/reindex` to rebuild with the new model.
For cloud-hosted embeddings, create `.claude/beacon.json` in your repo:

**OpenAI**

```bash
export OPENAI_API_KEY="sk-..."
```

```json
{
  "embedding": {
    "api_base": "https://api.openai.com/v1",
    "model": "text-embedding-3-small",
    "api_key_env": "OPENAI_API_KEY",
    "dimensions": 1536,
    "batch_size": 100,
    "query_prefix": ""
  }
}
```

**Voyage AI**

```bash
export VOYAGE_API_KEY="pa-..."
```

```json
{
  "embedding": {
    "api_base": "https://api.voyageai.com/v1",
    "model": "voyage-code-3",
    "api_key_env": "VOYAGE_API_KEY",
    "dimensions": 1024,
    "batch_size": 50,
    "query_prefix": ""
  }
}
```

**LiteLLM proxy (Vertex AI, Bedrock, Azure, etc.)**

```bash
pip install litellm
litellm --model vertex_ai/text-embedding-004 --port 4000
```

```json
{
  "embedding": {
    "api_base": "http://localhost:4000/v1",
    "model": "vertex_ai/text-embedding-004",
    "api_key_env": "LITELLM_API_KEY",
    "dimensions": 1024,
    "batch_size": 50,
    "query_prefix": ""
  }
}
```

**Custom endpoint**

Any server implementing the OpenAI `/v1/embeddings` API will work. Set `api_base`, `model`, `dimensions`, and optionally `api_key_env` in `.claude/beacon.json`.
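As a reference point, the request those servers must accept is minimal. Here is a sketch of the payload construction — the helper function is hypothetical, field names follow the OpenAI embeddings API, and the prefix handling mirrors the `query_prefix` option:

```typescript
// Sketch of an OpenAI-compatible /v1/embeddings request body.
// buildEmbeddingRequest is illustrative, not part of Beacon.
interface EmbeddingConfig {
  model: string;
  query_prefix: string; // e.g. "search_query: " for nomic-embed-text
}

function buildEmbeddingRequest(cfg: EmbeddingConfig, texts: string[], isQuery: boolean) {
  return {
    model: cfg.model,
    // Some models (e.g. nomic-embed-text) expect a task prefix on
    // search queries but not on the indexed documents themselves.
    input: texts.map((t) => (isQuery ? cfg.query_prefix + t : t)),
  };
}

// Sending it is a plain POST with an optional bearer token, roughly:
// await fetch(`${apiBase}/embeddings`, {
//   method: "POST",
//   headers: { "Content-Type": "application/json", Authorization: `Bearer ${key}` },
//   body: JSON.stringify(buildEmbeddingRequest(cfg, ["authentication flow"], true)),
// });
```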
Beacon indexes your codebase automatically on session start and re-embeds files as you edit — no manual steps needed.
## Commands

| Command | Description |
|---|---|
| `/search-code` | Hybrid code search — semantic, BM25 keyword, and identifier matching. Supports `--path <dir>` to scope results |

| Command | Description |
|---|---|
| `/index` | Visual overview — files, chunks, coverage, provider |
| `/index-status` | Quick health check — file count, chunk count, last sync |
| `/reindex` | Force full re-index from scratch |
| `/run-indexer` | Manually trigger indexing |
| `/terminate-indexer` | Kill a running sync process |

| Command | Description |
|---|---|
| `/config` | View and modify Beacon configuration |
| `/blacklist` | Prevent indexing of specific directories |
| `/whitelist` | Allow indexing in otherwise-blacklisted directories |
Beacon also provides a code-explorer agent and a semantic-search skill that Claude can invoke automatically.
## Why Beacon?

- **Understands your questions** — ask "where is the auth flow?" and get `lib/auth.ts`, not every file containing "auth"
- **Query expansion** — a search for "auth" automatically finds code mentioning "authentication", "authorize", and "login"
- **Stays in sync automatically** — hooks handle the full index, incremental re-embedding on edits, and garbage collection
- **Resilient** — retries with backoff on transient failures, auto-recovers from DB corruption, debounces GC
- **Works with any embedding provider** — Ollama (local/free), OpenAI, Voyage AI, LiteLLM, or any OpenAI-compatible API
- **Gives Claude better context** — slash commands, a code-explorer agent, and a grep-nudge hook for smarter search
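The query-expansion idea can be pictured with a toy synonym table — in practice the embedding model captures most of these relationships itself, and both the table and the function below are hypothetical, not Beacon's internals:

```typescript
// Illustrative query expansion — hypothetical synonym table,
// not Beacon's actual implementation.
const EXPANSIONS: Record<string, string[]> = {
  auth: ["authentication", "authorize", "login"],
  db: ["database", "storage"],
};

function expandQuery(query: string): string[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const expanded = new Set<string>(terms); // keep original terms
  for (const t of terms) {
    for (const syn of EXPANSIONS[t] ?? []) expanded.add(syn);
  }
  return [...expanded];
}
```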
## How It Works

Beacon uses Claude Code hooks to stay in sync with your codebase:

| Hook | Trigger | What it does |
|---|---|---|
| SessionStart | Every session | Full index on first run, diff-based catch-up on subsequent runs |
| PostToolUse | Write, Edit, MultiEdit | Re-embeds the changed file |
| PostToolUse | Bash | Garbage-collects embeddings for deleted files |
| PreCompact | Before context compaction | Injects index status so search capability survives compaction |
| PreToolUse | Grep | Intercepts grep and redirects to Beacon for semantic-style queries |
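The plugin registers these hooks automatically, so no setup is needed. For illustration only, a `PostToolUse` registration in Claude Code's hooks format looks roughly like this (the script path is hypothetical):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit|MultiEdit",
        "hooks": [
          { "type": "command", "command": "node hooks/reembed.js" }
        ]
      }
    ]
  }
}
```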
## Configuration

Default configuration (`config/beacon.default.json`):

```json
{
  "embedding": {
    "api_base": "http://localhost:11434/v1",
    "model": "nomic-embed-text",
    "api_key_env": "",
    "dimensions": 768,
    "batch_size": 10,
    "query_prefix": "search_query: "
  },
  "chunking": {
    "strategy": "hybrid",
    "max_tokens": 512,
    "overlap_tokens": 50
  },
  "indexing": {
    "include": ["**/*.ts", "**/*.tsx", "**/*.js", "..."],
    "exclude": ["node_modules/**", "dist/**", "..."],
    "max_file_size_kb": 500,
    "auto_index": true,
    "max_files": 10000,
    "concurrency": 4
  },
  "search": {
    "top_k": 10,
    "similarity_threshold": 0.35,
    "hybrid": {
      "enabled": true,
      "weight_vector": 0.4,
      "weight_bm25": 0.3,
      "weight_rrf": 0.3,
      "doc_penalty": 0.5,
      "identifier_boost": 1.5,
      "debug": false
    }
  },
  "storage": {
    "path": ".claude/.beacon"
  }
}
```

| Option | Default | Description |
|---|---|---|
| `embedding.api_base` | `http://localhost:11434/v1` | Embedding API endpoint |
| `embedding.model` | `nomic-embed-text` | Embedding model name |
| `embedding.dimensions` | `768` | Vector dimensions (must match the model) |
| `embedding.query_prefix` | `search_query: ` | Prefix prepended to search queries |
| `indexing.include` | Common code patterns | Glob patterns for files to index |
| `indexing.exclude` | `node_modules`, `dist`, etc. | Glob patterns to skip |
| `indexing.max_file_size_kb` | `500` | Skip files larger than this |
| `indexing.auto_index` | `true` | Auto-index on session start |
| `indexing.concurrency` | `4` | Number of files to index in parallel |
| `search.top_k` | `10` | Max results per query |
| `search.similarity_threshold` | `0.35` | Minimum similarity score |
| `search.hybrid.enabled` | `true` | Enable hybrid search (set `false` for pure vector) |
Create `.claude/beacon.json` in any repo to override defaults. Values are deep-merged with the default config:

```json
{
  "embedding": {
    "api_base": "https://api.openai.com/v1",
    "model": "text-embedding-3-small",
    "api_key_env": "OPENAI_API_KEY",
    "dimensions": 1536
  },
  "indexing": {
    "include": ["**/*.py"],
    "max_files": 5000
  }
}
```

Beacon stores its SQLite database at `.claude/.beacon/embeddings.db` (configurable via `storage.path`). This file is auto-generated and safe to delete — run `/reindex` to rebuild. The database uses sqlite-vec for vector search and FTS5 for keyword matching.
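"Deep-merged" means nested objects combine key-by-key rather than replacing each other wholesale, so overriding `embedding.model` keeps the default `embedding.batch_size`. A minimal sketch of those semantics (illustrative, not Beacon's actual merge code):

```typescript
// Illustrative deep-merge semantics: the override wins per key,
// nested plain objects merge recursively, arrays are replaced.
type Json = { [k: string]: unknown };

function isPlainObject(v: unknown): v is Json {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function deepMerge(base: Json, override: Json): Json {
  const out: Json = { ...base };
  for (const [k, v] of Object.entries(override)) {
    out[k] =
      isPlainObject(v) && isPlainObject(out[k])
        ? deepMerge(out[k] as Json, v) // recurse into nested objects
        : v;                           // scalars and arrays replace
  }
  return out;
}
```

For example, merging `{ "embedding": { "model": "text-embedding-3-small" } }` over the defaults changes only the model — `batch_size`, `dimensions`, and every other default key survive.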
## Troubleshooting
Beacon degrades gracefully when the embedding server is unreachable — it never blocks your session. Embedding requests automatically retry with backoff (1s, 4s) before giving up.
| Scenario | Behavior |
|---|---|
| Session start | Sync is skipped, error is logged, session continues normally |
| Search | Falls back to keyword-only (BM25) search — still returns results |
| File edits | Re-embedding fails silently, old embeddings are preserved |
| Status commands | Work normally (DB-only, no Ollama needed) |
| DB corruption | Auto-detected and rebuilt on next sync |
Start Ollama at any time and run /run-indexer to catch up.
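The retry behavior described above (backoff of 1 s, then 4 s, then give up) can be sketched as follows — an illustration of the pattern, not Beacon's actual code. The delays are injectable so the sketch is testable without real waiting:

```typescript
// Illustrative retry-with-backoff: attempt the call, wait 1s, retry,
// wait 4s, retry once more, then rethrow the last error.
async function withRetry<T>(
  fn: () => Promise<T>,
  delaysMs: number[] = [1000, 4000], // Beacon-style backoff: 1s, 4s
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= delaysMs.length) throw err; // out of retries
      await new Promise((resolve) => setTimeout(resolve, delaysMs[attempt]));
    }
  }
}
```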
| Command | What it does |
|---|---|
| `/run-indexer` | Manually trigger indexing — useful when `auto_index` is off or after starting Ollama late |
| `/reindex` | Force a full re-index from scratch (deletes existing embeddings first) |
| `/terminate-indexer` | Kill a stuck sync process and clean up lock state |
Run `/index` for a visual overview with a coverage bar, file list, and provider info. For a quick numeric summary, use `/index-status` — it shows file count, chunk count, and last sync time.

Things to look for:

- **Low coverage %** — files may be excluded by glob patterns or exceed `max_file_size_kb`
- **Sync status errors** — usually means the embedding server was unreachable during the last sync
- **Stale sync warnings** — the index hasn't been updated recently; run `/run-indexer` to refresh
Run `/search-code` with a test query to confirm search is working. If results include "FTS-only" in debug output, the embedding server is unreachable — search still works, but without semantic matching (keyword/BM25 only).

See `EXAMPLES.md` for real-world use cases — intent-based search, codebase navigation, identifier tracking, and auto-sync — each with concrete before/after comparisons.