Dual-engine code intelligence for OpenCode.
lgrep gives AI agents a better first move.
Instead of starting with glob, grep, and random file reads, agents can:
- search by meaning when they do not know the symbol yet
- search by symbol when they know the name
- inspect file and repo structure before opening code
- reuse one warm local server across multiple sessions and agents
That is the whole pitch: fewer bad searches, less wasted context, faster understanding for both humans and agents.
lgrep combines two complementary engines in one MCP server:
- Semantic engine - natural-language code search using Voyage Code 3 embeddings with local LanceDB storage
- Symbol engine - exact symbol, outline, and text tools using tree-sitter parsing with a local JSON index
Use the semantic engine to answer questions like:
- "where is auth enforced between route and service?"
- "how does retry logic work for failed requests?"
- "where are permissions checked?"
Use the symbol engine to answer questions like:
- "find the
authenticatefunction" - "show me the outline for
src/auth.py" - "get the symbol source for
UserService.login"
Your code stays local. Only short semantic queries and indexing payloads go to Voyage. Symbol lookup stays fully local and works without an API key.
AI coding agents usually fail early, not late.
They miss because they start with the wrong retrieval primitive:
- `grep` cannot answer concept questions
- full-file reads waste tokens on irrelevant code
- repeated local exploration across parallel agents duplicates work
lgrep fixes that by giving agents a search stack that matches how they actually reason:
- find the implementation by intent
- narrow to the right file or symbol
- retrieve only the code that matters
For heavy OpenCode users, this is not a convenience plugin. It is search infrastructure.
- Intent-first search - agents can ask by meaning before they know names
- Exact structure tools - file outlines, repo outlines, symbol lookup, and text search are in the same server
- Local-first storage - vectors and indexes live on disk, not in someone else's SaaS
- Shared warm process - one HTTP MCP server can serve multiple concurrent OpenCode sessions
- Commercially usable - `lgrep` is MIT licensed, so commercial use is allowed
grep and rg are still the right tool for exact text and regex lookups. They are not good at intent discovery.
If the code says `jwt.verify()` and your agent asks "where is authentication enforced?", text search often misses the right entry point. Semantic search closes that gap.
mgrep is the closest semantic-search comparison point.
- `mgrep` is semantic-only; `lgrep` combines semantic search with symbol and structure tools
- `mgrep` is cloud-oriented; `lgrep` keeps vectors local and shares one warm server across agents
jcodemunch-mcp is a strong symbol-first MCP server. It is built around tree-sitter indexing and precise byte-offset retrieval, which makes it good when an agent already knows roughly what symbol it wants and wants to minimize context spend.
lgrep is optimized for a different problem: the agent does not know the symbol yet and needs to find the right implementation by intent first, then drill into structure.
| | grep / rg | mgrep | jCodeMunch MCP | lgrep |
|---|---|---|---|---|
| Primary strength | Exact text / regex | Semantic search | Symbol retrieval | Semantic discovery + symbol retrieval |
| Semantic search | No | Yes | No | Yes |
| Symbol tools | No | No | Yes | Yes |
| File / repo outline | No | No | Yes | Yes |
| Local vector storage | N/A | No | N/A | Yes |
| No API key mode | Yes | No | Yes | Yes (symbol engine) |
| Commercial usage | Yes | Subscription service | Paid commercial license required per its README | Yes (MIT) |
| Best first question | "find this exact string" | "find code matching this concept" | "find this exact symbol" | "find the code that implements this idea" |
If you want the most precise symbol retrieval with no semantic layer, jcodemunch-mcp is a credible option.
If you want OpenCode agents to start with intent-level discovery and still have symbol tooling in the same server, lgrep is the better fit.
```mermaid
flowchart LR
    A[OpenCode Session 1] --> M[lgrep MCP Server]
    B[OpenCode Session 2] --> M
    C[OpenCode Session N] --> M
    M --> S[Semantic Engine\nVoyage Code 3 + LanceDB]
    M --> Y[Symbol Engine\ntree-sitter + JSON index]
    S --> V[(Local vector store)]
    Y --> J[(Local symbol store)]
    S -. query embeddings .-> Q[Voyage API]
```
Agent -> lgrep MCP server -> semantic engine + symbol engine
- Discover files while respecting `.gitignore`
- Chunk code with AST-aware boundaries
- Embed chunks with Voyage Code 3
- Store vectors locally in LanceDB
- Search with hybrid retrieval and reranking
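The chunking step above is worth making concrete. lgrep's real chunker uses tree-sitter across many languages; this hypothetical `chunk_python_source` helper is only a stdlib sketch of what "AST-aware boundaries" means, shown for a single Python file:

```python
import ast

def chunk_python_source(source: str) -> list[dict]:
    """Split Python source into chunks along top-level def/class boundaries.

    A stdlib stand-in for lgrep's tree-sitter chunker: each top-level
    function or class becomes one chunk, so an embedding never straddles
    two unrelated definitions.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append({"name": node.name, "start": node.lineno,
                           "end": node.end_lineno, "text": text})
    return chunks

src = "def authenticate(user):\n    return user.ok\n\nclass AuthManager:\n    def login(self):\n        pass\n"
for chunk in chunk_python_source(src):
    print(chunk["name"], chunk["start"], chunk["end"])
```

Each chunk carries its own line range, so a search hit can point straight back into the file.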
- Parse source with tree-sitter
- Extract functions, classes, methods, and related structure
- Store a local symbol index
- Serve symbol search, outlines, and source retrieval without an API call
- Python 3.11+
- a Voyage API key if you want semantic search
```
pip install git+https://github.com/Sharper-Flow/lgrep.git
```

Or from source:

```
git clone https://github.com/Sharper-Flow/lgrep.git
cd lgrep
pip install .
```

Create a key at dash.voyageai.com.
You only need this for the semantic engine. The symbol engine works without it.
Recommended manual start:
```
VOYAGE_API_KEY=your-key \
LGREP_WARM_PATHS=/path/to/project-a:/path/to/project-b \
lgrep --transport streamable-http --host 127.0.0.1 --port 6285
```

Why HTTP instead of stdio?
With stdio, each OpenCode session spawns its own server process. With streamable-http, one warm server handles all sessions.
Add this to `~/.config/opencode/opencode.json`:
```json
{
  "instructions": [
    "~/.config/opencode/instructions/lgrep-tools.md"
  ],
  "mcp": {
    "lgrep": {
      "type": "remote",
      "url": "http://localhost:6285/mcp",
      "enabled": true
    }
  }
}
```

Or let lgrep do the wiring for you:
```
lgrep install-opencode
```

That installer can:

- add the MCP entry
- copy the packaged `lgrep-tools.md` instruction into your OpenCode config
- copy the packaged `skills/lgrep/SKILL.md` reference file into your OpenCode config
- append it to the `instructions` array so agents prefer `lgrep` first
Important: the active agent must also expose lgrep_* tool definitions in its
tool manifest. If an agent profile only allows read/glob/grep, the model
cannot choose lgrep even when the MCP server is configured and the instruction
policy is present.
The installed files land at:
- `~/.config/opencode/instructions/lgrep-tools.md`
- `~/.config/opencode/skills/lgrep/SKILL.md`
```
lgrep init-ignore /path/to/project
```

Typical OpenCode flow:

- Ask an intent question with `lgrep_search_semantic`
- Inspect structure with `lgrep_get_file_outline` or `lgrep_get_repo_outline`
- Retrieve exact symbols with `lgrep_search_symbols` and `lgrep_get_symbol`
Examples:
```
lgrep_search_semantic(query="authentication flow", path="/path/to/project")
lgrep_get_file_outline(path="/path/to/project/src/auth.py")
lgrep_index_symbols_folder(path="/path/to/project")
lgrep_search_symbols(query="authenticate", path="/path/to/project")
lgrep_get_symbol(symbol_id="src/auth.py:function:authenticate", path="/path/to/project")
```
High-value prompts:
- "Where do we enforce auth between route and service?"
- "Find the
authenticatefunction" - "What are the main symbols in
src/auth.py?" - "Show me the repo structure around billing"
- "Find references to
verifyToken"
| Task | Best tool | Why |
|---|---|---|
| Intent or concept discovery | `lgrep_search_semantic` | Search by meaning |
| Find a function or class by name | `lgrep_search_symbols` | Exact symbol lookup |
| Inspect a single file's structure | `lgrep_get_file_outline` | Fast AST outline |
| Inspect repo structure | `lgrep_get_repo_outline` | Symbol-level overview |
| Find exact text or identifiers | `lgrep_search_text` or `grep` | Literal match |
| Retrieve exact source for a symbol | `lgrep_get_symbol` | Targeted code retrieval |
| Read a known file directly | Read | No search needed |
| Tool | Purpose |
|---|---|
| `lgrep_search_semantic(query, path, limit=10, hybrid=true)` | Search code by meaning |
| `lgrep_index_semantic(path)` | Build or refresh a semantic index |
| `lgrep_status_semantic(path?)` | Show semantic index and watcher status |
| `lgrep_watch_start_semantic(path)` | Start background semantic re-indexing |
| `lgrep_watch_stop_semantic(path?)` | Stop the watcher |
| Tool | Purpose |
|---|---|
| `lgrep_index_symbols_folder(path, max_files=500, incremental=True)` | Index symbols in a local folder |
| `lgrep_index_symbols_repo(repo, ref="HEAD")` | Index symbols from a GitHub repo |
| `lgrep_list_repos()` | List indexed symbol repos |
| `lgrep_get_file_tree(path, max_files=500)` | Show repo file tree |
| `lgrep_get_file_outline(path)` | Show symbol outline for one file |
| `lgrep_get_repo_outline(path, max_files=500)` | Show symbol outline for a repo |
| `lgrep_search_symbols(query, path, limit=20, kind?)` | Search symbols by name |
| `lgrep_search_text(query, path, max_results=50)` | Search literal text |
| `lgrep_get_symbol(symbol_id, path)` | Retrieve one symbol |
| `lgrep_get_symbols(symbol_ids, path)` | Retrieve multiple symbols |
| `lgrep_invalidate_cache(path)` | Drop the symbol index for a repo |
Symbol IDs use this deterministic format:

```
file_path:kind:name
```

Examples:

```
src/auth.py:function:authenticate
src/auth.py:class:AuthManager
src/auth.py:method:login
```
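Because the format is three colon-separated fields read right-to-left, a client can split from the right so that any colons inside the file path stay in the path component. This `parse_symbol_id` helper is illustrative, not part of lgrep's API:

```python
def parse_symbol_id(symbol_id: str) -> tuple[str, str, str]:
    """Split 'file_path:kind:name' into its three parts.

    rsplit from the right keeps colons that may appear in the file
    path (e.g. a Windows drive prefix) inside the path component.
    """
    file_path, kind, name = symbol_id.rsplit(":", 2)
    return file_path, kind, name

print(parse_symbol_id("src/auth.py:method:login"))
# ('src/auth.py', 'method', 'login')
```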
lgrep supports both stdio and streamable-http, but shared HTTP is the intended deployment mode for OpenCode.
```
lgrep --transport streamable-http --host 127.0.0.1 --port 6285
```

Security notes:

- default host is `127.0.0.1`
- there is no built-in auth layer on the HTTP transport
- `lgrep` does not set CORS headers, and browser-based clients should not connect directly to the streamable HTTP endpoint
- if you put it behind a proxy, enforce your own authentication and origin controls there
- exposing `0.0.0.0` is a non-default, explicit opt-in; do not do it without a reverse proxy or firewall
| Variable | Required | Default | Description |
|---|---|---|---|
| `VOYAGE_API_KEY` | For semantic search | none | Voyage API key |
| `LGREP_LOG_LEVEL` | No | `INFO` | Log verbosity |
| `LGREP_CACHE_DIR` | No | `~/.cache/lgrep` | Cache directory |
| `LGREP_WARM_PATHS` | No | none | Colon-separated projects to warm on startup |
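`LGREP_WARM_PATHS` follows the same colon-separated convention as `PATH`. A sketch of how a server might read it (a hypothetical helper, not lgrep's actual code):

```python
def warm_paths(env: dict[str, str]) -> list[str]:
    """Parse the colon-separated LGREP_WARM_PATHS value into project
    paths, dropping empty entries left by stray or trailing colons."""
    raw = env.get("LGREP_WARM_PATHS", "")
    return [p for p in raw.split(":") if p]

print(warm_paths({"LGREP_WARM_PATHS": "/path/to/project-a:/path/to/project-b:"}))
# ['/path/to/project-a', '/path/to/project-b']
```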
- `.gitignore` is respected automatically
- `.lgrepignore` lets you exclude additional paths

Example `.lgrepignore` entries:

```
src/generated/
docs/site/
*.test.data
```
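lgrep's exact matching rules are not documented here, but the entries above read as gitignore-style patterns: trailing-slash entries exclude whole directories, glob entries match file names. A stdlib sketch of that interpretation (assumption, not lgrep's implementation):

```python
from fnmatch import fnmatch

def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Gitignore-style check: trailing-slash patterns exclude whole
    directory subtrees; other patterns glob-match the file's basename."""
    basename = rel_path.rsplit("/", 1)[-1]
    for pat in patterns:
        if pat.endswith("/"):
            # match the directory at the root or nested anywhere in the path
            if rel_path.startswith(pat) or ("/" + pat) in ("/" + rel_path):
                return True
        elif fnmatch(basename, pat):
            return True
    return False

patterns = ["src/generated/", "docs/site/", "*.test.data"]
print(is_ignored("src/generated/models.py", patterns))   # True
print(is_ignored("fixtures/users.test.data", patterns))  # True
print(is_ignored("src/auth.py", patterns))               # False
```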
Typical operating profile:
| Resource | Idle | During indexing |
|---|---|---|
| RAM | ~300MB | ~500MB |
| CPU | <1% | usually low local CPU; Voyage does most semantic heavy lifting |
| Disk | ~250MB per semantic index + ~5MB per symbol index | grows with indexed projects |
| Network | minimal | semantic indexing and semantic queries call Voyage |
- Semantic engine - AST-aware chunking for 30+ languages, with text fallback when needed
- Symbol engine - tree-sitter-language-pack support across 165+ languages
**`VOYAGE_API_KEY` not set** - set the key in your MCP server environment. The symbol engine still works without it.

**Slow first semantic index** - the first run embeds the whole project. Later runs skip unchanged files using hashes.
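That skip-unchanged behavior is ordinary content hashing. A sketch of the idea (hypothetical; lgrep's actual cache layout is internal):

```python
import hashlib

def files_to_reindex(files: dict[str, str], seen: dict[str, str]) -> list[str]:
    """Return paths whose content hash differs from the recorded one,
    updating the hash cache in place. Unchanged files are skipped,
    which is why only the first semantic index is slow."""
    changed = []
    for path, content in files.items():
        digest = hashlib.sha256(content.encode()).hexdigest()
        if seen.get(path) != digest:
            seen[path] = digest
            changed.append(path)
    return changed

cache: dict[str, str] = {}
print(files_to_reindex({"a.py": "x = 1"}, cache))  # first run: ['a.py']
print(files_to_reindex({"a.py": "x = 1"}, cache))  # unchanged: []
```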
**Repository not indexed from symbol tools** - run `lgrep_index_symbols_folder(path=...)` first.

**Stale semantic results** - run `lgrep_index_semantic(...)` or start `lgrep_watch_start_semantic(...)`.

**Native dependency build issues** - lgrep depends on packages with native extensions. Prebuilt wheels usually work; otherwise install the required compiler toolchain.
The semantic tools were renamed in v2.0.0:
| v1.x | v2.x |
|---|---|
| `lgrep_search` | `lgrep_search_semantic` |
| `lgrep_index` | `lgrep_index_semantic` |
| `lgrep_status` | `lgrep_status_semantic` |
| `lgrep_watch_start` | `lgrep_watch_start_semantic` |
| `lgrep_watch_stop` | `lgrep_watch_stop_semantic` |
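If you have v1.x tool names baked into prompts or instruction files, a one-shot rename can be scripted. This `migrate` helper is illustrative only; the word-boundary match keeps already-migrated names like `lgrep_search_semantic` from being rewritten twice:

```python
import re

# v1.x -> v2.x renames from the table above
RENAMES = {
    "lgrep_search": "lgrep_search_semantic",
    "lgrep_index": "lgrep_index_semantic",
    "lgrep_status": "lgrep_status_semantic",
    "lgrep_watch_start": "lgrep_watch_start_semantic",
    "lgrep_watch_stop": "lgrep_watch_stop_semantic",
}

def migrate(text: str) -> str:
    """Rewrite v1.x tool names to their v2.x equivalents.

    \\b requires a non-word character after the old name, so v2.x
    names (which continue with '_semantic') are left untouched,
    making the rewrite idempotent.
    """
    for old, new in RENAMES.items():
        text = re.sub(rf"\b{old}\b", new, text)
    return text

print(migrate("call lgrep_index then lgrep_search"))
# call lgrep_index_semantic then lgrep_search_semantic
```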
```
git clone https://github.com/Sharper-Flow/lgrep.git
cd lgrep
pip install -e ".[dev]"
pytest -v
```

MIT - see LICENSE.