Local code intelligence for AI agents. Index a codebase once; agents query its structure instead of re-discovering it with grep every session.
v0.3.0 · 12 languages · 16 MCP tools · SQLite-backed · zero network calls at query time · gate-green (~3,055 tests)
An AI coding agent starts every session blind. To answer "what breaks if I change init_db?" it greps for the name, opens each hit, reads the surrounding code, follows the imports, and reconstructs the call graph by hand — spending thousands of tokens rebuilding structural knowledge that was true last session and the session before.
That structure is computable. A parser already knows that index_one_file calls init_db, that Server holds a Client, that UserView implements Renderable. Seam computes it once, stores it in a local SQLite graph, and exposes it over a handful of MCP tools so the agent asks instead of greps.
Without Seam: "what calls init_db?" → grep → 14 files → read each → trace imports → ~30k tokens, often wrong
With Seam: seam_impact init_db → blast radius by risk tier → ~4.5k tokens, graph-accurate
The win compounds: every structural question — callers, blast radius, call paths, functional areas, which tests to run — is one tool call against a graph that stays fresh automatically.
Think of Seam as a compiler's symbol table and call graph for your whole repository, kept fresh in the background and exposed over MCP and a CLI.
source files tree-sitter .seam/seam.db
(12 languages) ─────────▶ structural ─────▶ SQLite + FTS5 ◀── file watcher
parsing (symbols + edges (debounced
+ clusters + FTS) re-index)
│
┌───────────────┴───────────────┐
▼ ▼
MCP server (stdio) CLI read commands
16 read-only tools schema / query / impact …
│ │
└──────────────┬────────────────┘
▼
AI agent (Claude Code · Cursor · Codex)
Three properties define it:
- Indexed once, fresh forever.
seam initbuilds the graph; an optionalwatchdogdaemon re-indexes edited files in the background. The agent never thinks about staleness — and graph-traversal tools surface a banner if the index is stale. - 100% local. The index is a per-project SQLite file (
.seam/seam.db). The read path makes no network calls — no API keys, no cloud, no telemetry. - A graph, not a search box. Symbols are nodes; calls, imports, inheritance, composition, and field access are typed edges. Every answer is graph traversal, not string matching.
Published on PyPI as seam-code — the PyPI name seam belongs to an unrelated package, so the distribution is seam-code while the import package and the seam command keep the short name.
pip install seam-code # CLI only
pip install 'seam-code[server]' # + the MCP server (seam start)
pip install 'seam-code[semantic]' # + semantic search (fastembed, ONNX/CPU, ~67 MB model on first run)
pip install 'seam-code[semantic-ann]' # + ANN acceleration for semantic search (sqlite-vec KNN)
pip install 'seam-code[web]' # + the Seam Explorer web UI (FastAPI)Or with npx (JS/TS projects, no Python toolchain required):
Prerequisite: install uv first — the shim delegates to
uvx(bundled with uv) to download and runseam-codefrom PyPI. No Python install needed; uv manages it.
npx @catafal/seam init # index the project
npx @catafal/seam search "auth token" # full-text search
npx @catafal/seam impact init_db --json # blast radiusThe npm package version mirrors the PyPI version exactly — @catafal/seam@0.4.0 always installs seam-code==0.4.0. No drift, no silent upgrades.
Or from source with uv:
git clone <repo-url> && cd seam
uv sync # CLI only — no MCP server, no semantic search
uv sync --extra server # + the MCP server (`seam start`) — adds the `mcp` package
uv sync --extra semantic # + semantic search (fastembed, ONNX/CPU, no torch, ~67 MB model on first run)
uv sync --extra web # + the Seam Explorer web UI (FastAPI)
# everything: uv sync --extra server --extra semantic --extra webEvery read command queries the SQLite index directly — the full feature set works with no MCP server running:
cd /path/to/your/project
uv run seam init # index the project (writes .seam/seam.db)
uv run seam search "auth token" # full-text (hybrid semantic when enabled)
uv run seam query "verify user login" # concept search + 1-hop graph expansion
uv run seam graph-search --preset hotspot --json # structural graph search
uv run seam architecture --json # bounded repo architecture briefing
uv run seam context authenticate_user # 360° view: callers, callees, cluster, signature
uv run seam impact authenticate_user # blast radius by risk tier
uv run seam structure # whole-repo directory/container map
# also: trace · changes · why · clusters · affected · flows · pack · status · syncseam install defaults to writing a token-lean CLI playbook into the repo so the agent queries via the seam CLI (cheaper than MCP — the CLI's --quiet mode is ~14× leaner than the leanest MCP call, and there's no ~6k-token standing tool-schema cost). It renders into each agent's cheapest native mechanism: a Claude Code skill, a Cursor agent-requested rule, and an AGENTS.md block for Codex.
uv run seam install # CLI guidance for Claude Code (skill + CLAUDE.md hook)
uv run seam install --target all # guidance for Claude Code + Cursor + Codex
uv run seam install --with-mcp # ALSO wire the MCP server (needs the `server` extra)
uv run seam install --print-config # preview everything, write nothing
uv run seam uninstall # reverse it (removes guidance + MCP config)The guidance teaches the agent the escalation ladder (--quiet → --json --lean → full --json), how to keep the index fresh (seam init / seam sync), and when to reach for each command. It's idempotent (a marker-delimited block in AGENTS.md/CLAUDE.md, never duplicated, foreign content preserved) and reversible.
Prefer native tool-calling? Add --with-mcp (install the server extra first). To wire MCP by hand, add to .mcp.json at the repo root:
{
"mcpServers": {
"seam": { "type": "stdio", "command": "seam", "args": ["start", "/path/to/your/project"] }
}
}For teams, running seam init on every developer machine (and keeping each one fresh) is unnecessary overhead. The shared-index flow lets CI build the index once on each merge to main, publish it as an artifact, and let developers download a pre-built index in seconds instead of waiting for full indexing.
Step 1 — wire the CI workflow (one-time)
Copy .github/workflows/seam-index.yml (included in this repo as a template) into your project and adapt the upload step to your artifact store. The template uses gh release create with no extra pinned actions; teams can swap the upload step for S3, GCS, Artifactory, or any store that serves the archive over HTTPS.
CI can produce the same artifact directly:
seam export-index --json
# seam pack-index --json remains as a compatibility aliasThe archive contains seam.db, optional vector sidecars, and an embedded manifest.json with the producer version, schema version, repository fingerprint, git metadata, and content flags. The archive is paired with seam-index.sha256.
Step 2 — set SEAM_INDEX_ARTIFACT_URL (per developer)
Add to your shell profile (or .env):
# The {sha} placeholder is expanded by `seam fetch` to the current HEAD SHA,
# then walked back through first-parent history until a published artifact is found.
export SEAM_INDEX_ARTIFACT_URL="https://github.com/<owner>/<repo>/releases/download/seam-index-{sha}/seam-index.tar.gz"Step 3 — run seam fetch
seam fetch # download, verify checksum, rebase CI paths to local, sync delta
seam fetch --semantic # also embed any symbols added locally after the CI cut-pointseam fetch is the one intentional setup-time network path in Seam. Every other command is fully offline — query time makes zero network calls (verified at the syscall level by .github/workflows/no-egress.yml). seam fetch is excluded from that proof because it is a deliberate one-time provisioning download, exactly like seam init --semantic's model bootstrap.
When a checksum sidecar is available, seam fetch --json also reports artifact.verified, the checksum, and the embedded manifest when present. Pre-manifest artifacts with a valid checksum still fetch as verified legacy artifacts and report artifact.manifest=null. Older artifact stores that do not publish seam-index.sha256 still work, but they are reported as unverified so automation can decide whether to reject them.
Local archive workflow
Use the local lifecycle commands when an artifact has already been downloaded or moved through a trusted internal channel:
seam inspect-index /path/to/seam-index.tar.gz --json
seam import-index /path/to/seam-index.tar.gz --path /repo --jsoninspect-index is read-only. import-index requires the checksum sidecar, rejects unsupported manifest/schema versions, refuses repository identity mismatches unless --allow-repo-mismatch is explicit, stages extraction in a temporary directory, opens the staged SQLite DB, rebases paths to the local checkout by default, then atomically swaps .seam/. Git remote and HEAD are used when available; the path fingerprint is a fallback for non-git artifacts. Validation, extraction, rebase, and swap failures preserve the existing index. Pass --sync only when you want local edits incorporated immediately after the artifact has landed.
After seam fetch or seam import-index runs, the index is local. All subsequent reads (seam query, seam impact, seam context, MCP tools) are offline. Run seam fetch again when you want to pull a newer CI build.
Caveats
- Model match: the embedding model used in CI (set via
SEAM_EMBED_MODELin the workflow, defaulting toBAAI/bge-small-en-v1.5) must match the model on developer machines. A mismatch is safe —seam fetchdetects it and the fetched index degrades gracefully to FTS5-only search. Semantic search re-enables automatically afterseam fetch --semantic. - Staleness banner: after
seam fetch, the index reflects the CI commit. If local files have diverged (new code, edited files),seam_impactand other graph tools will attach anindex_status: {stale: true, ...}banner. This is the normal signal to re-runseam fetch(for a newer CI build) orseam sync(to incorporate local edits into the fetched base).
seam workspace lets you query several already-indexed repos as one explicit local trust
set. It does not scan sibling directories, does not merge databases, and does not change
the default single-repo behavior of seam query, seam graph-search, seam impact, MCP
tools, or the Explorer.
seam workspace init /path/to/workspace --json
seam workspace add api /path/to/api /path/to/workspace --json
seam workspace add web /path/to/web /path/to/workspace --json
seam workspace status /path/to/workspace --json
seam workspace graph-search /path/to/workspace --kind route --json
seam workspace route-callers /path/to/workspace --method GET --path /api/users --json
seam workspace matches /path/to/workspace --config-key DATABASE_URL --json
seam workspace impact /path/to/workspace authenticate_user --jsonThe workspace registry lives at .seam/workspace.json under the selected workspace root.
Adding a repo writes only that registry; registered child repos are opened read-only during
status and query commands. Results carry a repo alias plus both local_uid and a
workspace UID formatted as repo_alias:local_uid, so follow-up snippets can target the
right checkout without depending on absolute paths. Config/resource matching keeps the
existing no-secret contract: key names and redacted value shape are allowed, raw values are
not returned.
Grouped by the question an agent is asking. Every tool is read-only; the server never writes the index.
| Tool | Answers | Key args |
|---|---|---|
seam_schema |
"What can this index answer?" — schema version, counts, optional capabilities, freshness, tool guidance, and warnings. | verbose |
seam_architecture |
"What kind of repo is this?" — bounded architecture briefing with physical areas, clusters, entry points, routes, configs, resources, infra, hotspots, boundaries, edge mix, warnings, and next calls. | scope, sections (MCP) / section (CLI/Web), limit, max_bytes |
seam_search |
"Where is text X mentioned?" — FTS5 over names + docstrings + signatures, with fuzzy fallback; hybrid keyword+semantic when enabled. | text, limit, semantic |
seam_query |
"Find all code related to concept X." — FTS5 match + 1-hop graph expansion, rescored by name/path/cluster signals. | concept, limit, semantic |
seam_graph_search |
"Which symbols match this graph shape?" — typed structural discovery by kind, edge kind, degree, path, preset, route/config/resource nodes, and optional one-hop previews. | kind, edge_kind, direction, preset, limit |
seam_snippet |
"Show me the exact code for this result." — bounded source text by UID, symbol, symbol+file, or file+line, with freshness/truncation warnings and optional same-file neighbor hints. | uid, symbol, file, line, include_neighbors |
| Tool | Answers | Key args |
|---|---|---|
seam_context |
"Show me everything about symbol X." — callers, callees, signature, cluster, field_readers/field_writers, and static test_callers/tested_symbols. Resolves bare/qualified/class names and merges homonyms. |
symbol or uid |
seam_context_pack |
"Give me a paste-ready bundle for X." — seam_context + WHY comments + enriched neighbors + cluster peers in one call; neighbors ranked by relevance to the seed. |
symbol |
seam_why |
"Why is this code like this?" — the WHY/HACK/NOTE/TODO/FIXME comments. |
symbol, path |
| Tool | Answers | Key args |
|---|---|---|
seam_impact |
"What breaks if I change X?" — blast radius bucketed into risk tiers, with provenance, summary counts, and a hard byte/entry budget. | target, direction, max_depth, limit, max_bytes |
seam_changes |
"Is my current diff risky?" — git diff → changed symbols → overall risk level. | scope, base_ref |
seam_affected |
"Which tests should I run?" — changed files → impacted test files via reverse-dependency traversal. | changed_files, depth |
| Tool | Answers | Key args |
|---|---|---|
seam_trace |
"How does X reach Y?" — shortest call/dependency path, hop by hop, with edge kind + confidence. | source, target, max_depth |
seam_flows |
"Where does execution start, and where does it go?" — entry points ranked by reach, or one entry's forward call-chain tree. | entry (optional) |
| Tool | Answers | Key args |
|---|---|---|
seam_clusters |
"What are the functional areas?" — Louvain communities (semantic coupling), or the members of one. | cluster_id (optional) |
seam_structure |
"How is the repo laid out?" — the filesystem → directory → file → container tree with symbol counts and area labels. | path, depth, nodes |
Lean mode. The enrichment-carrying tools (
seam_context,seam_trace,seam_impact,seam_context_pack) acceptverbose=false(CLI--lean) to drop heavy provenance fields and shrink the response for tight token budgets.seam_impactadditionally supports a hardmax_bytesceiling and emitsnext_actionshints when results are trimmed.
Twelve languages, parsed with tree-sitter. All share the same tools, symbol kinds, edge kinds, and enrichment fields.
| Language | Extensions | Language | Extensions | |
|---|---|---|---|---|
| Python | .py |
C# | .cs |
|
| TypeScript | .ts .tsx |
Ruby | .rb |
|
| JavaScript | .js .mjs .cjs |
C | .c .h |
|
| Go | .go |
C++ | .cpp .cc .cxx .c++ .hpp .hh .hxx |
|
| Rust | .rs |
PHP | .php |
|
| Java | .java |
Swift | .swift |
Kotlin is not yet supported — the available tree-sitter-kotlin grammar mis-parses common constructs (interfaces, objects, classes with constructors). Tracked for a future release. See
docs/adr/009-swift-support.md.Per-language extraction caveats (e.g.
.hmaps to C; C++ method visibility; Ruby dynamic visibility) are documented indocs/CONCEPTS.md.
Two reproducible, deliberately-honest benchmarks back the pitch.
Token reduction vs. grep + read — docs/benchmark.md
91.8% fewer tokens vs. realistic whole-file grep (88.7% vs. a conservative windowed grep), across 6 structural questions — 314k → 26k estimated tokens. Every query is a win; the largest are the things grep is worst at (
seam_clusters97.9%,seam_search97%).
Reproduce: seam init . && python benchmarks/run_benchmark.py.
Head-to-head vs. CodeGraph / graphify — docs/competitive-benchmark.md
Run on an external repo Seam had never seen:
- Fastest index — 0.26s (others 0.60–4.04s) and smallest footprint — 1.1 MB (others 3.4–45 MB).
- The only tool that writes nothing outside its own
.seam/folder — others mutate yourAGENTS.md/CLAUDE.mdor config on install. - Leanest answer on concept search.
Both are static retrieval-context proxies (chars ÷ 4), not live agent-session A/Bs — the docs state their own limitations. The competitive run predates the Phase 8 impact-tool fix, so current output is leaner than it shows.
A short tour — the full treatment is in docs/CONCEPTS.md.
The graph. Nodes are symbols (function, class, method, interface, type, field, route, config, resource). Edges are typed and capture fifteen relationships:
| Edge kind | Captures |
|---|---|
call |
one symbol invokes another |
import |
a module/symbol import |
extends · implements |
class inheritance / interface conformance |
instantiates |
new Foo() / struct-literal construction |
holds |
a class stores a typed field/property (composition / DI) |
uses |
a function references a user type as a parameter (signature coupling) |
reads · writes |
a field/property is read or written (data-flow) |
http_calls |
a symbol calls a literal HTTP route |
reads_config |
code reads a literal config or env key |
configures |
a config key describes a runtime resource |
raises |
a symbol explicitly raises or throws a visible exception type |
catches |
a symbol explicitly handles a typed exception |
tests |
a test symbol statically exercises or names a production symbol |
Edges are keyed by symbol name, not row id — this is what lets the watcher re-index one file independently without rewriting the whole graph. Route, config, and resource nodes live as normal symbols.kind values; route metadata lives in routes, config metadata lives in config_keys, and resource metadata lives in resources. Config metadata stores key names and redacted value shape only, never raw values. Docker Compose and Dockerfile evidence is indexed as resource nodes for services, images, Dockerfiles, build contexts, ports, stages, env files, volumes, and networks; dynamic/interpolated values are skipped. Traversal is kind-agnostic, and exception/test edges are exposed through graph search, trace, context, and architecture. Default blast-radius impact intentionally excludes raises/catches and tests so failure-path and test-evidence relationships do not inflate production change-risk tiers.
For infra evidence, start with seam architecture --section infra --json, then use seam graph-search --kind resource --json for concrete resource nodes and previews.
Confidence tiers. Each edge resolves to EXTRACTED (target is unambiguous), AMBIGUOUS (name collides — verify), or INFERRED (heuristic / cross-module). A multi-hop path is only as strong as its weakest hop. Each result carries resolved_by provenance explaining how the tier was decided.
Risk tiers. seam_impact and seam_changes bucket dependents by distance: WILL_BREAK (d=1, must update), LIKELY_AFFECTED (d=2, should test), MAY_NEED_TESTING (d≥3, test if critical). seam_changes rolls these up into low → medium → high → critical.
Clusters. A pure-Python Louvain pass groups symbols into functional areas by coupling. Labels are deterministic by default (dir/file — top symbol) or, opt-in, LLM-generated at index time only.
Edge synthesis. Static parsing can't see runtime polymorphism. A post-pass over the whole graph synthesizes the edges parsing structurally misses — interface→implementation fan-out, closure-collection dispatch, event-emitter handlers — tagged with their provenance channel.
Semantic search. Opt-in local embeddings (fastembed, ONNX on CPU) merge with FTS5 via Reciprocal Rank Fusion, so "retry logic" can surface _backoff_with_jitter even with no shared token. The model downloads once, then runs 100% locally.
KNN scaffold via sqlite-vec (WS2b). The optional [semantic-ann] extra wires in sqlite-vec as a third tier in the read path (vec0 KNN → mmap → SQL brute-force), each falling through to the next on any structural issue. This tier is off by default (SEAM_VEC_ANN=off) and ships as a forward-compatible scaffold, not a performance improvement today.
Benchmark reality (sqlite-vec v0.1.9): sqlite-vec's
vec0virtual table performs exact brute-force KNN — it has no approximate-nearest-neighbour index (no HNSW, no IVF). Measured on synthetic 384-dim float32 embeddings it is ~5× slower than the built-in numpy matmul path at every scale tested (10k–250k rows), with perfect recall@10 = 1.000. EnableSEAM_VEC_ANN=ononly if you are testing the scaffold or waiting for sqlite-vec to ship a true approximate index.
N rows numpy BF ms sqlite-vec ms speedup recall@10 10,000 0.588 3.919 0.2× 1.000 50,000 5.903 20.496 0.3× 1.000 100,000 7.053 41.609 0.2× 1.000 250,000 19.740 109.373 0.2× 1.000
When sqlite-vec ships a true approximate index (HNSW/IVF), enabling SEAM_VEC_ANN=on will transparently upgrade from exact to approximate without any re-index of the embeddings table — the scaffold is there, waiting. To opt in:
pip install 'seam-code[semantic-ann]'
SEAM_VEC_ANN=on seam init --semantic # builds embeddings + vec0 KNN tableTwo knobs govern the vec0 tier:
| Variable | Default | Effect |
|---|---|---|
SEAM_VEC_ANN |
off |
Master switch. on enables the vec0 KNN tier (off = byte-identical to pre-WS2b). Leave off until sqlite-vec ships approximate indexing. |
SEAM_VEC_ANN_MIN_ROWS |
50000 |
Minimum embedding count before the vec0 table is built. Forward-compat gate; does not indicate a performance crossover today (vec0 is currently slower at all scales). |
Packaging caveat. sqlite-vec is a native C extension. Some Python builds — notably the macOS system Python — disable conn.enable_load_extension() at compile time. On those builds Seam detects the failure, logs a single clear notice, and falls back to the mmap/brute-force path automatically. Nothing crashes; no results are lost. To use the vec0 tier on macOS, install Python via python.org or brew install python (both enable extension loading), or run via uv (which uses its own Python build that supports extension loading).
Staleness banner. Graph-traversal tools attach an index_status banner when the index has drifted from disk — so an agent is never silently handed wrong blast-radius answers.
Everything is environment-variable driven with sensible defaults — see docs/CONFIGURATION.md for the full reference (~50 knobs). The few you might actually set:
| Variable | Default | Effect |
|---|---|---|
SEAM_SEMANTIC |
off |
Enable hybrid keyword + semantic search (needs the semantic extra + seam init --semantic). |
SEAM_CLUSTER_NAMING |
deterministic |
llm opts into LLM cluster labels at index time (needs SEAM_LLM_API_KEY). Read path stays 100% local regardless. |
SEAM_IMPACT_MAX_BYTES |
0 (off) |
Hard character ceiling on seam_impact output for tight context budgets. |
Most knobs are gated on by default and have an off that restores byte-identical pre-feature behavior — a deliberate discipline so upgrades never silently change tool output.
seam serve (the [web] extra) starts a read-only, 127.0.0.1-only browser explorer for the index.
uv sync --extra web
seam init # index first
seam serve # opens http://127.0.0.1:7420
seam serve --no-open --port 8000A React + TypeScript SPA (React Flow) served by FastAPI. Nothing leaves the machine. Features: command-palette search, a depth-1 caller/callee card-canvas with confidence-styled edges, lazy expand, a resizable detail panel with grouped clickable caller/callee rows (edge kind + confidence badges), a GraphHUD (live node/edge/filtered counts + freshness dot), a FilterBar (node kinds + edge kinds + confidence tiers with All/None controls and live counts), a file-tree sidebar with debounced search, an impact overlay that paints blast radius by risk tier, a fly-to-fit viewport on overlay activation, a trace-path highlighter, a git-changes drawer, a schema/architecture read API, a whole-repo cluster constellation, and a 3D Constellation Explorer tab. Explorer routes reuse the same handlers that power the CLI/MCP tools — a third transport, no query logic duplicated. Filter preferences (node kinds, edge kinds, confidence) persist across page reloads via localStorage and survive symbol navigation within a session.
The Constellation tab renders the entire indexed graph as an interactive star field:
- Nodes are spheres sized and colored by degree and symbol kind (red dwarf → blue giant stellar scale). Hub classes appear as large blue/white spheres; isolated helpers are small red dots.
- Edges are additive-blended line segments color-coded by kind (teal = call, blue = import, purple = extends, cyan = holds, green = reads, red = writes, amber = uses).
- Cluster halos are faint translucent spheres marking the spatial extent of each Louvain functional area. Click any node to fly the camera there and open a detail panel with neighbors.
- Filters (left panel): show/hide node kinds and edge kinds. HUD (top-left): visible counts, max-nodes slider.
The layout is computed server-side (ForceAtlas2 + ring seeds in numpy) and cached per index version, so the browser tab always loads pre-positioned positions with no client-side simulation cost. Positions are deterministic — the same index always produces the same layout.
pip install 'seam-code[web]'
seam init # index first
seam serve # opens http://127.0.0.1:7420 → click the "Constellation" tabThe import hierarchy is strictly layered — read flows down, never up:
cli / server / web → analysis → query → indexer / db
analysis is built from pure leaf modules (no DB, no IO, never raise) — clustering, RWR neighbor ranking, byte budgeting, the truncation steer, staleness detection. This makes each piece testable in isolation and keeps the failure surface tiny.
- Read it as a narrative:
docs/ARCHITECTURE.md— current system overview, then the full phase-by-phase history. - Read it as a diagram:
docs/architecture.html— a standalone illustrated page (system pipeline, layered hierarchy, data-flow, edge-kind graph, schema). - Decision rationale:
docs/adr/— architecture decision records. - Concepts in depth:
docs/CONCEPTS.md— how and why each subsystem works.
These are the non-negotiables — and they are guarantees to the user, not just internal rules:
- Zero external services at runtime. The MCP read path makes no network calls. The only optional outbound call (LLM cluster naming) runs at index time, is off by default, and falls back to deterministic labels on any error.
- SQLite only. No graph DB, no vector DB to babysit, no ORM. One file per project, gitignored.
- Parsers never raise. A malformed file is skipped, not fatal. The indexer degrades gracefully; analysis leaves return empty rather than throw.
- Edges are keyed by name. This is what makes independent per-file re-indexing correct — the watcher can rewrite one file's symbols and edges without touching the rest of the graph.
- Additive by default. New features ship behind a defaulted-on switch with an
offthat is byte-identical to before. Schema changes are additive migrations that auto-run on open.
uv sync --dev # install dev dependencies
make gate # lint (ruff) + typecheck (mypy) + tests — must be green before every commit
make fmt # format + autofix (not part of the gate)
make build-web # build the Explorer SPA into seam/_web/ (requires Node.js — build-time only)
make eval # run the recall-regression harness
make test-npm # run the npm shim vitest suite (requires Node.js ≥18)
make bench-semantic # semantic recall benchmark — needs [semantic] extra + seam init --semantic
make bench-semantic-ann # ANN scale benchmark (brute-force vs KNN latency + recall) — needs [semantic-ann] extra
make soak # sustained mixed read load (P5.5 diagnostics harness)The npm shim (pkg/npm/) is published manually from that directory, trailing the matching PyPI release:
# Step 1: ensure seam-code==<version> is live on PyPI
# Step 2: bump pkg/npm/package.json version to match pyproject.toml version
make gate # the version-parity test fails if they differ
# Step 3: publish
cd pkg/npm
npm publish --access publicThe gate test test_npm_package_version_matches_pyproject (in tests/unit/test_smoke.py) fails if pkg/npm/package.json version diverges from pyproject.toml version, preventing accidental drift.
Conventions: max 200 lines/function, 1000 lines/file · all imports at top · config only from seam/config.py · type hints required (X | None, not Optional[X]) · tests in tests/ mirroring the package.
See CONTRIBUTING.md for setup, the gate workflow, and project conventions. Bug reports and feature requests: use the issue templates. Security issues: see SECURITY.md. Participation is governed by the Code of Conduct.
See LICENSE.

