GroundMemory - Documentation

This document covers installation, integration guides, configuration reference, and environment variables. For a project overview and quick start, see README.md.

Installation & Configuration

Option 1 - Docker

Docker is the recommended way to run GroundMemory. It requires no Python environment setup and keeps your workspace data in a local ./data directory.

git clone https://github.com/huss-mo/GroundMemory && cd GroundMemory
cp groundmemory/config/.env.example .env
docker compose up -d
# → listening on http://127.0.0.1:4242/mcp

The default compose file starts a single GroundMemory service using BM25-only search (no embedding API required). Edit .env to switch providers - see Embedding Providers below.

To run a second workspace on a different port (e.g. for a separate project or user), uncomment the GroundMemory-personal service in docker-compose.yml:

# GroundMemory-personal:
#   build:
#     context: .
#   image: groundmemory:latest
#   restart: unless-stopped
#   ports:
#     - "4243:4242"
#   volumes:
#     - ./data:/data
#   env_file:
#     - .env
#   environment:
#     GROUNDMEMORY_WORKSPACE: personal

Workspace data is stored in ./data/<workspace-name>/ on the host and persists across container restarts.

Building with sentence-transformers (local embeddings)

The default Docker image does not include sentence-transformers. To build an image that supports the local embedding provider, pass the EXTRAS=local build argument:

docker compose build --build-arg EXTRAS=local
docker compose up -d

Then set GROUNDMEMORY_EMBEDDING__PROVIDER=local in your .env.

Option 2 - pip

For development or direct integration without Docker:

# BM25-only - no extra dependencies
pip install groundmemory

# With local sentence-transformers embeddings
pip install "groundmemory[local]"

Then start the MCP server:

GROUNDMEMORY_WORKSPACE=my-project groundmemory-mcp
# → listening on http://127.0.0.1:4242/mcp

Configuration for pip installs

GroundMemory reads config from ~/.groundmemory/ - the same directory where workspace data lives. Place your config files there once and they'll be found regardless of which directory you run groundmemory-mcp from:

~/.groundmemory/
├── .env                  ← environment-variable style config
├── groundmemory.yaml       ← YAML style config
└── default/              ← workspace data (auto-created on first run)
    ├── MEMORY.md
    └── ...

Both .env and groundmemory.yaml are optional - use whichever format you prefer (or neither, and set env vars directly). Environment variables always take priority over config files.
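
The variable names used throughout this document suggest a simple mapping between the two formats: strip the GROUNDMEMORY_ prefix and treat a double underscore as a nesting separator (GROUNDMEMORY_MCP__API_KEY corresponds to mcp.api_key). The sketch below illustrates that mapping and the "environment wins" rule. Both helper names are hypothetical and this is not GroundMemory's actual loader, which is built on Pydantic Settings.

```python
# Illustrative sketch only: env_to_key and resolve are hypothetical helpers,
# not part of the GroundMemory package.
PREFIX = "GROUNDMEMORY_"


def env_to_key(name: str) -> str:
    """Map an env-var name to its dotted config key, e.g. mcp.host."""
    if not name.startswith(PREFIX):
        raise ValueError(f"not a GroundMemory variable: {name}")
    return name[len(PREFIX):].lower().replace("__", ".")


def resolve(file_config: dict, environ: dict) -> dict:
    """Overlay environment values on top of values read from a config file."""
    merged = dict(file_config)
    for name, value in environ.items():
        if name.startswith(PREFIX):
            merged[env_to_key(name)] = value  # environment takes priority
    return merged


file_config = {"embedding.provider": "none", "mcp.port": "4242"}
environ = {"GROUNDMEMORY_EMBEDDING__PROVIDER": "local", "HOME": "/home/me"}
print(resolve(file_config, environ))
```

Unrelated variables such as HOME are ignored, and a key set in both places ends up with the value from the environment.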

On first run, groundmemory-mcp automatically copies an annotated example config to ~/.groundmemory/groundmemory.yaml.example - the full YAML reference with every option documented. You can also find it in the repository at groundmemory/config/groundmemory.yaml.example.

For environment-variable style config, copy the bundled example manually:

cp groundmemory/config/.env.example ~/.groundmemory/.env

A cwd-level config file (./groundmemory.yaml or ./.env) is also checked as a fallback, which is useful for per-project overrides in dev mode.

Network Access

By default, GroundMemory binds to 127.0.0.1 (localhost only). This means only processes on the same machine can reach the server - which is the right default for a single-user setup.

LAN access (same local network)

To accept connections from other devices on your network, set the host to 0.0.0.0 and add your server's LAN address to the allowed-hosts list:

# pip / direct install
GROUNDMEMORY_MCP__HOST=0.0.0.0 \
GROUNDMEMORY_MCP__ALLOWED_HOSTS="192.168.1.50:4242" \
groundmemory-mcp

# groundmemory.yaml
mcp:
  host: 0.0.0.0
  allowed_hosts: "192.168.1.50:4242"

For Docker, uncomment the two required network-access lines in docker-compose.yml (see the comments in that file).

allowed_hosts is the DNS-rebinding protection allowlist - it controls which Host: header values the server accepts. List every address clients will use to reach the server (e.g. 192.168.1.50:4242). localhost and 127.0.0.1 are always allowed implicitly. Only exact strings are supported - wildcards and CIDR ranges are not.
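
A minimal sketch of such an exact-match Host check follows. The function name is hypothetical, and treating localhost and 127.0.0.1 as allowed on any port is an assumption; the server's real check may differ.

```python
# Illustrative sketch only: host_allowed is a hypothetical helper.
IMPLICIT_HOSTS = {"localhost", "127.0.0.1"}


def host_allowed(host_header: str, allowed_hosts: set[str]) -> bool:
    """Return True if the Host header is an exact allowlist entry.

    No wildcard or CIDR matching is attempted. Allowlist entries are
    assumed to be lowercase.
    """
    host = host_header.strip().lower()
    if host in allowed_hosts:
        return True
    # Assumption: loopback names are accepted regardless of port.
    hostname = host.rsplit(":", 1)[0] if ":" in host else host
    return hostname in IMPLICIT_HOSTS


allowed = {"192.168.1.50:4242"}
print(host_allowed("192.168.1.50:4242", allowed))  # listed address
print(host_allowed("192.168.1.51:4242", allowed))  # unlisted neighbour
```

Because the comparison is on the full string, a client that reaches the server through a different address or port must have that exact value listed.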

Authentication

When exposing the server beyond localhost, set GROUNDMEMORY_MCP__API_KEY (or mcp.api_key in YAML) to a secret token. Every request must then include:

Authorization: Bearer <your-token>

MCP clients that support custom headers (Cline, Cursor, Claude Desktop) can be configured like this:

{
  "mcpServers": {
    "GroundMemory": {
      "url": "http://192.168.1.50:4242/mcp",
      "headers": {
        "Authorization": "Bearer <your-secret-token>"
      }
    }
  }
}

When api_key is not set (the default), no authentication is enforced and requests are accepted without an Authorization header.
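
The bearer-token rule can be sketched as follows. This is an illustration of the behaviour described in this section, not the server's code; is_authorized is a hypothetical name, and the constant-time comparison is a general good practice rather than something the source confirms.

```python
# Illustrative sketch only: is_authorized is a hypothetical helper.
import hmac
from typing import Optional


def is_authorized(headers: dict, api_key: Optional[str]) -> bool:
    """Decide whether a request may proceed under the bearer-token scheme."""
    if not api_key:
        return True  # api_key unset: authentication is disabled
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(token.encode(), api_key.encode())


print(is_authorized({"Authorization": "Bearer s3cret"}, "s3cret"))
print(is_authorized({}, "s3cret"))
```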

Public internet access

Do not expose GroundMemory directly on the public internet. Use a reverse proxy (nginx, Caddy, Traefik) with TLS in front of it, and set api_key for authentication.

The GROUNDMEMORY_MCP__FORWARDED_ALLOW_IPS setting controls which upstream IPs uvicorn trusts to pass X-Forwarded-For / X-Real-IP headers. Set it to your proxy's internal IP when running behind a reverse proxy (default: 127.0.0.1).

Embedding Providers

GroundMemory supports three embedding providers. You can switch between them at any time by changing GROUNDMEMORY_EMBEDDING__PROVIDER (or embedding.provider in groundmemory.yaml). No data migration is required.

Provider	Value	Extra install required?	When to use
BM25-only	`none`	No	Default. Pure keyword search via SQLite FTS5. Works offline, no API key needed.
OpenAI-compatible API	`openai`	No	Any HTTP embedding API: OpenAI, Ollama, LM Studio, LiteLLM, vLLM, Mistral, etc. Requires `GROUNDMEMORY_EMBEDDING__BASE_URL` and `GROUNDMEMORY_EMBEDDING__API_KEY`.
Local sentence-transformers	`local`	Yes - `pip install "groundmemory[local]"` or `--build-arg EXTRAS=local` for Docker	Fully offline vector embeddings. Downloads model on first run.

none - BM25-only (default)

No configuration needed. GroundMemory uses SQLite FTS5 for all search. Ideal for getting started quickly or for air-gapped environments.

GROUNDMEMORY_EMBEDDING__PROVIDER=none groundmemory-mcp

openai - OpenAI-compatible HTTP API

Works with any endpoint that follows the OpenAI embeddings API format. No extra Python packages are required - only httpx, which is a core dependency already installed with GroundMemory.

# Real OpenAI
GROUNDMEMORY_EMBEDDING__PROVIDER=openai \
GROUNDMEMORY_EMBEDDING__API_KEY=sk-... \
GROUNDMEMORY_EMBEDDING__MODEL=text-embedding-3-small \
groundmemory-mcp

# Ollama (local server; the API key value is only a placeholder)
GROUNDMEMORY_EMBEDDING__PROVIDER=openai \
GROUNDMEMORY_EMBEDDING__BASE_URL=http://localhost:11434/v1 \
GROUNDMEMORY_EMBEDDING__API_KEY=ollama \
GROUNDMEMORY_EMBEDDING__MODEL=nomic-embed-text \
groundmemory-mcp

# LM Studio
GROUNDMEMORY_EMBEDDING__PROVIDER=openai \
GROUNDMEMORY_EMBEDDING__BASE_URL=http://localhost:1234/v1 \
GROUNDMEMORY_EMBEDDING__API_KEY=lm-studio \
GROUNDMEMORY_EMBEDDING__MODEL=nomic-ai/nomic-embed-text-v1.5-GGUF \
groundmemory-mcp

local - sentence-transformers (offline)

Runs a local embedding model entirely on your machine - no network call, no API key. Requires installing the optional local extra, which pulls in sentence-transformers and its dependencies (PyTorch, Transformers, etc.). The model is downloaded from HuggingFace on first use.

# Install the extra first
pip install "groundmemory[local]"

GROUNDMEMORY_EMBEDDING__PROVIDER=local \
GROUNDMEMORY_EMBEDDING__LOCAL_MODEL=all-MiniLM-L6-v2 \
groundmemory-mcp

MCP Server

GroundMemory can run as a standalone MCP (Model Context Protocol) server over HTTP, exposing its memory tools to any MCP-compatible client - including Claude Desktop, Cursor, Cline, and custom agents.

Each server instance owns a single workspace. Multiple workspaces require multiple server processes running on different ports.

Running the Server

The groundmemory-mcp command is available after installing GroundMemory (MCP support is included by default).

# Default: workspace "default", host 127.0.0.1, port 4242
groundmemory-mcp

# Custom workspace
GROUNDMEMORY_WORKSPACE=my-project groundmemory-mcp

# Custom host and port
GROUNDMEMORY_MCP__HOST=0.0.0.0 GROUNDMEMORY_MCP__PORT=9000 groundmemory-mcp

The server starts at http://<host>:<port>/mcp using the streamable-http MCP transport.

Client Configuration

GroundMemory speaks standard MCP over HTTP, so any MCP-compatible client works. The table below shows the most common categories and examples - the list is not exhaustive.

Category	Examples	How to connect
AI coding assistants	Cursor, Cline, Windsurf, Claude Code, Codex CLI	Add the JSON snippet below to your client's MCP server config
AI desktop clients	Claude Desktop, Open WebUI	Add the JSON snippet below via the client's settings
Agent frameworks & platforms	LangChain, LangGraph, CrewAI, AutoGen, Google ADK, LiteLLM, n8n	Python API (`MemorySession`) or point an HTTP tool node at the MCP endpoint

{
  "mcpServers": {
    "GroundMemory": {
      "url": "http://<server-ip>:4242/mcp"
    }
  }
}

For clients that use the stdio transport, add the following block instead:

{
  "mcpServers": {
    "GroundMemory": {
      "command": "npx",
      "args": [
        "mcp-remote@latest", 
        "http://<server-ip>:4242/mcp", 
        "--allow-http"
      ]
    }
  }
}

If an API key is set on the server, add --header and the token value to the args list (both lines are required):

{
  "mcpServers": {
    "GroundMemory": {
      "command": "npx",
      "args": [
        "mcp-remote@latest",
        "http://<server-ip>:4242/mcp",
        "--allow-http",
        "--header",
        "Authorization: Bearer your-secret-token"
      ]
    }
  }
}

Clients that support the MCP Prompts primitive (such as Cline and Claude Desktop) will also show a memory_bootstrap_prompt entry in their Prompts panel - click it at session start to inject memory context without waiting for the agent to call the tool. For agent frameworks and platforms that use the Python API, see Connecting to Your AI Agent Using The Python API.

Available MCP Tools

Once connected, the client has access to 4 core tools and 1 prompt. Two additional tools are available behind config flags.

Core tools (always registered)

Tool	Description
`memory_bootstrap`	Call once at session start. Returns the full memory context (MEMORY.md, USER.md, AGENTS.md, RELATIONS.md, daily logs) as a formatted string.
`memory_read`	Unified read tool. Supply `query` for hybrid search (SEARCH mode) or `file` for direct file/line-range access (GET mode).
`memory_write`	Unified write tool. APPEND, REPLACE_TEXT, REPLACE_LINES, or DELETE - mode is selected by the combination of parameters supplied.
`memory_relate`	Record a typed entity relationship (`subject → predicate → object`) with semantic deduplication.

Optional tools (config-gated)

Tool	Config flag	Description
`memory_list`	`mcp.expose_memory_list: true`	List workspace files with sizes and line counts.
`memory_tool`	`mcp.dispatcher_mode: true`	Single dispatcher tool - replaces all four core tools with one `action` + `args` call.

Prompt	Description
`memory_bootstrap_prompt`	Same content as `memory_bootstrap`, exposed as an MCP Prompt for clients that support the Prompts primitive (Cline, Claude Desktop). Click it in your client's Prompts panel at session start instead of waiting for the agent to call the tool.

Bootstrap - Loading Memory at Session Start

GroundMemory's memory context (long-term facts, user profile, agent instructions, entity graph, daily logs) needs to be loaded at the start of each session. Two mechanisms are provided:

Tool-based bootstrap (all clients)

The memory_bootstrap tool description is written so that most agents call it automatically at the start of a session without any explicit instruction - the tool's description alone signals that it should be the first action taken. No system-prompt changes are necessary in most cases.

If you find that your agent does not call memory_bootstrap on its own, you can add an explicit fallback instruction to the system prompt:

At the start of every session, call memory_bootstrap before doing anything else.
Use the returned context as your background knowledge for the rest of the session.

Prompt-based bootstrap (Cline, Claude Desktop)

Clients that support the MCP Prompts primitive (Cline, Claude Desktop) will show a memory_bootstrap_prompt entry in their Prompts panel. Click it at the start of a session to inject the memory context directly into the conversation - no agent tool call required. This is an alternative to the tool-based path, useful when you want to load memory context manually rather than waiting for the agent to call the tool.

The content returned by both mechanisms is identical.

Connecting to Your AI Agent Using The Python API

GroundMemory exposes standard JSON schemas for function calling, compatible with OpenAI and Anthropic out of the box. The primary export is ALL_TOOLS - a list of (schema, run) pairs. Pass the schemas to the model so it knows what tools are available; when the model calls a tool, dispatch it back through the paired run function or through session.execute_tool. The examples below use session.execute_tool.

OpenAI

from openai import OpenAI
from groundmemory.session import MemorySession
from groundmemory.tools import ALL_TOOLS
import json

session = MemorySession.create("my-project")
client = OpenAI()

# Build the tools list for the API call
tools = [{"type": "function", "function": schema} for schema, _ in ALL_TOOLS]

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": session.bootstrap()},
        {"role": "user", "content": "What do you remember about my preferences?"},
    ],
    tools=tools,
)

# Dispatch tool calls the model makes back to GroundMemory
for call in response.choices[0].message.tool_calls or []:
    result = session.execute_tool(call.function.name, **json.loads(call.function.arguments))
    print(result)

Anthropic

from anthropic import Anthropic
from groundmemory.session import MemorySession
from groundmemory.tools import ALL_TOOLS

session = MemorySession.create("my-project")
client = Anthropic()

# Anthropic tool schema uses input_schema instead of parameters
tools = [
    {
        "name": schema["name"],
        "description": schema["description"],
        "input_schema": schema["parameters"],
    }
    for schema, _ in ALL_TOOLS
]

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    system=session.bootstrap(),
    messages=[{"role": "user", "content": "Summarize what you know about me."}],
    tools=tools,
)

# Dispatch tool use blocks back to GroundMemory
for block in response.content:
    if block.type == "tool_use":
        result = session.execute_tool(block.name, **block.input)
        print(result)

Complete runnable agents - examples/openai_agent.py and examples/anthropic_agent.py show a full interactive loop with workspace sync on startup and graceful shutdown. Run them with:

uv run python examples/openai_agent.py
uv run python examples/anthropic_agent.py

Python API Example

from groundmemory.session import MemorySession

# Create (or reopen) a named workspace
session = MemorySession.create("my-project")

# Append a long-term memory
session.execute_tool("memory_write", file="MEMORY.md", content="User prefers concise answers.")

# Append a daily log entry
session.execute_tool("memory_write", file="daily", content="Working on the auth service refactor.")

# Record a relationship between entities
session.execute_tool("memory_relate", subject="Alice", predicate="works_at", object="Acme Corp")

# Search across all memory tiers (SEARCH mode)
result = session.execute_tool("memory_read", query="communication preferences")
for item in result["results"]:
    print(item["content"])

# Read a specific file (GET mode)
result = session.execute_tool("memory_read", file="USER.md")
print(result["content"])

# Read a line range (1-indexed, GET mode)
result = session.execute_tool("memory_read", file="USER.md", start_line=5, end_line=10)
print(result["content"])

# Replace the first occurrence of a string (REPLACE_TEXT mode)
session.execute_tool("memory_write", file="USER.md", search="old text", content="new text")

# Replace a line range (REPLACE_LINES mode)
session.execute_tool("memory_write", file="USER.md", start_line=5, end_line=7, content="new content")

# Hard-delete a line range - physically removes lines, no tombstone (DELETE mode)
session.execute_tool("memory_write", file="USER.md", start_line=5, end_line=7, content="")

# Build the system prompt context block for your agent
system_prompt = session.bootstrap()

MemorySession.create("my-project") creates a workspace directory at ~/.groundmemory/my-project/ on first run, seeding all required files. Subsequent calls reopen the same workspace.

Tools Reference

Use session.execute_tool(name, **kwargs) to call tools programmatically, or pass the schemas from build_tool_registry(config) to your model framework.

Immutability rule: MEMORY.md and all daily/*.md files are append-only. The DELETE, REPLACE_TEXT, and REPLACE_LINES modes of memory_write will reject any attempt to modify them. Use memory_write in APPEND mode to add new information to these files.
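
The rule can be pictured with a small guard like the one below. It is a sketch under stated assumptions: the real is_immutable helper lives in the tools package and its exact matching logic is not shown in this document, and check_write is a hypothetical name.

```python
# Illustrative sketch only; not the implementation shipped with GroundMemory.
from fnmatch import fnmatchcase

EDIT_MODES = {"REPLACE_TEXT", "REPLACE_LINES", "DELETE"}


def is_immutable(file: str) -> bool:
    """True for MEMORY.md and for any daily log under daily/."""
    return file == "MEMORY.md" or fnmatchcase(file, "daily/*.md")


def check_write(file: str, mode: str) -> None:
    """Raise if an edit mode targets an append-only file (hypothetical helper)."""
    if mode in EDIT_MODES and is_immutable(file):
        raise PermissionError(f"{file} is append-only; use APPEND mode instead")


check_write("USER.md", "DELETE")    # allowed: USER.md is mutable
check_write("MEMORY.md", "APPEND")  # allowed: appending is always permitted
```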

Hard-delete: DELETE mode physically removes lines from the file - no tombstone comment is written.

memory_read - Unified read tool

Parameter	Type	Description
`query`	string (optional)	Natural-language search query → SEARCH mode
`file`	string (optional)	File to read → GET mode (or SEARCH filter when combined with `query`)
`top_k`	int (optional)	Max results to return in SEARCH mode
`start_line`	int (optional)	1-based first line to return in GET mode
`end_line`	int (optional)	1-based last line (inclusive) in GET mode

Mode dispatch: query alone → SEARCH; file alone → GET; both → SEARCH restricted to file (see the source filters below); neither → error.

memory_write - Unified write tool

Mode	Trigger	Parameters
APPEND	No `search`, no `start_line`/`end_line`	`file`, `content` (+ optional `tags`)
REPLACE_TEXT	`search` provided	`file`, `search`, `content`
REPLACE_LINES	`start_line` + `end_line` + non-empty `content`	`file`, `start_line`, `end_line`, `content`
DELETE	`start_line` + `end_line` + `content=""`	`file`, `start_line`, `end_line`, `content=""`
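
The trigger column above can be read as a small decision function. The sketch below is one plausible reading, not the library's code: select_write_mode is a hypothetical name, and giving search priority over a line range when both are supplied is an assumption.

```python
# Illustrative sketch only: select_write_mode is a hypothetical helper.
from typing import Optional


def select_write_mode(
    content: str,
    search: Optional[str] = None,
    start_line: Optional[int] = None,
    end_line: Optional[int] = None,
) -> str:
    """Pick the memory_write mode from the combination of parameters."""
    if search is not None:
        return "REPLACE_TEXT"  # assumption: search wins over a line range
    if start_line is not None and end_line is not None:
        # An empty replacement body means the lines are removed outright.
        return "DELETE" if content == "" else "REPLACE_LINES"
    if start_line is None and end_line is None:
        return "APPEND"
    raise ValueError("start_line and end_line must be supplied together")


print(select_write_mode("User prefers concise answers."))
print(select_write_mode("", start_line=5, end_line=7))
```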

APPEND targets and their destination files:

`file` value	Written to	Behaviour
`MEMORY.md`	`MEMORY.md`	Appended permanently. Immutable - append only.
`daily`	`daily/YYYY-MM-DD.md`	Appended to today's log. Immutable - append only.
`USER.md`	`USER.md`	Updates the stable user profile. Mutable.
`AGENTS.md`	`AGENTS.md`	Updates agent operating instructions. Mutable.

memory_read source filters (SEARCH mode, pass file= to restrict):

`file` value	Searches
(omitted)	All files
`MEMORY.md`	Long-term memory only
`USER.md`	User profile only
`AGENTS.md`	Agent instructions only
`daily`	All daily logs
`RELATIONS.md`	Relations file + SQLite graph

memory_relate - Record a typed entity relationship

Parameter	Type	Default	Description
`subject`	string	required	Source entity (e.g. "Alice")
`predicate`	string	required	Relationship type (e.g. "works_at")
`object`	string	required	Target entity (e.g. "Acme Corp")
`note`	string	`""`	Optional free-text annotation
`source_file`	string	`"RELATIONS.md"`	Workspace-relative file
`confidence`	float	`1.0`	Confidence score 0.0–1.0
`supersedes`	bool	`false`	Delete all prior `(subject, predicate)` triples before writing

memory_list (optional - requires mcp.expose_memory_list: true)

Lists all workspace files with sizes and line counts. No required parameters.

memory_tool (optional - requires mcp.dispatcher_mode: true)

Single dispatcher that replaces all four core tools. Pass action (one of read, write, bootstrap, relate, list, describe) and args (the same parameters the underlying tool accepts).

Architecture

Architectural Layers

1. Workspace (`GroundMemory/core/workspace.py`)

Workspace is a pure filesystem abstraction for a single memory workspace. On first use it creates the directory tree and seeds the default Markdown files (MEMORY.md, USER.md, AGENTS.md, RELATIONS.md). All other layers receive a Workspace object to resolve file paths - it never holds runtime state such as a database connection or embedding provider.

The on-disk layout is a single directory level under ~/.groundmemory:

~/.groundmemory/
└── <workspace_name>/          ← one directory per named workspace
    ├── MEMORY.md              long-term curated memory
    ├── USER.md                stable user profile
    ├── AGENTS.md              agent operating instructions
    ├── RELATIONS.md           human-readable entity relation graph
    ├── daily/
    │   └── YYYY-MM-DD.md     append-only daily logs
    └── .index/
        └── memory.db          SQLite index (chunks + FTS5 + relations + embeddings)

2. Memory Storage (`GroundMemory/core/storage.py`)

Low-level atomic Markdown file I/O. All writes go through a temp-file + rename cycle to prevent partial writes. Provides write_long_term, write_daily, read_file, delete_lines, and list_daily_files.
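
The temp-file + rename pattern looks roughly like this. It is a generic sketch of the technique using only the standard library; atomic_write and atomic_append are hypothetical names, not the functions in storage.py.

```python
# Illustrative sketch of the temp-file + rename pattern.
import os
import tempfile
from pathlib import Path


def atomic_write(path: Path, text: str) -> None:
    """Write text so readers see either the old file or the new one, never a partial."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # atomic swap into place
    except BaseException:
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def atomic_append(path: Path, text: str) -> None:
    """Append by rewriting the whole file through atomic_write."""
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    atomic_write(path, existing + text)
```

If the process dies before os.replace runs, the original file is untouched and only a stray temp file is left behind.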

3. Text Chunker (`GroundMemory/core/chunker.py`)

Splits Markdown files into overlapping chunks that respect heading boundaries. Each Chunk carries a deterministic chunk_id (SHA-256 of path + start line + text) and 1-indexed line ranges for precise retrieval through memory_read in GET mode.
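
A deterministic ID of this kind can be sketched as below. The newline separator between the three fields and the hex encoding are assumptions; only the inputs (path, start line, text) and the hash function come from the description above.

```python
# Illustrative sketch only; the real chunker may join the fields differently.
import hashlib


def chunk_id(path: str, start_line: int, text: str) -> str:
    """Derive a stable ID from the chunk's location and content."""
    payload = f"{path}\n{start_line}\n{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = chunk_id("MEMORY.md", 12, "User prefers concise answers.")
again = chunk_id("MEMORY.md", 12, "User prefers concise answers.")
print(first == again)  # same inputs always give the same ID
```

Because the ID depends only on its inputs, re-chunking an unchanged file yields the same IDs, which is what lets unchanged chunks be recognised across sync runs.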

4. Embedding Providers (`GroundMemory/core/embeddings.py`)

Abstract EmbeddingProvider with three concrete implementations:

Provider	Class	Notes
`none`	`NullEmbeddingProvider`	Returns empty vectors; BM25-only path, no network or GPU
`openai`	`OpenAICompatibleProvider`	HTTP calls via `httpx`; works with any OpenAI-compatible endpoint
`local`	`SentenceTransformerProvider`	Runs a local model in-process via `sentence-transformers`

For provider configuration, install commands, and usage examples, see Embedding Providers.

5. Memory Index (`GroundMemory/core/index.py`)

SQLite database (memory.db) with five tables:

Table	Purpose
`files`	Tracks indexed files with SHA-256 hash + mtime for change detection
`chunks`	Text chunks with JSON-serialised embedding vectors
`chunks_fts`	FTS5 virtual table - BM25 keyword search via SQLite triggers
`relations`	Named entity relationships (subject → predicate → object)
`embedding_cache`	Reuses embeddings when chunk content hasn't changed

Vector search is implemented in pure Python (NumPy cosine similarity) so it works everywhere without native extensions. The database runs in WAL (Write-Ahead Logging) mode (PRAGMA journal_mode=WAL) for better concurrent read performance.
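
As a rough picture of brute-force cosine search over JSON-serialised vectors, consider the sketch below. It uses only the standard library to stay dependency-free, whereas the project performs this step with NumPy; vector_search and the (chunk_id, embedding_json) row shape are illustrative assumptions.

```python
# Illustrative sketch only: a stdlib stand-in for the NumPy implementation.
import json
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query_vec: list, rows: list, k: int) -> list:
    """Score every (chunk_id, embedding_json) row and keep the k best."""
    scored = [(cid, cosine(query_vec, json.loads(blob))) for cid, blob in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


rows = [("a", "[1, 0]"), ("b", "[0, 1]"), ("c", "[1, 1]")]
print(vector_search([1.0, 0.0], rows, k=2))
```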

6. Hybrid Search (`GroundMemory/core/search.py`)

Nine-step pipeline:

1. Embed the query via the configured provider.
2. Vector search - cosine similarity over all chunk embeddings → top k × candidate_multiplier candidates.
3. Keyword search - FTS5 BM25 → top k × candidate_multiplier candidates.
4. Merge & re-score - score = vector_weight × vec_score + (1 − vector_weight) × bm25_score.
5. Cross-encoder reranking (optional) - if search.rerank_model is set, a cross-encoder rescores the merged candidates for higher precision.
6. Temporal decay - score × exp(−decay_rate × days_old) (disabled by default; applied post-rerank so recency nudges the final order).
7. MMR diversification (optional) - if search.mmr_lambda > 0, Maximal Marginal Relevance greedily selects top_k results that balance relevance against similarity to already-selected results: mmr_score = mmr_lambda × relevance − (1 − mmr_lambda) × max_cosine_sim_to_selected. Set via config only; not exposed as a tool parameter.
8. Graph expansion - extract entity mentions from top results, attach related relation triples as relation_context.
9. Return top k as SearchResult objects.
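
The three scoring formulas in the pipeline (merge, temporal decay, MMR) can be written out directly. The sketch below assumes that vec_score and bm25_score have already been normalised to a comparable range and that a pairwise similarity function is available; the function names are hypothetical and the code is not taken from search.py.

```python
# Illustrative sketch of the scoring formulas; helper names are hypothetical.
import math
from typing import Callable


def hybrid_score(vec_score: float, bm25_score: float, vector_weight: float) -> float:
    """Weighted merge of the vector and keyword scores."""
    return vector_weight * vec_score + (1 - vector_weight) * bm25_score


def decayed(score: float, days_old: float, decay_rate: float) -> float:
    """Exponential temporal decay; decay_rate = 0 leaves the score unchanged."""
    return score * math.exp(-decay_rate * days_old)


def mmr_select(
    relevance: dict,
    similarity: Callable[[str, str], float],
    top_k: int,
    mmr_lambda: float,
) -> list:
    """Greedy Maximal Marginal Relevance over {chunk_id: relevance}."""
    selected = []
    remaining = dict(relevance)
    while remaining and len(selected) < top_k:
        def mmr(cid: str) -> float:
            max_sim = max((similarity(cid, s) for s in selected), default=0.0)
            return mmr_lambda * remaining[cid] - (1 - mmr_lambda) * max_sim
        best = max(remaining, key=mmr)
        selected.append(best)
        del remaining[best]
    return selected


# "a" and "b" are near-duplicates, so MMR prefers the less relevant "c" second.
sims = {frozenset(("a", "b")): 0.95}
pick = mmr_select(
    {"a": 0.9, "b": 0.85, "c": 0.5},
    lambda x, y: sims.get(frozenset((x, y)), 0.1),
    top_k=2,
    mmr_lambda=0.5,
)
print(pick)
```

With mmr_lambda = 0.5, the second pick scores b at 0.425 − 0.475 = −0.05 and c at 0.25 − 0.05 = 0.2, so c is chosen despite its lower relevance.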

7. Relation Graph (`GroundMemory/core/relations.py`)

Single consolidated module for all relation logic. Stores typed entity triples (subject → predicate → object) in two places simultaneously:

SQLite relations table - fast structured lookup used by graph expansion during search.
RELATIONS.md - human-readable, editable mirror, injected at bootstrap.

Source-of-truth model: RELATIONS.md is the authoritative record. Any change to the file is automatically reconciled back into SQLite.

Public API:

Symbol	Description
`add_relation(...)`	Write a relation to both SQLite and RELATIONS.md with semantic dedup
`get_relations(...)`	Read relations from SQLite
`parse_relations_from_text(text)`	Parse valid relation lines from a raw string; used by the DELETE mode of `memory_write` to identify rows to remove without a temp-file round-trip
`parse_relations_from_file(path)`	Parse valid lines from RELATIONS.md into `{subject, predicate, object, note}` dicts (delegates to `parse_relations_from_text`)
`sync_relations_from_file(path, index)`	Upsert/delete SQLite rows to match RELATIONS.md exactly; called by `sync_file` / `sync_workspace` and via `sync_after_edit` after every in-place edit
`validate_relations_replacement(text)`	Validate that every non-blank, non-comment line in a replacement string matches the RELATIONS.md format; returns `(all_valid, valid_lines, invalid_lines)`
`format_relations_for_context`	Format the relation graph as a Markdown block for bootstrap injection
`RELATION_LINE_RE`	Compiled regex for a valid RELATIONS.md line
`RELATIONS_FORMAT_REMINDER`	Human-readable format reminder string included in validation error responses

Format for each line in RELATIONS.md:

- [Subject] --predicate--> [Object] (YYYY-MM-DD) - "optional note"

Semantic deduplication: before inserting, the new triple is embedded and compared (cosine similarity) against all existing triples. If similarity ≥ dedup_threshold (default 0.92) the write is skipped and the existing triple is returned.
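
To make the line format concrete, here is a sketch of a parser and formatter for it. The regular expression is an approximation written for this example; it is not the project's RELATION_LINE_RE, and the function names are hypothetical.

```python
# Illustrative sketch only: LINE_RE approximates the documented line format.
import re
from typing import Optional

LINE_RE = re.compile(
    r'^- \[(?P<subject>[^\]]+)\] --(?P<predicate>\S+?)--> \[(?P<object>[^\]]+)\]'
    r' \((?P<date>\d{4}-\d{2}-\d{2})\)(?: - "(?P<note>.*)")?$'
)


def parse_relation_line(line: str) -> Optional[dict]:
    """Return {subject, predicate, object, note} or None for a non-matching line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "subject": fields["subject"],
        "predicate": fields["predicate"],
        "object": fields["object"],
        "note": fields["note"] or "",
    }


def format_relation_line(subject: str, predicate: str, obj: str, date: str, note: str = "") -> str:
    """Render a triple in the RELATIONS.md line format."""
    line = f"- [{subject}] --{predicate}--> [{obj}] ({date})"
    return f'{line} - "{note}"' if note else line


line = format_relation_line("Alice", "works_at", "Acme Corp", "2025-01-15", "joined in January")
print(line)
print(parse_relation_line(line))
```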

8. Sync (`GroundMemory/core/sync.py`)

Keeps the SQLite index consistent with the Markdown files using SHA-256 content hashing (not timestamps). sync_workspace walks all files and re-indexes changed ones. sync_file force-syncs a single file - called immediately after every memory_write so new content is searchable within the same session.
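
Content-hash change detection can be sketched as follows. The known mapping stands in for the hashes stored in the files table; file_digest and changed_files are hypothetical names, and restricting the walk to *.md files is an assumption.

```python
# Illustrative sketch only; not the logic in sync.py.
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(workspace: Path, known: dict) -> list:
    """Return Markdown files whose current hash differs from the recorded one."""
    changed = []
    for path in sorted(workspace.rglob("*.md")):
        rel = path.relative_to(workspace).as_posix()
        if known.get(rel) != file_digest(path):
            changed.append(path)  # new file, or content differs
    return changed
```

Touching a file without changing its bytes leaves the hash the same, so it is not re-indexed; that is the practical difference from a timestamp-based check.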

9. Bootstrap Injector (`GroundMemory/bootstrap/injector.py`)

Assembles a system-prompt block from workspace files, respecting per-file and total character budgets (max_chars_per_file, max_total_chars). Truncated files get a visible [TRUNCATED - use memory_get to read the rest] marker. Injects (in order): long-term memory, user profile, agent instructions, relation graph, and daily logs. The number of daily log files injected is controlled by daily_log_days (default: 1 = today only; set to 2 for today + yesterday).

When compaction_token_threshold > 0, the injector counts the tokens in the assembled prompt (using either len // 4 or tiktoken) and, if the count is above the threshold, appends a compaction notice instructing the agent to compact the configured tiers one by one using memory_compact.
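
The budget logic can be pictured with the sketch below. The marker text is quoted from the description above; everything else (the function name, the section headers, and the choice to stop once the total budget is spent) is an assumption made for illustration.

```python
# Illustrative sketch only: assemble is a hypothetical helper.
TRUNCATION_MARKER = "[TRUNCATED - use memory_get to read the rest]"


def assemble(sections: list, max_chars_per_file: int, max_total_chars: int) -> str:
    """Join (title, text) sections in order while enforcing both character budgets."""
    parts = []
    used = 0
    for title, text in sections:
        budget = min(max_chars_per_file, max_total_chars - used)
        if budget <= 0:
            break  # total budget exhausted: later files are skipped
        body = text[:budget]
        used += len(body)
        if len(text) > budget:
            body += "\n" + TRUNCATION_MARKER
        parts.append(f"## {title}\n{body}")
    return "\n\n".join(parts)


sections = [("MEMORY.md", "a" * 10), ("USER.md", "b" * 10)]
print(assemble(sections, max_chars_per_file=6, max_total_chars=8))
```

In this run MEMORY.md is cut to 6 characters by the per-file limit, and USER.md is cut to the 2 characters that remain in the total budget.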

9a. Token Counter (`GroundMemory/bootstrap/token_counter.py`)

Counts tokens in the bootstrap string. Two methods: "approx" (len(text) // 4, no extra deps, default) and "tiktoken" (accurate BPE count via the tiktoken library, which is included as a core dependency). Falls back to "approx" if tiktoken is not importable.
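
A sketch of the two counting methods and the fallback is shown below. The function names are hypothetical, and cl100k_base is an assumed encoding; this document does not say which encoding the project uses.

```python
# Illustrative sketch only; not the code in token_counter.py.
def count_tokens(text: str, method: str = "approx") -> int:
    """Count tokens with tiktoken when requested and importable, else len // 4."""
    if method == "tiktoken":
        try:
            import tiktoken
        except ImportError:
            pass  # fall through to the approximation
        else:
            encoding = tiktoken.get_encoding("cl100k_base")  # assumed encoding
            return len(encoding.encode(text))
    return len(text) // 4


def needs_compaction(text: str, threshold: int, method: str = "approx") -> bool:
    """A threshold of 0 disables the check entirely."""
    return threshold > 0 and count_tokens(text, method) > threshold


print(count_tokens("a" * 40))           # approx: 40 characters // 4
print(needs_compaction("a" * 40, 5))
```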

9b. Workspace Backup (`GroundMemory/core/backup.py`)

Creates timestamped zip archives of the workspace before compaction. Archives are stored in <workspace>/backups/YYYY-MM-DD_HHmmss.zip and contain all Markdown files, the daily/ directory, and .index/memory.db. Provides create_backup, list_backups, parse_spec (resolves user-supplied restore specs), and restore_backup.
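
A backup of this shape can be produced with the standard zipfile module, as in the sketch below. It is not the project's create_backup; sketch_backup is a hypothetical name, and excluding the backups/ directory from the archive is an assumption.

```python
# Illustrative sketch only: sketch_backup is a hypothetical helper.
import zipfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def sketch_backup(workspace: Path, now: Optional[datetime] = None) -> Path:
    """Zip the workspace into backups/YYYY-MM-DD_HHmmss.zip and return the path."""
    now = now or datetime.now()
    backups = workspace / "backups"
    backups.mkdir(exist_ok=True)
    archive = backups / f"{now:%Y-%m-%d_%H%M%S}.zip"
    # Collect files first; earlier archives under backups/ are skipped.
    files = [
        path for path in sorted(workspace.rglob("*"))
        if path.is_file() and backups not in path.parents
    ]
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            zf.write(path, path.relative_to(workspace).as_posix())
    return archive
```

Path.rglob("*") also visits dot-directories, so .index/memory.db ends up in the archive alongside the Markdown files.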

10. Tools (`GroundMemory/tools/`)

Four core tools + two optional (config-gated) tools exposed to the LLM via function calling:

Tool	File	Notes
`memory_bootstrap`	-	Assembles and returns the full workspace context as a Markdown string.
`memory_read`	-	Unified read: SEARCH mode (hybrid vector+BM25) or GET mode (file/line-range).
`memory_write`	various	Unified write: APPEND, REPLACE_TEXT, REPLACE_LINES, DELETE - dispatched by parameter combination. Hard-delete (physical line removal); rejected on `MEMORY.md`/`daily/*.md` for edit modes.
`memory_relate`	`RELATIONS.md` + SQLite	Semantic dedup before insert. `supersedes=True` deletes prior `(subject, predicate)` triples.
`memory_list` (optional)	-	Lists all workspace files with sizes and line counts. Gated by `mcp.expose_memory_list`.
`memory_tool` (optional)	-	Single dispatcher - routes `action` + `args` to the appropriate underlying tool. Gated by `mcp.dispatcher_mode`.

Tool registry (GroundMemory/tools/__init__.py):

build_tool_registry(config) returns (all_tools, tool_runners, tool_schemas) based on config flags. dispatcher_mode=True replaces all four core tools with the single memory_tool dispatcher. expose_memory_list=True adds memory_list to the core set.

Shared utilities (GroundMemory/tools/base.py):

Symbol	Description
`ok(data)` / `err(msg)`	Wrap a successful or error tool result
`is_immutable(file)`	Return `True` for `MEMORY.md` and `daily/*.md`
`sync_after_edit(session, resolved, is_relations, base_payload)`	Re-index a file after an in-place edit and return `ok(payload)`. Calls `sync_file` (and, when `is_relations=True`, `sync_relations_from_file`) non-fatally - sync failures add a `warning` key rather than raising.

11. LLM Adapters (`GroundMemory/adapters/`)

Thin schema-conversion + agentic-loop helpers:

adapters/openai.py - converts schemas to OpenAI function-calling format; handle_tool_calls dispatches tool calls and appends results to the message list; run_agent_loop iterates until the model stops calling tools.
adapters/anthropic.py - same for Anthropic's tool_use / tool_result block format.

12. Session (`GroundMemory/session.py`)

MemorySession is the composition root that holds references to Workspace, MemoryIndex, and EmbeddingProvider. It exposes execute_tool, bootstrap, and sync as the primary API surface.

12 (note). Session vs Workspace - not two different things

MemorySession is the workspace session - there is no meaningful distinction between the two concepts in GroundMemory. Workspace is the low-level filesystem handle; MemorySession is the high-level runtime object that wraps it together with the index and embedding provider. When you call MemorySession.create("my-project"), it resolves the path as ~/.groundmemory/my-project - a single directory, not a nested one.

Data Flow

User message
     │
     ▼
MemorySession.bootstrap()
     │  Reads MEMORY.md, USER.md, AGENTS.md, RELATIONS.md, daily logs
     │  → assembled into system prompt block
     ▼
LLM receives system prompt + tool schemas + user message
     │
     │  Model may call memory tools:
     │
     ├─► memory_read(query, ...)          ← SEARCH mode
     │       └─► provider.embed(query)                  → query vector
     │       └─► index.vector_search                    → cosine top-k
     │       └─► index.keyword_search                   → FTS5 BM25 top-k
     │       └─► merge + decay + graph expansion        → ranked SearchResult list
     │
     ├─► memory_read(file, ...)           ← GET mode
     │       └─► storage.read_file                      → raw Markdown slice (1-indexed)
     │
     ├─► memory_write(file, content)      ← APPEND mode
     │       └─► storage.write_long_term / write_daily  → appends to Markdown
     │       └─► sync.sync_file                         → chunk → embed → upsert SQLite
     │
     ├─► memory_write(file, search, content)   ← REPLACE_TEXT mode
     │       └─► is_immutable(file) check               → reject if MEMORY.md or daily/*.md
     │       └─► storage.replace_text                   → first-match replacement in Markdown
     │       └─► sync.sync_file                         → re-index
     │
     ├─► memory_write(file, start, end, content)  ← REPLACE_LINES mode
     │       └─► is_immutable(file) check               → reject if MEMORY.md or daily/*.md
     │       └─► storage.replace_lines                  → line-range replacement in Markdown
     │       └─► sync.sync_file                         → re-index
     │
     ├─► memory_write(file, start, end, content="")  ← DELETE mode
     │       └─► is_immutable(file) check               → reject if MEMORY.md or daily/*.md
     │       └─► storage.hard_delete_lines              → physically removes lines (no tombstone)
     │       └─► sync.sync_file                         → re-index
     │
     └─► memory_relate(subject, predicate, object)
             └─► relations._find_semantic_duplicate      → cosine dedup check
             └─► index.insert_relation                  → SQLite relations table
             └─► storage._atomic_write                  → append to RELATIONS.md
     │
     ▼
Agent response returned to user
     │
     ▼
Next session: session.bootstrap() reloads persisted facts
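
The four memory_write branches in the diagram differ only in which arguments are present. The sketch below shows one way that dispatch could look; classify_write is a hypothetical name, and the body of is_immutable is an assumption inferred from the "reject if MEMORY.md or daily/*.md" notes, not the project's actual implementation.

```python
from fnmatch import fnmatch
from typing import Optional


def is_immutable(file: str) -> bool:
    """Assumed rule: MEMORY.md and daily logs accept appends only."""
    return file == "MEMORY.md" or fnmatch(file, "daily/*.md")


def classify_write(
    file: str,
    content: str,
    search: Optional[str] = None,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> str:
    """Map a memory_write argument combination to a mode name from the diagram."""
    if search is not None:
        mode = "REPLACE_TEXT"
    elif start is not None and end is not None:
        # An empty replacement over a line range is treated as a delete.
        mode = "DELETE" if content == "" else "REPLACE_LINES"
    else:
        mode = "APPEND"
    # Every mode except APPEND edits existing text, so it is checked first.
    if mode != "APPEND" and is_immutable(file):
        raise PermissionError(f"{file} is immutable; only appends are allowed")
    return mode


modes = [
    classify_write("USER.md", "likes tea"),
    classify_write("USER.md", "likes coffee", search="likes tea"),
    classify_write("USER.md", "new line", start=2, end=3),
    classify_write("USER.md", "", start=2, end=3),
]
```

In this sketch an append to MEMORY.md still succeeds, while a replace or delete on it raises PermissionError.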

Tech Stack

Component	Technology
Language	Python 3.10+
Configuration	Pydantic Settings + PyYAML (YAML file + env vars)
Database	SQLite via `sqlite3` stdlib, WAL mode (`PRAGMA journal_mode=WAL`)
Full-text search	SQLite FTS5 (BM25) with auto-sync triggers
Vector search	NumPy cosine similarity (computed in-process; no SQLite vector extension required)
Embeddings - local	`sentence-transformers` (optional extra; not installed by default)
Embeddings - remote	Any OpenAI-compatible HTTP endpoint via `httpx` (core dependency)
HTTP client	`httpx`
Packaging	`hatchling` build backend (`pyproject.toml`), installable via `uv` or `pip`
Tests	`pytest`
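
As a small illustration of the Database row, the snippet below opens a SQLite file with the stdlib sqlite3 module and switches it to WAL mode. The function name and the temporary file location are assumptions made for the example; GroundMemory keeps its own index at .index/memory.db inside the workspace.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal_database(db_path: Path) -> tuple[sqlite3.Connection, str]:
    """Open (or create) a database and request write-ahead logging."""
    conn = sqlite3.connect(db_path)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


with tempfile.TemporaryDirectory() as tmp:
    conn, journal_mode = open_wal_database(Path(tmp) / "memory.db")
    conn.close()
```

WAL mode is stored in the database file, so later connections to the same file stay in it without repeating the pragma.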

Configuration

Minimum Config

No configuration file is required. With no config, GroundMemory uses BM25-only search backed by SQLite - no API key, no GPU, no extra packages.

Finding the example config files

Both example files are bundled with the package under groundmemory/config/ in the repository:

groundmemory/config/groundmemory.yaml.example - full YAML reference with every option documented
groundmemory/config/.env.example - all environment variables with descriptions and defaults

For pip installs, groundmemory-mcp automatically copies groundmemory.yaml.example into ~/.groundmemory/ on first run. For Docker installs, copy the .env.example manually as shown in the Docker quick-start above.

groundmemory.yaml Reference

Config file search order (first match wins):

Location	Resolved path	Use case
`$GROUNDMEMORY_ROOT_DIR/groundmemory.yaml`	`~/.groundmemory/groundmemory.yaml` (pip) or `/data/groundmemory.yaml` → `./data/groundmemory.yaml` on host (Docker)	Global user config - recommended for pip installs and Docker
`./groundmemory.yaml`	cwd at process start	Per-project override in dev mode

The same search order applies to .env files ($GROUNDMEMORY_ROOT_DIR/.env then ./.env).

Settings in these files are overridden by environment variables, which in turn are overridden by constructor kwargs.
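
The "first match wins" search order can be pictured as a loop over candidate directories. This is a sketch under assumptions: find_config_file is a hypothetical helper, and the temporary directories stand in for $GROUNDMEMORY_ROOT_DIR and the current working directory.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_config_file(
    root_dir: Path, cwd: Path, filename: str = "groundmemory.yaml"
) -> Optional[Path]:
    """Return the first existing candidate, checking root_dir before cwd."""
    for directory in (root_dir, cwd):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root, project = Path(tmp) / "root", Path(tmp) / "project"
    root.mkdir()
    project.mkdir()
    nothing_found = find_config_file(root, project)
    # Only the per-project file exists: it is picked up.
    (project / "groundmemory.yaml").write_text("workspace: dev\n")
    only_project = find_config_file(root, project)
    # Both exist: the root-dir file is listed first, so it wins.
    (root / "groundmemory.yaml").write_text("workspace: default\n")
    both_present = find_config_file(root, project)
    found_names = (only_project.parent.name, both_present.parent.name)
```

The same loop would work for .env by passing filename=".env".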

# ---------------------------------------------------------------------------
# General
# ---------------------------------------------------------------------------

# Root directory for all workspaces (default: ~/.groundmemory)
# root_dir: ~/.groundmemory

# Default workspace name
# workspace: default

# ---------------------------------------------------------------------------
# Embedding provider
# ---------------------------------------------------------------------------
embedding:
  # provider options:
  #   "none"   - BM25 keyword search only (no vector search, no GPU needed) [default]
  #   "openai" - OpenAI-compatible HTTP API (no extra install required)
  #   "local"  - sentence-transformers (requires: pip install groundmemory[local])
  provider: none

  # --- sentence-transformers (provider: local) ---
  # Requires: pip install groundmemory[local]  (not installed by default)
  # Any model from https://www.sbert.net/docs/pretrained_models.html
  # local_model: all-MiniLM-L6-v2      # fast, 384-dim, good quality
  # local_model: all-mpnet-base-v2     # slower, 768-dim, higher quality

  # --- OpenAI-compatible API (provider: openai) ---
  # Supports: OpenAI, Ollama, LM Studio, vLLM, Mistral, Together, etc.
  #
  # Real OpenAI (leave base_url blank):
  # base_url: ~
  # api_key: sk-...
  # model: text-embedding-3-small
  #
  # Ollama local server:
  # base_url: http://localhost:11434/v1
  # api_key: ollama          # required by the OpenAI client but ignored by Ollama
  # model: nomic-embed-text  # pull with: ollama pull nomic-embed-text
  #
  # LM Studio:
  # base_url: http://localhost:1234/v1
  # api_key: lm-studio
  # model: nomic-ai/nomic-embed-text-v1.5-GGUF
  #
  # OpenRouter:
  # base_url: https://openrouter.ai/api/v1
  # api_key: sk-or-...
  # model: openai/text-embedding-3-small

  # Number of texts sent per embedding API call
  # batch_size: 64

# ---------------------------------------------------------------------------
# Hybrid search
# ---------------------------------------------------------------------------
search:
  # Number of results returned per search query (memory_read in search mode)
  top_k: 6

  # Oversampling factor: top_k * candidate_multiplier candidates fetched per
  # path (vector + keyword) before merging and re-ranking
  candidate_multiplier: 4

  # Weight for vector similarity score (0.0-1.0) used in Reciprocal Rank Fusion.
  # keyword_weight = 1.0 - vector_weight.
  # Set to 0.0 for pure BM25 (useful when provider: none).
  vector_weight: 0.7

  # Reciprocal Rank Fusion k constant.
  # Controls how much rank differences influence the final score.
  # Higher values flatten the curve (less penalisation for lower ranks).
  # Standard value is 60; lower values (e.g. 10-20) amplify rank differences.
  rrf_k: 60

  # Temporal decay: score *= exp(-decay_rate * days_old)
  # 0.0 disables decay; 0.01 halves relevance after ~70 days
  temporal_decay_rate: 0.0

  # MMR (Maximal Marginal Relevance) diversity
  # 0.0 = disabled (pure relevance), 1.0 = maximum diversity
  mmr_lambda: 0.0

# ---------------------------------------------------------------------------
# Text chunking
# ---------------------------------------------------------------------------
chunking:
  # Target chunk size in approximate tokens (1 token ≈ 4 chars)
  tokens: 400

  # Overlap between consecutive chunks in approximate tokens
  overlap: 80

# ---------------------------------------------------------------------------
# Bootstrap - system-prompt injection at session start
# ---------------------------------------------------------------------------
bootstrap:
  # Maximum characters per file before a truncation warning is appended
  max_chars_per_file: 10000

  # Maximum total characters injected across all files
  max_total_chars: 50000

  # Which memory files to inject into the system prompt
  inject_long_term_memory: true  # MEMORY.md
  inject_user_profile: true      # USER.md
  inject_agents: true            # AGENTS.md
  inject_daily_logs: true        # daily/YYYY-MM-DD.md
  inject_relations: true         # RELATIONS.md

  # Number of daily log files to inject, counting back from today.
  # 1 = today only (default), 2 = today + yesterday, 0 = none.
  daily_log_days: 1

  # Re-index all workspace files at the start of every session before injecting context.
  #
  # Purpose: if you edit memory files outside the agent (e.g. in a text editor
  # or via git), the SQLite/vector index may be out of date. Enabling this
  # ensures the index is always consistent with the files on disk.
  #
  # Leave disabled (false) in normal usage - the agent keeps the index in sync
  # automatically after every memory_write (including deletes) and memory_relate call.
  sync_memory_on_bootstrap: false

  # --- Memory compaction ---
  #
  # Threshold is measured against compactable tiers only (MEMORY.md, USER.md,
  # AGENTS.md). RELATIONS.md and daily logs are excluded from the count.
  # Set to 0 (default) to disable compaction.
  #
  # compaction_token_threshold: 6000
  # compaction_token_counter: approx   # "approx" or "tiktoken" (pip install groundmemory[local])
  # compaction_tiers:
  #   - MEMORY.md
  #   # - USER.md
  #   # - AGENTS.md

# ---------------------------------------------------------------------------
# MCP server (groundmemory-mcp command)
# ---------------------------------------------------------------------------
# mcp:
  # Host address the MCP server binds to.
  # Default "127.0.0.1" allows connections from this machine only.
  # Set to "0.0.0.0" to accept connections from other machines (see below).
  # host: 127.0.0.1

  # TCP port the MCP server listens on.
  # port: 4242

  # --- Network access (disabled by default) ---
  #
  # By default, the server is local-only. To allow access from another machine:
  #
  # 1. Set host to "0.0.0.0" (binds to all interfaces).
  # 2. Add the Host header value your client sends to allowed_hosts.
  #    This is always your machine's IP:port as the client sees it.
  #    "localhost" and "127.0.0.1" are always allowed and do not need to be listed.
  #
  # Example - LAN access (single address):
  #   host: "0.0.0.0"
  #   allowed_hosts: "192.168.1.50:4242"
  #
  # Example - LAN access (multiple addresses, comma-separated):
  #   host: "0.0.0.0"
  #   allowed_hosts: "192.168.1.50:4242,192.168.1.51:4242"
  #
  # Note: allowed_hosts requires exact Host header values - wildcards and
  # CIDR ranges are not supported. Separate multiple values with commas.
  #
  # allowed_hosts: ""
  #
  # --- Reverse proxy / forwarded headers ---
  #
  # forwarded_allow_ips controls which upstream IPs uvicorn trusts to set
  # proxy headers such as X-Forwarded-For and X-Forwarded-Proto.
  #
  # You do NOT need to set this for plain LAN access (host: 0.0.0.0 +
  # allowed_hosts). Only set it when a reverse proxy (nginx, Caddy, Traefik)
  # sits in front of GroundMemory and forwards requests. Set it to the proxy's
  # internal IP so uvicorn trusts the headers that proxy sends.
  #
  # forwarded_allow_ips: "127.0.0.1"
  #
  # --- Public internet ---
  #
  # GroundMemory's only built-in authentication is the optional static bearer
  # token (api_key, below). Do not expose it directly to the public internet.
  # Place it behind a reverse proxy (nginx, Caddy, Traefik) that handles TLS
  # and authentication, then set host to "127.0.0.1" and add the public
  # hostname to allowed_hosts:
  #
  #   host: "127.0.0.1"
  #   allowed_hosts: "yourdomain.com"
  #   forwarded_allow_ips: "127.0.0.1"
  #
  # --- Authentication ---
  #
  # Static bearer token required on every request. When unset (default), no
  # authentication is enforced. Set this when exposing the server beyond localhost.
  #
  # Clients must send: Authorization: Bearer <your-token>
  #
  # Generate with `openssl rand -base64 32`
  # api_key: ""

Environment Variables

All settings are available as environment variables using the GROUNDMEMORY_ prefix. Nested keys use double-underscore (__) as a separator. Environment variables take priority over groundmemory.yaml.
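
To make the naming rule concrete, the sketch below maps prefixed variable names onto nested keys. Pydantic Settings offers this behaviour through its env_prefix and env_nested_delimiter options; env_to_nested is a hypothetical stand-in that imitates only the key mapping and leaves every value as a string.

```python
def env_to_nested(
    environ: dict[str, str],
    prefix: str = "GROUNDMEMORY_",
    delimiter: str = "__",
) -> dict:
    """Turn PREFIX_SECTION__KEY=value pairs into {"section": {"key": value}}."""
    settings: dict = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue  # unrelated variables are ignored
        parts = key[len(prefix):].lower().split(delimiter)
        node = settings
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return settings


example = env_to_nested(
    {
        "GROUNDMEMORY_EMBEDDING__PROVIDER": "openai",
        "GROUNDMEMORY_SEARCH__TOP_K": "10",
        "GROUNDMEMORY_WORKSPACE": "my-project",
        "PATH": "/usr/bin",
    }
)
```

A single underscore, as in TOP_K, stays inside the key name; only the double underscore starts a new nesting level.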

Embedding

Variable	Description	Default
`GROUNDMEMORY_EMBEDDING__PROVIDER`	`none` / `local` / `openai`	`none`
`GROUNDMEMORY_EMBEDDING__BASE_URL`	OpenAI-compatible endpoint URL	-
`GROUNDMEMORY_EMBEDDING__API_KEY`	API key for the endpoint	-
`GROUNDMEMORY_EMBEDDING__MODEL`	Embedding model name (provider: openai)	`text-embedding-3-small`
`GROUNDMEMORY_EMBEDDING__LOCAL_MODEL`	sentence-transformers model name	`all-MiniLM-L6-v2`
`GROUNDMEMORY_EMBEDDING__BATCH_SIZE`	Texts per embedding API call	`64`

Search

Variable	Description	Default
`GROUNDMEMORY_SEARCH__TOP_K`	Results returned per query	`6`
`GROUNDMEMORY_SEARCH__CANDIDATE_MULTIPLIER`	Oversampling factor per path	`4`
`GROUNDMEMORY_SEARCH__VECTOR_WEIGHT`	Vector list weight in RRF (0.0–1.0); keyword weight = 1 - vector_weight	`0.7`
`GROUNDMEMORY_SEARCH__RRF_K`	Reciprocal Rank Fusion k constant. Higher values flatten rank differences. Standard default is 60.	`60`
`GROUNDMEMORY_SEARCH__TEMPORAL_DECAY_RATE`	Score decay per day of age	`0.0`
`GROUNDMEMORY_SEARCH__MMR_LAMBDA`	MMR diversity (0 = disabled)	`0.0`
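
The interaction of VECTOR_WEIGHT, RRF_K, and TEMPORAL_DECAY_RATE is easier to see in code. The following is a minimal sketch of weighted Reciprocal Rank Fusion followed by exponential decay. The function names are hypothetical, and the use of 1-based ranks is an assumption; the decay formula is the one quoted in the YAML reference.

```python
import math


def weighted_rrf(
    vector_ranked: list[str],
    keyword_ranked: list[str],
    vector_weight: float = 0.7,
    rrf_k: int = 60,
) -> dict[str, float]:
    """Fuse two ranked id lists: each list adds weight / (rrf_k + rank)."""
    keyword_weight = 1.0 - vector_weight
    scores: dict[str, float] = {}
    for weight, ranked in ((vector_weight, vector_ranked), (keyword_weight, keyword_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rrf_k + rank)
    return scores


def apply_decay(score: float, days_old: float, decay_rate: float = 0.0) -> float:
    """score *= exp(-decay_rate * days_old); a rate of 0.0 leaves it unchanged."""
    return score * math.exp(-decay_rate * days_old)


scores = weighted_rrf(["a", "b", "c"], ["b", "d"])
ranking = sorted(scores, key=scores.get, reverse=True)
half_life_check = apply_decay(1.0, days_old=70, decay_rate=0.01)
```

Here "b" ranks first because it appears in both lists, and a decay rate of 0.01 brings a score of 1.0 down to roughly 0.5 after 70 days.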

Chunking

Variable	Description	Default
`GROUNDMEMORY_CHUNKING__TOKENS`	Target chunk size in tokens	`400`
`GROUNDMEMORY_CHUNKING__OVERLAP`	Overlap between chunks in tokens	`80`
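
The two chunking settings can be read as a sliding window over the text. The sketch below is a simplification under stated assumptions: chunk_text is a hypothetical function, it uses the documented approximation of 4 characters per token, and it cuts at fixed character offsets, whereas a real chunker may prefer line or heading boundaries.

```python
def chunk_text(
    text: str, tokens: int = 400, overlap: int = 80, chars_per_token: int = 4
) -> list[str]:
    """Split text into windows of ~tokens that share ~overlap tokens with the next."""
    if not 0 <= overlap < tokens:
        raise ValueError("overlap must be non-negative and smaller than tokens")
    size = tokens * chars_per_token
    step = (tokens - overlap) * chars_per_token
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(4000))
chunks = chunk_text(sample)
shared = 80 * 4  # overlap expressed in characters
```

With the defaults, each window is 1600 characters and advances by 1280, so consecutive chunks share 320 characters.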

Relations

Variable	Description	Default
`GROUNDMEMORY_RELATIONS__DEDUP_THRESHOLD`	Cosine similarity threshold for dedup	`0.92`
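
A threshold-based duplicate check of this kind might look like the sketch below. It is written with the stdlib only to stay self-contained (the project itself lists NumPy for cosine similarity), is_semantic_duplicate is a hypothetical name, and treating a similarity equal to the threshold as a duplicate is an assumption.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_semantic_duplicate(
    candidate: Sequence[float],
    existing: Sequence[Sequence[float]],
    threshold: float = 0.92,
) -> bool:
    """True if any stored embedding is at least `threshold` similar to the candidate."""
    return any(cosine_similarity(candidate, vec) >= threshold for vec in existing)


near_copy = is_semantic_duplicate([1.0, 0.0], [[0.99, 0.05]])
unrelated = is_semantic_duplicate([1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]])
```

Raising the threshold toward 1.0 makes the check stricter, so only near-identical relations are rejected.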

Bootstrap

Variable	Description	Default
`GROUNDMEMORY_BOOTSTRAP__MAX_CHARS_PER_FILE`	Max chars per injected file before truncation	`10000`
`GROUNDMEMORY_BOOTSTRAP__MAX_TOTAL_CHARS`	Max total chars across all injected files	`50000`
`GROUNDMEMORY_BOOTSTRAP__INJECT_LONG_TERM_MEMORY`	Inject MEMORY.md	`true`
`GROUNDMEMORY_BOOTSTRAP__INJECT_USER_PROFILE`	Inject USER.md	`true`
`GROUNDMEMORY_BOOTSTRAP__INJECT_AGENTS`	Inject AGENTS.md	`true`
`GROUNDMEMORY_BOOTSTRAP__INJECT_DAILY_LOGS`	Enable/disable daily log injection entirely	`true`
`GROUNDMEMORY_BOOTSTRAP__INJECT_RELATIONS`	Inject RELATIONS.md	`true`
`GROUNDMEMORY_BOOTSTRAP__DAILY_LOG_DAYS`	Number of daily log files to inject counting back from today. `1` = today only, `2` = today + yesterday, `0` = none.	`1`
`GROUNDMEMORY_BOOTSTRAP__SYNC_MEMORY_ON_BOOTSTRAP`	Re-index all workspace files at session start. Enable when you edit memory files outside the agent so the index stays consistent with disk.	`false`
`GROUNDMEMORY_BOOTSTRAP__COMPACTION_TOKEN_THRESHOLD`	Token count of compactable tiers (MEMORY.md, USER.md, AGENTS.md) above which a backup is taken and a compaction notice is injected. RELATIONS.md and daily logs are excluded from this count. `0` = disabled.	`0`
`GROUNDMEMORY_BOOTSTRAP__COMPACTION_TOKEN_COUNTER`	Token counting method: `"approx"` (`len // 4`) or `"tiktoken"` (accurate BPE, requires `pip install groundmemory[local]`).	`"approx"`
`GROUNDMEMORY_BOOTSTRAP__COMPACTION_TIERS`	Memory files the agent is allowed to compact. Daily logs are never compacted.	`["MEMORY.md"]`
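
The compaction threshold check can be sketched as follows, using the documented "approx" counter (len // 4) and counting only the compactable files. needs_compaction and the in-memory files dictionary are hypothetical; the real check reads the workspace files from disk.

```python
COMPACTABLE = ("MEMORY.md", "USER.md", "AGENTS.md")


def approx_tokens(text: str) -> int:
    """The documented approximation: one token per four characters."""
    return len(text) // 4


def needs_compaction(files: dict[str, str], threshold: int) -> bool:
    """True when the compactable files together exceed the token threshold."""
    if threshold <= 0:
        return False  # 0 disables compaction
    total = sum(approx_tokens(body) for name, body in files.items() if name in COMPACTABLE)
    return total > threshold


files = {
    "MEMORY.md": "x" * 20_000,      # ~5000 tokens
    "RELATIONS.md": "y" * 100_000,  # excluded from the count
}
below = needs_compaction(files, threshold=6000)
files["USER.md"] = "z" * 8_000      # ~2000 more tokens
above = needs_compaction(files, threshold=6000)
```

The large RELATIONS.md does not affect the result, which matches the rule that relations and daily logs are excluded from the count.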

General

Variable	Description	Default
`GROUNDMEMORY_ROOT_DIR`	Base directory for all workspaces	`~/.groundmemory`
`GROUNDMEMORY_WORKSPACE`	Default workspace name	`default`

MCP Server

Variable	Description	Default
`GROUNDMEMORY_MCP__HOST`	Host address the server binds to. Set to `0.0.0.0` for LAN access (see Network Access).	`127.0.0.1`
`GROUNDMEMORY_MCP__PORT`	TCP port the server listens on	`4242`
`GROUNDMEMORY_MCP__ALLOWED_HOSTS`	Comma-separated list of `Host:` header values to allow (DNS-rebinding protection). `localhost` and `127.0.0.1` are always allowed. Required when `HOST=0.0.0.0`.	`""`
`GROUNDMEMORY_MCP__FORWARDED_ALLOW_IPS`	IPs uvicorn trusts to pass `X-Forwarded-For` headers. Not needed for plain LAN access - only set when a reverse proxy sits in front of GroundMemory.	`127.0.0.1`
`GROUNDMEMORY_MCP__API_KEY`	Static bearer token required on every request. When unset (default), no authentication is enforced. Set when exposing the server beyond localhost. Clients must send `Authorization: Bearer <token>`.	(unset)
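
The two access controls in this table, the Host allow-list and the bearer token, can be pictured as a pair of checks applied to each request. This is a sketch under assumptions, not the server's code: both function names are hypothetical, and stripping the port before the localhost check is a guess at how "always allowed" is applied.

```python
import hmac
from typing import Optional

ALWAYS_ALLOWED = {"localhost", "127.0.0.1"}


def is_host_allowed(host_header: str, allowed_hosts: str = "") -> bool:
    """Exact match against the comma-separated list; loopback names always pass."""
    allowed = {h.strip() for h in allowed_hosts.split(",") if h.strip()}
    hostname = host_header.rsplit(":", 1)[0] if ":" in host_header else host_header
    return host_header in allowed or hostname in ALWAYS_ALLOWED


def is_authorized(authorization_header: Optional[str], api_key: str = "") -> bool:
    """With no api_key configured, every request passes; otherwise compare tokens."""
    if not api_key:
        return True
    expected = f"Bearer {api_key}".encode()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest((authorization_header or "").encode(), expected)


lan = "192.168.1.50:4242,192.168.1.51:4242"
checks = (
    is_host_allowed("127.0.0.1:4242"),
    is_host_allowed("192.168.1.50:4242"),
    is_host_allowed("192.168.1.50:4242", lan),
    is_authorized("Bearer s3cret", api_key="s3cret"),
    is_authorized("Bearer wrong", api_key="s3cret"),
)
```

Because matching is exact, a LAN client is rejected until its full host:port value appears in the list.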

Backup and Restore

When compaction is triggered, GroundMemory automatically takes a zip backup of the workspace before the agent can modify any files. Backups are stored in <workspace>/backups/YYYY-MM-DD_HHmmss.zip and contain all Markdown files, the daily/ directory, and .index/memory.db.

Use the groundmemory CLI to manage backups:

# List all backups for the current workspace
groundmemory --list-backups

# Restore the most recent backup
groundmemory --restore -1

# Restore the second-most-recent backup
groundmemory --restore -2

# Restore by exact date (error if multiple backups on that date)
groundmemory --restore 2026-04-08

# Restore by exact timestamp
groundmemory --restore 2026-04-08_165530

The workspace is always resolved from your environment / config (same as groundmemory-mcp). Set GROUNDMEMORY_WORKSPACE or configure workspace in groundmemory.yaml to target a specific workspace before running the restore command.

If a date matches multiple backups, the command prints the list and exits - you can then use the full timestamp to disambiguate. After restoring, restart the MCP server if it is running.
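
The selector rules accepted by --restore can be summarised in a short resolver. The sketch below is illustrative only: resolve_backup is a hypothetical function operating on a list of file names, and it relies on the YYYY-MM-DD_HHmmss naming so that a plain sort is chronological.

```python
def resolve_backup(selector: str, backups: list[str]) -> str:
    """Pick one backup file name from a negative index, a date, or a timestamp."""
    ordered = sorted(backups)  # timestamped names sort oldest to newest
    if selector.startswith("-") and selector[1:].isdigit():
        offset = int(selector)
        if offset == 0 or -offset > len(ordered):
            raise LookupError(f"no backup at position {selector}")
        return ordered[offset]  # -1 is the most recent
    matches = [
        name for name in ordered
        if name == f"{selector}.zip" or name.startswith(f"{selector}_")
    ]
    if not matches:
        raise LookupError(f"no backup matches {selector}")
    if len(matches) > 1:
        raise ValueError(f"{selector} is ambiguous: {', '.join(matches)}")
    return matches[0]


backups = [
    "2026-04-08_165530.zip",
    "2026-04-07_120000.zip",
    "2026-04-08_091500.zip",
]
latest = resolve_backup("-1", backups)
previous = resolve_backup("-2", backups)
by_date = resolve_backup("2026-04-07", backups)
by_timestamp = resolve_backup("2026-04-08_165530", backups)
```

Passing "2026-04-08" with this list raises ValueError, mirroring the case where the CLI prints the candidates and exits.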

Configuration priority (highest wins):

constructor kwargs  >  environment variables  >  $GROUNDMEMORY_ROOT_DIR/.env / ./.env  >  $GROUNDMEMORY_ROOT_DIR/groundmemory.yaml / ./groundmemory.yaml  >  built-in defaults
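
One way to picture this priority chain is a layered lookup in which the first layer that defines a key wins. The sketch uses collections.ChainMap on flat dictionaries. resolve_settings and the sample values are hypothetical, and nested sections such as embedding or search would need a deep merge that is omitted here.

```python
from collections import ChainMap
from typing import Optional


def resolve_settings(
    kwargs: Optional[dict] = None,
    env: Optional[dict] = None,
    dotenv: Optional[dict] = None,
    yaml_file: Optional[dict] = None,
    defaults: Optional[dict] = None,
) -> dict:
    """Merge the layers; earlier arguments override later ones."""
    layers = (kwargs, env, dotenv, yaml_file, defaults)
    return dict(ChainMap(*[layer or {} for layer in layers]))


defaults = {"workspace": "default", "top_k": 6}
yaml_file = {"workspace": "from-yaml", "top_k": 8}
dotenv = {"workspace": "from-dotenv"}
env = {"workspace": "from-env"}

without_kwargs = resolve_settings(env=env, dotenv=dotenv, yaml_file=yaml_file, defaults=defaults)
with_kwargs = resolve_settings(
    kwargs={"workspace": "from-kwargs"},
    env=env, dotenv=dotenv, yaml_file=yaml_file, defaults=defaults,
)
```

In the first call the environment variable supplies workspace while top_k still comes from the YAML layer, since no higher layer sets it.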
