Long-term memory for LLM agents using Sparse Distributed Representations — no vector DB, no GPU, no external server.
Bio-inspired memory system based on Numenta/HTM theory. Encodes text as 4096-bit SDRs and retrieves by Hamming distance over plain SQLite in WAL mode. A built-in salience filter automatically rejects ~85% of noise.
| Feature | sdr-memory | Mem0 | Zep | ChromaDB |
|---|---|---|---|---|
| External server required | No | Yes | Yes | Yes |
| GPU required | No | No | No | Optional |
| Storage backend | SQLite | Various | Postgres | DuckDB |
| Memory approach | SDR (bio-inspired) | Embeddings | Knowledge Graph | Embeddings |
| Built-in salience filter | Yes | No | No | No |
| `pip install` and go | Yes | Yes | No | Yes |
| Retrieval (top-1 accuracy) | 92% | — | — | 81%* |
| Search latency (1,600 memories) | 2.4ms | — | — | 0.8ms* |
| Memory footprint per entry | 512 bytes | ~3KB+ | ~5KB+ | ~1.5KB+ |
*Measured against sentence-transformers/all-MiniLM-L6-v2 with 32-dim dense embeddings on the same dataset. See benchmarks/. Mem0 and Zep benchmarks pending — contributions welcome.
Trade-off: SDR retrieval is pure brute-force over bit arrays — fast up to ~100K memories, but doesn't scale to millions like HNSW/FAISS. For LLM agent memory (typically 1K-50K entries), this is more than enough.
```shell
pip install sdr-memory
```

```python
from sdr_memory import SDRMemory

store = SDRMemory("my_memory.db")

store.store("Database connection pool exhausted after traffic spike")
store.store("Payment gateway returns 503 during maintenance windows")
store.store("DNS timeout was the root cause of the outage")

results = store.query("database timeout", limit=3)
for r in results:
    print(f"[{r['score']:.3f}] {r['text']}")
# [0.987] DNS timeout was the root cause of the outage
# [0.984] Database connection pool exhausted after traffic spike
# [0.982] Payment gateway returns 503 during maintenance windows
```

That's it. No server, no config, no model download.
Run as a persistent daemon for inter-process communication:

```shell
# Start the daemon
sdr-memory serve

# Query from another process
sdr-memory call --json '{"action":"store","text":"nginx OOM killed at 3am","metadata":{"severity":"critical"}}'
sdr-memory call --json '{"action":"query","text":"out of memory","limit":3}'
sdr-memory call --json '{"action":"stats"}'
```

Or from Python:
```python
from sdr_memory.server import client_call
from pathlib import Path

result = client_call(Path("/tmp/sdr-memory.sock"), {
    "action": "query",
    "text": "memory leak",
    "limit": 5,
})
print(result)
```

```
Input text
        │
        ▼
┌─────────────────┐
│ Salience Filter │──── reject (~85% noise)
│  (regex-based)  │
└────────┬────────┘
         │ accept
         ▼
┌─────────────────┐
│ Trigram Hashing │ "database timeout" → {" d","dat","ata","tab",...}
│                 │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  SDR Encoding   │ Hash each trigram → bit position (mod 4096)
│ 4096-bit vector │ Cap at 80 active bits → sparse binary vector
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   SQLite WAL    │ Pack bits → 512 bytes BLOB → INSERT
│   (storage)     │ Journal mode: WAL, synchronous: NORMAL
└────────┬────────┘
         │
    ┌────┴────┐
    │  Query  │
    └────┬────┘
         │
         ▼
┌──────────────────┐
│ Hamming Distance │ XOR query bits with each stored SDR
│   (retrieval)    │ Score = 1 - (hamming_dist / 4096)
└──────────────────┘
```
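The retrieval step in the diagram can be sketched in a few lines of NumPy: each 4096-bit SDR packs into 512 bytes, so scoring is a vectorized XOR plus popcount over all stored rows. This is an illustrative sketch, not the library's internal code; the function name is made up here.

```python
import numpy as np

SDR_BITS = 4096  # width of each SDR (packs into 512 bytes)

def hamming_scores(query_bits: np.ndarray, stored: np.ndarray) -> np.ndarray:
    """Score every stored SDR against the query in one vectorized pass.

    query_bits: (512,) uint8 array -- one packed 4096-bit SDR
    stored:     (N, 512) uint8 array -- N packed SDRs
    Returns an (N,) float array of scores in [0, 1]; higher = more similar.
    """
    xor = np.bitwise_xor(stored, query_bits)       # differing bits, per byte
    dist = np.unpackbits(xor, axis=1).sum(axis=1)  # popcount per row
    return 1.0 - dist / SDR_BITS                   # score = 1 - dist/4096

# Sanity check: identical SDRs score 1.0, bitwise complements score 0.0
q = np.zeros(512, dtype=np.uint8)
db = np.stack([q, ~q])
print(hamming_scores(q, db))  # → [1. 0.]
```

The whole scan is a single pass over a contiguous array, which is why brute force stays in the low milliseconds at the tens-of-thousands-of-memories scale discussed above.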
Not everything is worth remembering. The salience filter automatically rejects:
- Procedural narration: "I'll read the file now", "Starting search..."
- Path-only strings: "/usr/local/bin/python"
- Short noise: Anything under 8 characters
And accepts:
- High-signal keywords: error, timeout, resolved, fixed, critical, etc.
- Substantive statements: Anything 60+ characters that passes the filters
This is critical for LLM agent memory — without it, you'd store 10x more garbage than useful facts.
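As a rough sketch of how a regex-based filter like this can work (the actual rules live in the library; the patterns and thresholds below are illustrative, chosen to match the examples above):

```python
import re

# High-signal keywords that make a statement worth keeping
SIGNAL = re.compile(r"\b(error|timeout|resolved|fixed|critical|failed|root cause)\b", re.I)
# Procedural narration LLMs tend to emit while working
NARRATION = re.compile(r"^(I'll|Let me|Starting|Now I)\b", re.I)
# A bare filesystem path and nothing else
PATH_ONLY = re.compile(r"^\S*/\S+$")

def is_salient(text: str) -> bool:
    text = text.strip()
    if len(text) < 8:                               # short noise
        return False
    if NARRATION.match(text) or PATH_ONLY.match(text):
        return False
    if SIGNAL.search(text):                         # keyword hit: keep
        return True
    return len(text) >= 60                          # long substantive statement

print(is_salient("I'll read the file now"))                        # False
print(is_salient("/usr/local/bin/python"))                         # False
print(is_salient("DNS timeout was the root cause of the outage"))  # True
```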
Sparse Distributed Representations come from Numenta's HTM theory and Kanerva's Sparse Distributed Memory (1988). Key properties:
- Noise tolerance: Partial matches work naturally — a query doesn't need to exactly match to retrieve relevant memories
- No training required: Unlike embeddings, SDR encoding is deterministic and instant
- Compositionality: Similar texts produce overlapping bit patterns automatically via trigram hashing
- Tiny footprint: 512 bytes per memory vs ~1.5-6KB for embedding vectors
The trade-off vs embeddings: SDR captures lexical similarity (shared words/substrings), not semantic similarity ("dog" ≠ "puppy"). For factual/technical memory (logs, incidents, configs), lexical overlap is usually sufficient and often more precise.
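A sketch of trigram-based SDR encoding makes the compositionality property concrete: shared substrings produce shared trigrams, which hash to shared bit positions. The hash function and helper names here are illustrative; only the 4096-bit width and 80-bit sparsity cap come from the description above.

```python
import hashlib

SDR_BITS = 4096
MAX_ACTIVE = 80

def encode(text: str) -> frozenset[int]:
    """Map text to a set of active bit positions via trigram hashing."""
    text = f" {text.lower()} "                     # pad so word edges form trigrams
    trigrams = {text[i:i + 3] for i in range(len(text) - 2)}
    bits = sorted(
        int(hashlib.md5(t.encode()).hexdigest(), 16) % SDR_BITS for t in trigrams
    )
    return frozenset(bits[:MAX_ACTIVE])            # cap sparsity at 80 active bits

def overlap(a: str, b: str) -> float:
    """Jaccard overlap of two encodings -- a proxy for retrieval similarity."""
    sa, sb = encode(a), encode(b)
    return len(sa & sb) / max(len(sa | sb), 1)

# Lexically similar texts overlap heavily; unrelated texts barely at all
print(overlap("database timeout", "database connection timeout"))
print(overlap("database timeout", "payment gateway returns 503"))
```

Note the lexical bias: "database timeout" and "database connection timeout" share most of their trigrams, while a paraphrase with no shared words would score near zero — exactly the trade-off described above.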
Measured on 1,600 stored memories with 100 queries (benchmark details):
| Method | Top-1 Accuracy | Top-5 Accuracy | MRR | Search Time/Query |
|---|---|---|---|---|
| SDR 4096-bit Hamming | 92% | 98% | 0.949 | 2.4ms |
| Dense 32-dim cosine (MiniLM-L6) | 81% | 90% | 0.852 | 0.8ms |
SDR wins on accuracy by 11 points top-1, at the cost of ~1.6ms extra search time per query (still sub-5ms). Dense embeddings are faster because they use optimized BLAS, but SDR needs no model download or GPU.
```shell
# Run benchmarks yourself (requires [research] extras)
pip install sdr-memory[research]
python benchmarks/vector_duel.py --data your_data.jsonl
```

| Parameter | Default | Environment Variable | Description |
|---|---|---|---|
| DB path | `~/.sdr-memory/memory.db` | `SDR_MEMORY_DB` | SQLite database location |
| Socket path | `/tmp/sdr-memory.sock` | `SDR_MEMORY_SOCK` | Unix socket for daemon mode |
| SDR bits | 4096 | — | Width of the SDR vector |
| Max active bits | 80 | — | Sparsity cap per encoding |
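For example, a project-local setup using the environment variables above (the paths here are arbitrary choices):

```shell
# Keep this project's memories and socket out of the global defaults
export SDR_MEMORY_DB="$PWD/.sdr/memory.db"
export SDR_MEMORY_SOCK="$PWD/.sdr/daemon.sock"

sdr-memory serve &
sdr-memory call --json '{"action":"stats"}'
```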
sdr-memory is designed to give LLM agents persistent memory across sessions. Example integration with Claude Code hooks:

```python
# On session start — recall relevant context
from sdr_memory.server import client_call
from pathlib import Path

result = client_call(Path("/tmp/sdr-memory.sock"), {
    "action": "query",
    "text": "What was I working on?",
    "limit": 10,
})
context = "\n".join(r["text"] for r in result.get("results", []))
# Inject `context` into the LLM system prompt

# On session end — store what was learned
client_call(Path("/tmp/sdr-memory.sock"), {
    "action": "store",
    "text": "Resolved OOM by increasing connection pool from 10 to 50",
    "metadata": {"session": "abc123", "source": "manual"},
})
```

The salience filter ensures only meaningful facts get stored — not the "Let me read the file..." filler that LLMs produce.
This project started as a research question: Can we build memory for LLM agents that works more like human memory — associative, lossy, and reconstructive — rather than brute-force database search?
The examples/research/ directory contains the experiments that led to this implementation:
- Experiment 01: Baseline SDR retrieval accuracy
- Experiment 02: Semantic embedding comparison (sentence-transformers)
- Experiment 02 Hybrid: SDR + semantic fusion
- Experiment 05: Memory reconstruction from compressed representations
- Vector Duel: Head-to-head SDR vs dense embeddings (the benchmark)
See docs/research_context.md for the full research narrative and docs/architecture.md for technical details.
```shell
# Core only (numpy + sqlite, ~5MB)
pip install sdr-memory

# With research extras (torch, transformers, etc.)
pip install sdr-memory[research]

# Development
git clone https://github.com/angelsu/sdr-memory.git
cd sdr-memory
pip install -e ".[dev]"
pytest
```

Contributions welcome! Some areas that need work:
- Semantic hybrid mode: Combine SDR with lightweight embeddings for best of both worlds
- Forgetting curve: Time-decay scoring so old memories fade naturally
- Batch operations: Bulk store/query for large imports
- Index optimization: Replace brute-force scan with approximate nearest neighbor for >100K memories
- More benchmarks: Compare against Mem0, Zep, MemoryBank on standard datasets
```shell
# Development setup
git clone https://github.com/angelsu/sdr-memory.git
cd sdr-memory
pip install -e ".[dev]"
pytest --cov
```

MIT — see LICENSE.
Built with curiosity about how memory actually works, not just how databases store things.