
Commit 06eeef0

Merge pull request #8 from Ashish-dwi99/beta-v1
feat: CLS Distillation Memory v1.4 — bio-inspired consolidation + pro…
2 parents c83902b + 2ca3dd9 commit 06eeef0

49 files changed

Lines changed: 3031 additions & 343 deletions


README.md

Lines changed: 51 additions & 3 deletions
@@ -55,6 +55,9 @@ But Engram isn't just a handoff bus. It solves four fundamental problems with ho
 | **Nobody forgets** | Store everything forever | **Ebbinghaus decay curve, ~45% less storage** |
 | **Agents write with no oversight** | Store directly | **Staging + verification + trust scoring** |
 | **No episodic memory** | Vector search only | **CAST scenes (time/place/topic)** |
+| **No consolidation** | Store everything as-is | **CLS Distillation — replay-driven fact extraction** |
+| **Single decay rate** | One exponential curve | **Multi-trace Benna-Fusi model (fast/mid/slow)** |
+| **No intent routing** | Same search for all queries | **Episodic vs semantic query classification** |
 | Multi-modal encoding | Single embedding | **5 retrieval paths (EchoMem)** |
 | Cross-agent memory sharing | Per-agent silos | **Scoped retrieval with all-but-mask privacy** |
 | Concurrent multi-agent access | Single-process locks | **sqlite-vec WAL mode — multiple agents, one DB** |
@@ -90,6 +93,9 @@ pip install "engram-memory[sqlite_vec]"
 # OpenAI provider add-on
 pip install "engram-memory[openai]"
 
+# NVIDIA provider add-on (Llama 3.1, nv-embed-v1, etc.)
+pip install "engram-memory[nvidia]"
+
 # Ollama provider add-on
 pip install "engram-memory[ollama]"
 ```
@@ -144,7 +150,7 @@ Engram has five opinions about how memory should work:
 
 1. **Switching agents shouldn't mean starting over.** When an agent pauses — rate limit, crash, tool switch — it saves a session digest. The next agent loads it and continues. Zero re-explanation.
 2. **Agents need shared real-time state.** Active Memory lets agents broadcast what they're doing right now — no polling, no coordination protocol. Agent A posts "editing auth.py"; Agent B sees it instantly.
-3. **Memory has a lifecycle.** New memories start in short-term (SML), get promoted to long-term (LML) through repeated access, and fade away through Ebbinghaus decay if unused.
+3. **Memory has a lifecycle.** New memories start in short-term (SML), get promoted to long-term (LML) through repeated access, and fade away through Ebbinghaus decay if unused. Sleep cycles distill episodic conversations into durable semantic facts (CLS consolidation), cascade strength traces from fast to slow, and prune redundant or contradictory memories.
 4. **Agents are untrusted writers.** Every write is a proposal that lands in staging. Trusted agents can auto-merge; untrusted ones wait for approval.
 5. **Scoping is mandatory.** Every memory is scoped by user. Agents see only what they're allowed to — everything else gets the "all but mask" treatment (structure visible, details redacted).
 
@@ -209,7 +215,7 @@ Engram has five opinions about how memory should work:
 
 ### The Memory Stack
 
-Engram combines seven systems, each handling a different aspect of how memory should work:
+Engram combines multiple systems, each handling a different aspect of how memory should work:
 
 #### Active Memory — Real-Time Signal Bus
 
@@ -289,6 +295,48 @@ Scene: "Engram v2 architecture session"
 Memories: [mem_1, mem_2] ← semantic facts extracted
 ```
 
+#### CLS Distillation Memory — Bio-Inspired Consolidation (v1.4)
+
+Inspired by Complementary Learning Systems (CLS) theory — how the hippocampus and neocortex work together in the brain. Engram v1.4 adds five mechanisms that make memory smarter over time:
+
+**1. Episodic/Semantic Memory Types**
+Conversations are stored as `episodic` memories. During sleep cycles, a replay-driven distiller extracts durable facts into `semantic` memories — just like how your brain consolidates experiences into knowledge overnight.
+
+**2. Replay-Driven Distillation**
+The `ReplayDistiller` samples recent episodic memories, groups them by scene/time, and uses the LLM to extract reusable semantic facts. Every distilled fact links back to its source episodes (provenance tracking).
+
+**3. Multi-Mechanism Forgetting**
+Beyond simple exponential decay, Engram now has three advanced forgetting mechanisms:
+- **Interference Pruning** — contradictory memories are detected and the weaker one is demoted
+- **Redundancy Collapse** — near-duplicate memories are auto-fused
+- **Homeostatic Normalization** — memory budgets per namespace prevent unbounded growth
+
+**4. Multi-Timescale Strength Traces (Benna-Fusi Model)**
+Each memory has three strength traces instead of one scalar:
+```
+s_fast (decay: 0.20/day)  — recent access, volatile
+s_mid  (decay: 0.05/day)  — medium-term consolidation
+s_slow (decay: 0.005/day) — durable long-term knowledge
+```
+New memories start in `s_fast`. Sleep cycles cascade strength: `fast → mid → slow`. Important facts become nearly permanent.
+
+**5. Intent-Aware Retrieval Routing**
+Queries are classified as episodic ("when did we discuss..."), semantic ("what is the deployment process?"), or mixed. Matching memory types get a retrieval boost — the right type of answer for the right type of question.
+
+```
+┌──────────────────────────────────────────────────────────┐
+│                    Sleep Cycle (v1.4)                    │
+│                                                          │
+│  1. Standard FadeMem decay (SML/LML)                     │
+│  2. Multi-trace decay (fast/mid/slow independently)      │
+│  3. Interference pruning (contradict → demote weaker)    │
+│  4. Redundancy collapse (near-dupes → fuse)              │
+│  5. Homeostatic normalization (budget enforcement)       │
+│  6. Replay distillation (episodic → semantic facts)      │
+│  7. Trace cascade (fast → mid → slow consolidation)      │
+└──────────────────────────────────────────────────────────┘
+```
+
 #### Handoff Bus — Cross-Agent Continuity
 
 Engram now defaults to a zero-intervention continuity model: MCP adapters automatically request resume context before tool execution and auto-write checkpoints on lifecycle events (`tool_complete`, `agent_pause`, `agent_end`). The legacy tools (`save_session_digest`, `get_last_session`, `list_sessions`) remain available for compatibility.
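The multi-timescale traces and the `fast → mid → slow` cascade described in the README hunk above can be sketched in plain Python. This is a minimal illustration using the decay rates quoted there; the function names and the 10% cascade fraction are assumptions for the sketch, not Engram's actual implementation:

```python
import math

# Per-day decay rates quoted in the README (Benna-Fusi-style traces).
DECAY = {"fast": 0.20, "mid": 0.05, "slow": 0.005}

def decay_traces(traces, days=1.0):
    """Each trace decays exponentially at its own rate."""
    return {k: v * math.exp(-DECAY[k] * days) for k, v in traces.items()}

def cascade(traces, rate=0.1):
    """Sleep-cycle consolidation: shift a fraction of strength fast -> mid -> slow."""
    fast, mid, slow = traces["fast"], traces["mid"], traces["slow"]
    return {
        "fast": fast * (1 - rate),
        "mid": mid * (1 - rate) + fast * rate,
        "slow": slow + mid * rate,
    }

traces = {"fast": 1.0, "mid": 0.0, "slow": 0.0}  # freshly written memory
for _ in range(30):                              # thirty nightly sleep cycles
    traces = cascade(decay_traces(traces))
```

After a month of cycles the volatile `fast` trace has nearly vanished while the `slow` trace has accumulated durable strength, which is the "important facts become nearly permanent" behavior the README claims.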
@@ -785,7 +833,7 @@ Engram is based on:
 | Multi-hop Reasoning | +12% accuracy |
 | Retrieval Precision | +8% on LTI-Bench |
 
-Biological inspirations: Ebbinghaus Forgetting Curve → exponential decay, Spaced Repetition → access boosts strength, Sleep Consolidation → SML → LML promotion, Working Memory → Active Memory signal bus, Conscious/Subconscious Split → Active vs Passive memory, Production Effect → echo encoding, Elaborative Encoding → deeper processing = stronger memory.
+Biological inspirations: Ebbinghaus Forgetting Curve → exponential decay, Spaced Repetition → access boosts strength, Sleep Consolidation → SML → LML promotion + CLS replay distillation, Benna-Fusi Model → multi-timescale strength traces (fast/mid/slow), Complementary Learning Systems → episodic-to-semantic consolidation, Working Memory → Active Memory signal bus, Conscious/Subconscious Split → Active vs Passive memory, Production Effect → echo encoding, Elaborative Encoding → deeper processing = stronger memory.
 
 ---
 
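The intent-aware retrieval routing described in the v1.4 README notes (episodic vs semantic vs mixed queries) could be approximated with a keyword heuristic. A real implementation would likely use the LLM; the cue lists and the names `classify_intent` / `boost` here are illustrative assumptions, not Engram's API:

```python
# Surface cues for each query intent (assumed examples, mirroring the README's
# "when did we discuss..." vs "what is the deployment process?" framing).
EPISODIC_CUES = ("when did", "last time", "yesterday", "in our last session")
SEMANTIC_CUES = ("what is", "how do", "how does", "explain", "define")

def classify_intent(query: str) -> str:
    """Label a query as episodic, semantic, or mixed."""
    q = query.lower()
    episodic = any(cue in q for cue in EPISODIC_CUES)
    semantic = any(cue in q for cue in SEMANTIC_CUES)
    if episodic and not semantic:
        return "episodic"
    if semantic and not episodic:
        return "semantic"
    return "mixed"

def boost(results, intent, factor=1.25):
    """Re-rank results, boosting memories whose type matches the intent."""
    return sorted(
        results,
        key=lambda r: r["score"] * (factor if r["type"] == intent else 1.0),
        reverse=True,
    )

ranked = boost(
    [{"type": "semantic", "score": 0.50}, {"type": "episodic", "score": 0.55}],
    classify_intent("What is the deployment process?"),
)
```

With the semantic boost applied, the matching-type memory outranks the slightly higher raw score, which is the "right type of answer for the right type of question" effect.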
engram/api/app.py

Lines changed: 25 additions & 7 deletions
@@ -85,22 +85,32 @@ class DecayResponse(BaseModel):
     redoc_url="/redoc",
 )
 
+_cors_origins_raw = os.environ.get("ENGRAM_CORS_ORIGINS", "")
+_cors_origins = (
+    [o.strip() for o in _cors_origins_raw.split(",") if o.strip()]
+    if _cors_origins_raw
+    else ["http://localhost:3000", "http://127.0.0.1:3000"]
+)
+
 app.add_middleware(
     CORSMiddleware,
-    allow_origins=["*"],
+    allow_origins=_cors_origins,
     allow_credentials=True,
     allow_methods=["*"],
     allow_headers=["*"],
 )
 add_metrics_routes(app)
 
 _memory: Optional[Memory] = None
+_memory_lock = threading.Lock()
 
 
 def get_memory() -> Memory:
     global _memory
     if _memory is None:
-        _memory = Memory()
+        with _memory_lock:
+            if _memory is None:
+                _memory = Memory()
     return _memory
 
 
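The origin-parsing logic this hunk introduces can be restated as a standalone function: a comma-separated `ENGRAM_CORS_ORIGINS` value overrides the localhost defaults, with whitespace and empty entries stripped. The helper name `parse_cors_origins` is assumed for illustration; the parsing rules mirror the diff above:

```python
def parse_cors_origins(raw: str) -> list[str]:
    """Split a comma-separated origin list; fall back to localhost defaults."""
    if raw:
        # Non-empty env var: take each comma-separated entry, trimmed,
        # skipping blanks (e.g. from trailing commas).
        return [o.strip() for o in raw.split(",") if o.strip()]
    # Unset/empty env var: permit the local dev frontends only.
    return ["http://localhost:3000", "http://127.0.0.1:3000"]
```

This replaces the earlier `allow_origins=["*"]`, which combined with `allow_credentials=True` is a known CORS foot-gun; an explicit allow-list is the safer default.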
@@ -403,7 +413,7 @@ async def search_memories(request: SearchRequestV2, http_request: Request):
         raise require_session_error(exc)
     except Exception as exc:
         logger.exception("Error searching memories")
-        raise HTTPException(status_code=500, detail=str(exc))
+        raise HTTPException(status_code=500, detail="Internal server error")
 
 
 @app.get("/v1/scenes")
@@ -494,7 +504,7 @@ async def add_memory(request: AddMemoryRequestV2, http_request: Request):
         raise require_session_error(exc)
     except Exception as exc:
         logger.exception("Error creating proposal/direct memory")
-        raise HTTPException(status_code=500, detail=str(exc))
+        raise HTTPException(status_code=500, detail="Internal server error")
 
 
 @app.get("/v1/staging/commits")
@@ -779,15 +789,19 @@ async def get_memory_by_id(memory_id: str):
 
 @app.put("/v1/memories/{memory_id}", response_model=Dict[str, Any])
 @app.put("/v1/memories/{memory_id}/", response_model=Dict[str, Any])
-async def update_memory(memory_id: str, request: Dict[str, Any]):
+async def update_memory(memory_id: str, request: Dict[str, Any], http_request: Request):
+    token = get_token_from_request(http_request)
+    require_token_for_untrusted_request(http_request, token)
     memory = get_memory()
     result = memory.update(memory_id, request)
     return result
 
 
 @app.delete("/v1/memories/{memory_id}")
 @app.delete("/v1/memories/{memory_id}/")
-async def delete_memory(memory_id: str):
+async def delete_memory(memory_id: str, http_request: Request):
+    token = get_token_from_request(http_request)
+    require_token_for_untrusted_request(http_request, token)
     memory = get_memory()
     memory.delete(memory_id)
     return {"status": "deleted", "id": memory_id}
@@ -796,14 +810,18 @@ async def delete_memory(memory_id: str):
 @app.delete("/v1/memories", response_model=Dict[str, Any])
 @app.delete("/v1/memories/", response_model=Dict[str, Any])
 async def delete_memories(
+    http_request: Request,
     user_id: Optional[str] = Query(default=None),
     agent_id: Optional[str] = Query(default=None),
     run_id: Optional[str] = Query(default=None),
     app_id: Optional[str] = Query(default=None),
+    dry_run: bool = Query(default=False, description="Preview what would be deleted without actually deleting"),
 ):
+    token = get_token_from_request(http_request)
+    require_token_for_untrusted_request(http_request, token)
     memory = get_memory()
     try:
-        return memory.delete_all(user_id=user_id, agent_id=agent_id, run_id=run_id, app_id=app_id)
+        return memory.delete_all(user_id=user_id, agent_id=agent_id, run_id=run_id, app_id=app_id, dry_run=dry_run)
     except FadeMemValidationError as exc:
         raise HTTPException(status_code=400, detail=exc.message)
 
engram/api/schemas.py

Lines changed: 4 additions & 4 deletions
@@ -93,31 +93,31 @@ class HandoffSessionDigestRequest(BaseModel):
 
 
 class SearchRequestV2(BaseModel):
-    query: str
+    query: str = Field(min_length=1, max_length=10000)
     user_id: str = Field(default="default")
     agent_id: Optional[str] = Field(default=None)
     limit: int = Field(default=10, ge=1, le=100)
     categories: Optional[List[str]] = Field(default=None)
 
 
 class AddMemoryRequestV2(BaseModel):
-    content: Optional[str] = Field(default=None)
+    content: Optional[str] = Field(default=None, max_length=100000)
     messages: Optional[Union[str, List[Dict[str, Any]]]] = Field(default=None)
     user_id: str = Field(default="default")
     agent_id: Optional[str] = Field(default=None)
     metadata: Optional[Dict[str, Any]] = Field(default=None)
    categories: Optional[List[str]] = Field(default=None)
     scope: Optional[str] = Field(default="work")
     namespace: Optional[str] = Field(default="default")
-    mode: str = Field(default="staging", description="staging|direct")
+    mode: Literal["staging", "direct"] = Field(default="staging", description="staging|direct")
     infer: bool = Field(default=False)
     source_app: Optional[str] = Field(default=None)
     source_type: str = Field(default="rest")
     source_event_id: Optional[str] = Field(default=None)
 
 
 class SceneSearchRequest(BaseModel):
-    query: str
+    query: str = Field(min_length=1, max_length=10000)
     user_id: str = Field(default="default")
     agent_id: Optional[str] = Field(default=None)
     limit: int = Field(default=10, ge=1, le=100)
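The `mode` change above tightens a free-form string into `Literal["staging", "direct"]`, so pydantic rejects any other value at request-parsing time. Restated without pydantic (the `validate_mode` helper is an illustrative assumption, not part of Engram), the enforced behavior is:

```python
# What the Literal["staging", "direct"] annotation enforces: only these two
# values are accepted, and "staging" remains the default.
ALLOWED_MODES = {"staging", "direct"}

def validate_mode(mode: str = "staging") -> str:
    """Accept only the allowed modes; reject anything else."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    return mode
```

Previously a typo like `mode="dierct"` would silently fall through whatever branch handled unknown modes; with the `Literal` type the API returns a 422 validation error instead.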

engram/configs/active.py

Lines changed: 22 additions & 7 deletions
@@ -3,7 +3,7 @@
 from enum import Enum
 from typing import Dict
 
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, field_validator
 
 
 class TTLTier(str, Enum):
@@ -25,6 +25,14 @@ class SignalScope(str, Enum):
     NAMESPACE = "namespace"  # Only agents in same namespace
 
 
+class ConsolidationConfig(BaseModel):
+    """Configuration for active → passive memory consolidation."""
+    promote_critical: bool = True
+    promote_high_read: bool = True
+    promote_read_threshold: int = 3
+    directive_to_passive: bool = True
+
+
 class ActiveMemoryConfig(BaseModel):
     """Configuration for the Active Memory signal bus."""
     enabled: bool = True
@@ -40,11 +48,18 @@
     consolidation_enabled: bool = True
     consolidation_min_age_seconds: int = 600
     consolidation_min_reads: int = 3
+    consolidation: ConsolidationConfig = Field(default_factory=ConsolidationConfig)
 
+    @field_validator("default_ttl_tier")
+    @classmethod
+    def _valid_ttl_tier(cls, v: str) -> str:
+        allowed = {t.value for t in TTLTier}
+        v = str(v).strip().lower()
+        if v not in allowed:
+            return TTLTier.NOTABLE.value
+        return v
 
-class ConsolidationConfig(BaseModel):
-    """Configuration for active → passive memory consolidation."""
-    promote_critical: bool = True
-    promote_high_read: bool = True
-    promote_read_threshold: int = 3
-    directive_to_passive: bool = True
+    @field_validator("max_signals_per_response")
+    @classmethod
+    def _clamp_max_signals(cls, v: int) -> int:
+        return min(100, max(1, int(v)))
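The two validators added above normalize instead of reject: an unknown TTL tier falls back to the `notable` default, and `max_signals_per_response` is clamped into [1, 100]. The clamp logic, restated as a plain function for illustration (the name `clamp_max_signals` is assumed):

```python
def clamp_max_signals(v: int) -> int:
    """Coerce max_signals_per_response into the [1, 100] range
    rather than raising a validation error."""
    return min(100, max(1, int(v)))
```

Clamping (rather than raising) is a deliberate design choice here: a bad config value degrades gracefully to the nearest sane bound instead of crashing agent startup.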
