2 changes: 2 additions & 0 deletions .env.example
@@ -84,6 +84,8 @@
# EMBEDDING_PROVIDER=local # local | openai | voyage | cohere | gemini | openrouter
# LOCAL_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-MiniLM-L12-v2 # Primary local model override
# EMBEDDING_MODEL=Xenova/bge-large-zh-v1.5 # Local fallback alias when LOCAL_EMBEDDING_MODEL is unset
# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models # Optional transformers.js local model/cache directory
# HF_ENDPOINT=https://hf-mirror.com # Optional transformers.js remoteHost mirror; local provider remains local_files_only

# VOYAGE_API_KEY=pa-... # Optimised for code embeddings

6 changes: 4 additions & 2 deletions README.md
@@ -1001,7 +1001,7 @@ npm install @xenova/transformers
| Cohere | `embed-english-v3.0` | Free trial | `EMBEDDING_PROVIDER=cohere` + `COHERE_API_KEY`; general purpose |
| OpenRouter | Any model | Varies | `EMBEDDING_PROVIDER=openrouter` + `OPENROUTER_API_KEY`; set `OPENROUTER_EMBEDDING_DIMENSIONS` for non-1536 models |

`LOCAL_EMBEDDING_MODEL` should name a Xenova feature-extraction model. agentmemory derives dimensions for common 384/512/768/1024-dimensional Xenova models and otherwise falls back to 384 unless `OPENAI_EMBEDDING_DIMENSIONS` is set. The dimension guard rejects mismatched vectors instead of silently corrupting the vector index. Local model loading uses transformers.js offline/local-file mode, so selected models must already be available in the transformers.js model cache.
`LOCAL_EMBEDDING_MODEL` should name a Xenova feature-extraction model. agentmemory derives dimensions for common 384/512/768/1024-dimensional Xenova models and otherwise falls back to 384 unless `OPENAI_EMBEDDING_DIMENSIONS` is set. The dimension guard rejects mismatched vectors instead of silently corrupting the vector index. Local model loading uses transformers.js offline/local-file mode, so selected models must already be available in the transformers.js model cache. Set `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` to point transformers.js local model lookup and filesystem cache at a prepared directory. Set `HF_ENDPOINT` to configure transformers.js `remoteHost` for mirror/proxy setups; the local provider still passes `local_files_only: true`, so this does not enable remote downloads by itself.

---

@@ -1376,7 +1376,7 @@ Reasoning-class models (`o1`-style with `<think>` blocks) can return empty `cont

OpenRouter reasoning models can be configured with `OPENROUTER_REASONING_EFFORT=xhigh|high|medium|low|minimal|none`. Set `OPENROUTER_INCLUDE_REASONING=true` to ask supported OpenRouter models to return reasoning output when they expose it.

Local embeddings are available via `@xenova/transformers` — set `EMBEDDING_PROVIDER=local` to use `paraphrase-multilingual-MiniLM-L12-v2` entirely on-device, or set `LOCAL_EMBEDDING_MODEL` to another Xenova feature-extraction model. Common 384/512/768/1024-dimensional local models are recognized automatically; set `OPENAI_EMBEDDING_DIMENSIONS` for custom local models. With no `EMBEDDING_PROVIDER`, agentmemory uses BM25+Graph search and does not call a text embedding provider.
Local embeddings are available via `@xenova/transformers` — set `EMBEDDING_PROVIDER=local` to use `paraphrase-multilingual-MiniLM-L12-v2` entirely on-device, or set `LOCAL_EMBEDDING_MODEL` to another Xenova feature-extraction model. Common 384/512/768/1024-dimensional local models are recognized automatically; set `OPENAI_EMBEDDING_DIMENSIONS` for custom local models. Set `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` for a prepared transformers.js local model/cache directory, and `HF_ENDPOINT` to set transformers.js `remoteHost` for mirror/proxy environments. The current local provider keeps `local_files_only: true`, so configured models must still be available locally. With no `EMBEDDING_PROVIDER`, agentmemory uses BM25+Graph search and does not call a text embedding provider.

### Cost-aware model selection

@@ -1564,6 +1564,8 @@ Create `~/.agentmemory/.env`:
# EMBEDDING_PROVIDER=local
# LOCAL_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-MiniLM-L12-v2
# EMBEDDING_MODEL=Xenova/bge-large-zh-v1.5 # Fallback alias for local embeddings when LOCAL_EMBEDDING_MODEL is unset
# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models # Optional transformers.js local model/cache directory
# HF_ENDPOINT=https://hf-mirror.com # Optional transformers.js remoteHost mirror; local provider remains local_files_only
# VOYAGE_API_KEY=...
# OPENAI_API_KEY=sk-...
# OPENAI_BASE_URL=https://api.openai.com # Override for Azure / vLLM / LM Studio / proxies
176 changes: 176 additions & 0 deletions docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md
@@ -0,0 +1,176 @@
# Issue 798 Local Embedding Cache And HF Mirror Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add the remaining local embedding cache-directory and Hugging Face mirror configuration from Issue 798 while preserving the Issue 917 model/dimension behavior.

**Architecture:** Keep the change inside the local transformers configuration boundary. Extend `src/providers/transformers.ts` so every transformers.js load applies Node-safe WASM flags plus local model/cache and remote-host settings from environment, then cover it through existing embedding provider tests.

**Tech Stack:** TypeScript ESM, Vitest, `@xenova/transformers` v2.17.2 env configuration.

---

## Files

- Modify: `src/providers/transformers.ts`
- Modify: `test/embedding-provider.test.ts`
- Modify: `README.md`
- Modify: `.env.example`
- Modify: `docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md`
- Modify: `docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md`

Spec path: none. Source of truth is Issue 798, the current user delegation, the task record, and local repo behavior.

GitHub PR prep: mandatory local branch prep after implementation per `github-feature-loop`; no fetch, pull, push, or PR creation is approved.

Security-sensitive surfaces: user-controlled filesystem path and remote host configuration for a local embedding provider. No auth, secret, dependency, REST, MCP, schema, or persistence changes planned.

## Task 1: Add failing config tests

**Files:**
- Modify: `test/embedding-provider.test.ts`

- [ ] **Step 1: Add env cleanup keys**

Add `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` and `HF_ENDPOINT` to `ENV_KEYS`.

- [ ] **Step 2: Add direct transformer configuration tests**

Add tests under `describe("configureTransformersForNode", ...)` that import the real `src/providers/transformers.ts` through `freshTransformersModule()` and pass a fake module object:

```ts
const transformers = {
  pipeline: vi.fn(),
  env: {
    localModelPath: "/models/",
    cacheDir: "/cache/",
    remoteHost: "https://huggingface.co/",
    backends: {
      onnx: {
        wasm: { numThreads: 4 },
      },
    },
  },
};
```

- [ ] **Step 3: Add cache/local dir tests**

Add one test that sets `process.env.AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models`, calls `configureTransformersForNode(transformers)`, and expects both `transformers.env.localModelPath` and `transformers.env.cacheDir` to equal `/opt/agentmemory-models`.

Add one `.env`-backed test by writing `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/tmp/agentmemory-dotenv-models` into `${sandboxHome}/.agentmemory/.env`, calling `configureTransformersForNode(transformers)`, and expecting both fields to equal `/tmp/agentmemory-dotenv-models`.
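
The core expectation of the first cache/local dir test can be sketched without the test runner. This is only an illustration: `applyModelDir` and `FakeEnv` are hypothetical stand-ins so the snippet runs on its own, whereas the real test calls `configureTransformersForNode(transformers)` after setting `process.env` and asserts with Vitest's `expect`.

```typescript
// Hypothetical, reduced env shape: only the two fields this test cares about.
interface FakeEnv {
  localModelPath?: string;
  cacheDir?: string;
}

// Hypothetical stand-in for the model-dir part of configureTransformersForNode.
// A blank or missing value leaves the existing defaults untouched.
function applyModelDir(env: FakeEnv, rawDir: string | undefined): void {
  const dir = rawDir?.trim();
  if (dir) {
    env.localModelPath = dir;
    env.cacheDir = dir;
  }
}

// Case 1: a configured directory overrides both fields.
const configured: FakeEnv = { localModelPath: "/models/", cacheDir: "/cache/" };
applyModelDir(configured, "/opt/agentmemory-models");

// Case 2: a whitespace-only value is treated as "not configured".
const untouched: FakeEnv = { localModelPath: "/models/", cacheDir: "/cache/" };
applyModelDir(untouched, "   ");
```

In the real test, the two assertions would be that `transformers.env.localModelPath` and `transformers.env.cacheDir` both equal `/opt/agentmemory-models`; a blank-value case like the second one above is an optional extra, not something the plan requires.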

- [ ] **Step 4: Add HF mirror test**

Add a test that sets `HF_ENDPOINT=https://hf-mirror.com`, calls `configureTransformersForNode(transformers)`, and expects `transformers.env.remoteHost` to equal `https://hf-mirror.com/`.

- [ ] **Step 5: Verify RED**

Run:

```bash
corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts
```

Expected before implementation: fails on the new cache/mirror expectations.

## Task 2: Implement transformers env configuration

**Files:**
- Modify: `src/providers/transformers.ts`

- [ ] **Step 1: Extend `TransformersModule` env type**

Add optional fields under `env`: `localModelPath`, `cacheDir`, and `remoteHost`.
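
As a rough sketch of what the extended type could look like, assuming its shape mirrors the fake module object used in the Task 1 tests. The names `TransformersEnvSketch` and `TransformersModuleSketch` are illustrative only; the actual `TransformersModule` type in `src/providers/transformers.ts` may be declared differently.

```typescript
// Illustrative env shape. All three new fields are optional so that
// existing callers and older fakes without them still type-check.
interface TransformersEnvSketch {
  localModelPath?: string; // directory searched for local model files
  cacheDir?: string;       // filesystem cache directory
  remoteHost?: string;     // base URL of the model hub or a mirror
  backends?: {
    onnx?: {
      wasm?: { numThreads?: number };
    };
  };
}

// Illustrative module shape: a pipeline factory plus the optional env object.
interface TransformersModuleSketch {
  pipeline: (...args: unknown[]) => unknown;
  env?: TransformersEnvSketch;
}

// A module object that sets none of the new fields remains valid.
const legacyModule: TransformersModuleSketch = {
  pipeline: () => undefined,
  env: { backends: { onnx: { wasm: { numThreads: 4 } } } },
};
```

Keeping the fields optional matches the `transformers.env` guard in the Step 3 snippet, which only writes when `env` is present.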

- [ ] **Step 2: Add env helpers**

Read `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` and `HF_ENDPOINT` through `getEnvVar()`. Trim blank values; normalize `HF_ENDPOINT` to a trailing slash.
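
A minimal sketch of the two normalization rules, written as pure functions so they can be checked in isolation. The helper names `trimmedOrUndefined`, `withTrailingSlash`, and `resolveRemoteHost` are hypothetical; the plan itself only requires that the values are read through `getEnvVar()`, which is not reproduced here.

```typescript
// Treat undefined, empty, and whitespace-only values as "not configured".
function trimmedOrUndefined(raw: string | undefined): string | undefined {
  const value = raw?.trim();
  return value ? value : undefined;
}

// Append a trailing slash only when one is missing.
function withTrailingSlash(url: string): string {
  return url.endsWith("/") ? url : `${url}/`;
}

// Combine both rules for an HF_ENDPOINT-style value.
function resolveRemoteHost(raw: string | undefined): string | undefined {
  const endpoint = trimmedOrUndefined(raw);
  return endpoint === undefined ? undefined : withTrailingSlash(endpoint);
}
```

For example, `resolveRemoteHost("https://hf-mirror.com")` yields `"https://hf-mirror.com/"`, an input that already ends in a slash is returned unchanged apart from trimming, and a blank input yields `undefined` so the transformers.js default is left alone.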

- [ ] **Step 3: Configure transformers.js env**

In `configureTransformersForNode`, after ONNX thread config:

```ts
const modelDir = getEnvVar("AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR")?.trim();
if (modelDir && transformers.env) {
  transformers.env.localModelPath = modelDir;
  transformers.env.cacheDir = modelDir;
}

const hfEndpoint = getEnvVar("HF_ENDPOINT")?.trim();
if (hfEndpoint && transformers.env) {
  transformers.env.remoteHost = hfEndpoint.endsWith("/")
    ? hfEndpoint
    : `${hfEndpoint}/`;
}
```

- [ ] **Step 4: Verify GREEN**

Run the focused Vitest command from Task 1 and expect all tests in `test/embedding-provider.test.ts` to pass.

## Task 3: Update docs

**Files:**
- Modify: `README.md`
- Modify: `.env.example`

- [ ] **Step 1: README embedding provider docs**

Extend the local embedding paragraph to mention:
- `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` points transformers.js local model lookup and filesystem cache at a prepared directory.
- `HF_ENDPOINT` sets transformers.js `env.remoteHost` for mirror/proxy setups, but the current local provider still passes `local_files_only: true`.
- Existing offline/local-file behavior means selected models still need to be present locally when local-only loading is used.

- [ ] **Step 2: Environment example**

Add commented examples near local embedding config:

```dotenv
# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models # Optional transformers.js local model/cache directory
# HF_ENDPOINT=https://hf-mirror.com # Optional transformers.js remoteHost mirror; local provider remains local_files_only
```

- [ ] **Step 3: Inspect stale references**

Run:

```bash
rg -n "AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR|HF_ENDPOINT|local model/cache|transformers.js model cache" README.md .env.example test/embedding-provider.test.ts src/providers/transformers.ts
```

Expected: references are consistent and no docs claim unimplemented behavior.

## Task 4: Verification and local PR prep

**Files:**
- All task-owned files above.

- [ ] **Step 1: Run focused verification**

```bash
corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts
git diff --check
```

- [ ] **Step 2: Run security/static checks required by touched surface**

Run Semgrep on changed code/docs/task files:

```bash
semgrep scan --config p/default --error --metrics=off src/providers/transformers.ts test/embedding-provider.test.ts README.md .env.example docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md
```

OSV is not required unless dependency, lockfile, container, vendored, or package-manager surfaces change.

- [ ] **Step 3: Run review chain and prepare local commit**

Use passive security review, focused simplification, implementation review, and verification-before-completion before staging. Stage only task-owned files and create a factual commit if all required checks pass.

## Self-Review

- Spec coverage: Issue 798 model configurability is already covered by Issue 917; this plan covers the remaining cache directory and HF mirror gaps plus docs/tests.
- Placeholder scan: no TBD/TODO placeholders remain.
- Plan review corrections: tests now target the real `configureTransformersForNode`, implementation uses `getEnvVar`, and HF mirror documentation preserves the existing offline `local_files_only` contract.
- Type consistency: tests and implementation use `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR`, `HF_ENDPOINT`, `localModelPath`, `cacheDir`, and `remoteHost` consistently.