wbugitlab1 · Jun 17, 2026
diff --git a/.env.example b/.env.example
@@ -84,6 +84,8 @@
 # EMBEDDING_PROVIDER=local                       # local | openai | voyage | cohere | gemini | openrouter
 # LOCAL_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-MiniLM-L12-v2  # Primary local model override
 # EMBEDDING_MODEL=Xenova/bge-large-zh-v1.5       # Local fallback alias when LOCAL_EMBEDDING_MODEL is unset
+# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models  # Optional transformers.js local model/cache directory
+# HF_ENDPOINT=https://hf-mirror.com              # Optional transformers.js remoteHost mirror; local provider remains local_files_only
 
 # VOYAGE_API_KEY=pa-...                          # Optimised for code embeddings
 

diff --git a/README.md b/README.md
@@ -1001,7 +1001,7 @@ npm install @xenova/transformers
 | Cohere | `embed-english-v3.0` | Free trial | `EMBEDDING_PROVIDER=cohere` + `COHERE_API_KEY`; general purpose |
 | OpenRouter | Any model | Varies | `EMBEDDING_PROVIDER=openrouter` + `OPENROUTER_API_KEY`; set `OPENROUTER_EMBEDDING_DIMENSIONS` for non-1536 models |
 
-`LOCAL_EMBEDDING_MODEL` should name a Xenova feature-extraction model. agentmemory derives dimensions for common 384/512/768/1024-dimensional Xenova models and otherwise falls back to 384 unless `OPENAI_EMBEDDING_DIMENSIONS` is set. The dimension guard rejects mismatched vectors instead of silently corrupting the vector index. Local model loading uses transformers.js offline/local-file mode, so selected models must already be available in the transformers.js model cache.
+`LOCAL_EMBEDDING_MODEL` should name a Xenova feature-extraction model. agentmemory derives dimensions for common 384/512/768/1024-dimensional Xenova models and otherwise falls back to 384 unless `OPENAI_EMBEDDING_DIMENSIONS` is set. The dimension guard rejects mismatched vectors instead of silently corrupting the vector index. Local model loading uses transformers.js offline/local-file mode, so selected models must already be available in the transformers.js model cache. Set `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` to point transformers.js local model lookup and filesystem cache at a prepared directory. Set `HF_ENDPOINT` to configure transformers.js `remoteHost` for mirror/proxy setups; the local provider still passes `local_files_only: true`, so this does not enable remote downloads by itself.
 
 ---
 
@@ -1376,7 +1376,7 @@ Reasoning-class models (`o1`-style with `<think>` blocks) can return empty `cont
 
 OpenRouter reasoning models can be configured with `OPENROUTER_REASONING_EFFORT=xhigh|high|medium|low|minimal|none`. Set `OPENROUTER_INCLUDE_REASONING=true` to ask supported OpenRouter models to return reasoning output when they expose it.
 
-Local embeddings are available via `@xenova/transformers` — set `EMBEDDING_PROVIDER=local` to use `paraphrase-multilingual-MiniLM-L12-v2` entirely on-device, or set `LOCAL_EMBEDDING_MODEL` to another Xenova feature-extraction model. Common 384/512/768/1024-dimensional local models are recognized automatically; set `OPENAI_EMBEDDING_DIMENSIONS` for custom local models. With no `EMBEDDING_PROVIDER`, agentmemory uses BM25+Graph search and does not call a text embedding provider.
+Local embeddings are available via `@xenova/transformers` — set `EMBEDDING_PROVIDER=local` to use `paraphrase-multilingual-MiniLM-L12-v2` entirely on-device, or set `LOCAL_EMBEDDING_MODEL` to another Xenova feature-extraction model. Common 384/512/768/1024-dimensional local models are recognized automatically; set `OPENAI_EMBEDDING_DIMENSIONS` for custom local models. Set `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` for a prepared transformers.js local model/cache directory, and `HF_ENDPOINT` to set transformers.js `remoteHost` for mirror/proxy environments. The current local provider keeps `local_files_only: true`, so configured models must still be available locally. With no `EMBEDDING_PROVIDER`, agentmemory uses BM25+Graph search and does not call a text embedding provider.
 
 ### Cost-aware model selection
 
@@ -1564,6 +1564,8 @@ Create `~/.agentmemory/.env`:
 # EMBEDDING_PROVIDER=local
 # LOCAL_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-MiniLM-L12-v2
 # EMBEDDING_MODEL=Xenova/bge-large-zh-v1.5 # Fallback alias for local embeddings when LOCAL_EMBEDDING_MODEL is unset
+# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models # Optional transformers.js local model/cache directory
+# HF_ENDPOINT=https://hf-mirror.com       # Optional transformers.js remoteHost mirror; local provider remains local_files_only
 # VOYAGE_API_KEY=...
 # OPENAI_API_KEY=sk-...
 # OPENAI_BASE_URL=https://api.openai.com   # Override for Azure / vLLM / LM Studio / proxies

diff --git a/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md b/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md
@@ -0,0 +1,176 @@
+# Issue 798 Local Embedding Cache And HF Mirror Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Add the remaining local embedding cache-directory and Hugging Face mirror configuration from Issue 798 while preserving the Issue 917 model/dimension behavior.
+
+**Architecture:** Keep the change inside the local transformers configuration boundary. Extend `src/providers/transformers.ts` so every transformers.js load applies Node-safe WASM flags plus local model/cache and remote-host settings from environment, then cover it through existing embedding provider tests.
+
+**Tech Stack:** TypeScript ESM, Vitest, `@xenova/transformers` v2.17.2 env configuration.
+
+---
+
+## Files
+
+- Modify: `src/providers/transformers.ts`
+- Modify: `test/embedding-provider.test.ts`
+- Modify: `README.md`
+- Modify: `.env.example`
+- Modify: `docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md`
+- Modify: `docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md`
+
+Spec path: none. Source of truth is Issue 798, the current user delegation, the task record, and local repo behavior.
+
+GitHub PR prep: mandatory local branch prep after implementation per `github-feature-loop`; no fetch, pull, push, or PR creation is approved.
+
+Security-sensitive surfaces: user-controlled filesystem path and remote host configuration for a local embedding provider. No auth, secret, dependency, REST, MCP, schema, or persistence changes planned.
+
+## Task 1: Add failing config tests
+
+**Files:**
+- Modify: `test/embedding-provider.test.ts`
+
+- [ ] **Step 1: Add env cleanup keys**
+
+Add `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` and `HF_ENDPOINT` to `ENV_KEYS`.
+
+- [ ] **Step 2: Add direct transformer configuration tests**
+
+Add tests under `describe("configureTransformersForNode", ...)` that import the real `src/providers/transformers.ts` through `freshTransformersModule()` and pass a fake module object:
+
+```ts
+const transformers = {
+  pipeline: vi.fn(),
+  env: {
+    localModelPath: "/models/",
+    cacheDir: "/cache/",
+    remoteHost: "https://huggingface.co/",
+    backends: {
+      onnx: {
+        wasm: { numThreads: 4 },
+      },
+    },
+  },
+};
+```
+
+- [ ] **Step 3: Add cache/local dir tests**
+
+Add one test that sets `process.env.AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models`, calls `configureTransformersForNode(transformers)`, and expects both `transformers.env.localModelPath` and `transformers.env.cacheDir` to equal `/opt/agentmemory-models`.
+
+Add one `.env`-backed test by writing `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/tmp/agentmemory-dotenv-models` into `${sandboxHome}/.agentmemory/.env`, calling `configureTransformersForNode(transformers)`, and expecting both fields to equal `/tmp/agentmemory-dotenv-models`.
+
+- [ ] **Step 4: Add HF mirror test**
+
+Add a test that sets `HF_ENDPOINT=https://hf-mirror.com` (no trailing slash), calls `configureTransformersForNode(transformers)`, and expects `transformers.env.remoteHost` to equal the normalized value `https://hf-mirror.com/`.
+
+- [ ] **Step 5: Verify RED**
+
+Run:
+
+```bash
+corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts
+```
+
+Expected before implementation: fails on the new cache/mirror expectations.
+
+## Task 2: Implement transformers env configuration
+
+**Files:**
+- Modify: `src/providers/transformers.ts`
+
+- [ ] **Step 1: Extend `TransformersModule` env type**
+
+Add optional fields under `env`: `localModelPath`, `cacheDir`, and `remoteHost`.
+
+- [ ] **Step 2: Add env helpers**
+
+Read `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` and `HF_ENDPOINT` through `getEnvVar()`. Trim surrounding whitespace and treat blank values as unset; normalize `HF_ENDPOINT` so it ends with a trailing slash.
+
+- [ ] **Step 3: Configure transformers.js env**
+
+In `configureTransformersForNode`, after ONNX thread config:
+
+```ts
+const modelDir = getEnvVar("AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR")?.trim();
+if (modelDir && transformers.env) {
+  transformers.env.localModelPath = modelDir;
+  transformers.env.cacheDir = modelDir;
+}
+
+const hfEndpoint = getEnvVar("HF_ENDPOINT")?.trim();
+if (hfEndpoint && transformers.env) {
+  transformers.env.remoteHost = hfEndpoint.endsWith("/")
+    ? hfEndpoint
+    : `${hfEndpoint}/`;
+}
+```
+
+- [ ] **Step 4: Verify GREEN**
+
+Run the focused Vitest command from Task 1 and expect all tests in `test/embedding-provider.test.ts` to pass.
+
+## Task 3: Update docs
+
+**Files:**
+- Modify: `README.md`
+- Modify: `.env.example`
+
+- [ ] **Step 1: README embedding provider docs**
+
+Extend the local embedding paragraph to mention:
+- `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` points transformers.js local model lookup and filesystem cache at a prepared directory.
+- `HF_ENDPOINT` sets transformers.js `env.remoteHost` for mirror/proxy setups, but the current local provider still passes `local_files_only: true`.
+- The existing offline/local-file behavior is unchanged: selected models must already be present locally whenever local-only loading is used.
+
+- [ ] **Step 2: Environment example**
+
+Add commented examples near local embedding config:
+
+```dotenv
+# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models  # Optional transformers.js local model/cache directory
+# HF_ENDPOINT=https://hf-mirror.com              # Optional transformers.js remoteHost mirror; local provider remains local_files_only
+```
+
+- [ ] **Step 3: Inspect stale references**
+
+Run:
+
+```bash
+rg -n "AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR|HF_ENDPOINT|local model/cache|transformers.js model cache" README.md .env.example test/embedding-provider.test.ts src/providers/transformers.ts
+```
+
+Expected: references are consistent and no docs claim unimplemented behavior.
+
+## Task 4: Verification and local PR prep
+
+**Files:**
+- All task-owned files above.
+
+- [ ] **Step 1: Run focused verification**
+
+```bash
+corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts
+git diff --check
+```
+
+- [ ] **Step 2: Run security/static checks required by touched surface**
+
+Run Semgrep on changed code/docs/task files:
+
+```bash
+semgrep scan --config p/default --error --metrics=off src/providers/transformers.ts test/embedding-provider.test.ts README.md .env.example docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md
+```
+
+OSV is not required unless dependency, lockfile, container, vendored, or package-manager surfaces change.
+
+- [ ] **Step 3: Run review chain and prepare local commit**
+
+Use passive security review, focused simplification, implementation review, and verification-before-completion before staging. Stage only task-owned files and create a factual commit if all required checks pass.
+
+## Self-Review
+
+- Spec coverage: Issue 798 model configurability is already covered by Issue 917; this plan covers the remaining cache directory and HF mirror gaps plus docs/tests.
+- Placeholder scan: no TBD/TODO placeholders remain.
+- Plan review corrections: tests now target the real `configureTransformersForNode`, implementation uses `getEnvVar`, and HF mirror documentation preserves the existing offline `local_files_only` contract.
+- Type consistency: tests and implementation use `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR`, `HF_ENDPOINT`, `localModelPath`, `cacheDir`, and `remoteHost` consistently.
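
Review note: the behavior that Task 1 tests and Task 2 implements can be sketched in isolation as below. This is a minimal, hypothetical sketch and not the code in `src/providers/transformers.ts`. The names `applyLocalEmbeddingEnv`, `readSetting`, `withTrailingSlash`, `TransformersEnvLike`, and `EnvSource` are assumptions made for illustration, and the environment is passed in as a plain record instead of going through the repo's `getEnvVar()`. It only shows the three rules the plan states: blank values are ignored, the model directory sets both `localModelPath` and `cacheDir`, and `HF_ENDPOINT` gains a trailing slash when it lacks one.

```typescript
// Hypothetical sketch of the plan's configuration rules.
// Assumption: env values arrive as a plain record, not via getEnvVar().

interface TransformersEnvLike {
  localModelPath?: string;
  cacheDir?: string;
  remoteHost?: string;
}

type EnvSource = Record<string, string | undefined>;

// Returns the trimmed value, or undefined when the variable is unset or blank.
function readSetting(source: EnvSource, key: string): string | undefined {
  const value = source[key]?.trim();
  return value ? value : undefined;
}

// Appends a trailing slash only when one is missing.
function withTrailingSlash(url: string): string {
  return url.endsWith("/") ? url : `${url}/`;
}

// Applies the two optional settings to a transformers.js-style env object.
function applyLocalEmbeddingEnv(
  env: TransformersEnvLike,
  source: EnvSource,
): TransformersEnvLike {
  const modelDir = readSetting(source, "AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR");
  if (modelDir) {
    // One directory serves as both the local model lookup path and the cache.
    env.localModelPath = modelDir;
    env.cacheDir = modelDir;
  }

  const hfEndpoint = readSetting(source, "HF_ENDPOINT");
  if (hfEndpoint) {
    env.remoteHost = withTrailingSlash(hfEndpoint);
  }
  return env;
}

// Usage: starts from the defaults used by the fake module in Task 1, Step 2.
const configured = applyLocalEmbeddingEnv(
  {
    localModelPath: "/models/",
    cacheDir: "/cache/",
    remoteHost: "https://huggingface.co/",
  },
  {
    AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR: " /opt/agentmemory-models ",
    HF_ENDPOINT: "https://hf-mirror.com",
  },
);
// configured.localModelPath and configured.cacheDir are "/opt/agentmemory-models";
// configured.remoteHost is "https://hf-mirror.com/".
```

Passing the environment as an argument keeps the sketch free of `process.env` and dotenv handling, which the real tests exercise through `freshTransformersModule()` and a sandboxed `.agentmemory/.env`. Neither setting changes `local_files_only: true`, so this sketch does not enable remote downloads either.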