diff --git a/.env.example b/.env.example
index d7c525f06..9e4c7c8a9 100644
--- a/.env.example
+++ b/.env.example
@@ -73,6 +73,9 @@

 # MAX_TOKENS=4096 # Cap LLM completion tokens for compression / summarise calls
 # AGENTMEMORY_COMPRESS_MODEL=cheap-model # Optional model for provider.compress(); provider.summarize() keeps the provider model
+# AGENTMEMORY_OUTPUT_LANG=match # Optional generated-text language. Empty/unset keeps default prompts unchanged.
+# # Use "match" to follow input/observation language, a known code such as de/ja/pt-BR,
+# # or a custom language name such as Português. Code, file paths, XML tags, and schema names stay verbatim.

 # Outbound LLM / embedding timeout — shared across every raw-fetch provider
 # (Gemini, OpenRouter, MiniMax, OpenAI LLM, and OpenAI/Cohere/Voyage/OpenRouter/Ollama
diff --git a/README.md b/README.md
index bc496e8e9..0abbe19d4 100644
--- a/README.md
+++ b/README.md
@@ -1456,6 +1456,12 @@ OPENAI_MODEL=your-main-model
 AGENTMEMORY_COMPRESS_MODEL=your-cheap-compression-model
 ```

+Set `AGENTMEMORY_OUTPUT_LANG` when generated memory text should be written in a specific language. Empty or unset keeps the default prompts unchanged. Use `match` to follow the input or observation language, a known language code such as `de`, `ja`, or `pt-BR`, or a custom language name such as `Português`. Code, identifiers, file paths, XML tags, and schema names are still preserved verbatim.
+
+```env
+AGENTMEMORY_OUTPUT_LANG=match
+```
+
 Sources: [OpenRouter pricing for Sonnet 4.6](https://openrouter.ai/anthropic/claude-sonnet-4.6/pricing), [DeepSeek V4 Pro](https://openrouter.ai/deepseek/deepseek-v4-pro), [DeepSeek pricing notes](https://api-docs.deepseek.com/quick_start/pricing/).
 ### Multi-agent memory (`AGENTMEMORY_AGENT_ID` / `AGENT_ID` + `AGENTMEMORY_AGENT_SCOPE`)
diff --git a/docs/todos/2026-06-18-issue-498-output-lang/plan.md b/docs/todos/2026-06-18-issue-498-output-lang/plan.md
new file mode 100644
index 000000000..e69f201db
--- /dev/null
+++ b/docs/todos/2026-06-18-issue-498-output-lang/plan.md
@@ -0,0 +1,65 @@
+# Issue 498 Output Language Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Add opt-in `AGENTMEMORY_OUTPUT_LANG` control for generated memory text.
+
+**Architecture:** Keep the change centralized in `ResilientProvider`, because all configured providers created by `createProvider()` / `createFallbackProvider()` are wrapped there. Add a small prompt helper that returns an empty string for unset/blank config so default prompts are unchanged, and append the directive to `compress()` and `summarize()` system prompts only when configured.
+
+**Tech Stack:** TypeScript ESM, vitest, pnpm.
+
+---
+
+## Files
+
+- Create: `src/prompts/output-language.ts` for env parsing and directive generation.
+- Create: `test/output-language.test.ts` for helper behavior.
+- Create or modify: `test/resilient-provider.test.ts` for prompt forwarding through `ResilientProvider`.
+- Modify: `src/providers/resilient.ts` to append the directive for `compress()` and `summarize()`.
+- Modify: `.env.example` and `README.md` to document `AGENTMEMORY_OUTPUT_LANG`.
+- Modify: `docs/todos/2026-06-18-issue-498-output-lang/todo.md` with evidence and review notes.
+
+## Task 1: Output Language Helper
+
+- [x] Add `src/prompts/output-language.ts` exporting `outputLanguageDirective(value: string | undefined): string`.
+- [x] Implement behavior:
+  - accept the caller-resolved `AGENTMEMORY_OUTPUT_LANG` value.
+  - return `""` for unset or whitespace-only.
+  - trim configured values.
+  - map `match` case-insensitively to "match the language of the user's input or observation".
+  - map common ISO-style language codes case-insensitively to readable language names: `en`, `zh`, `zh-cn`, `zh-tw`, `ja`, `ko`, `de`, `fr`, `es`, `pt`, `pt-br`, `it`, `ru`, `hi`, `tr`, `ar`, `nl`, `pl`, `uk`.
+  - use any other value verbatim as the target language name.
+  - include a preservation sentence for code, identifiers, file paths, XML tags, and schema names.
+- [x] Add tests in `test/output-language.test.ts` for unset, blank, `match`, uppercase known code, regional code, unknown verbatim value, and preservation text.
+- [x] Run `corepack pnpm exec vitest run test/output-language.test.ts`.
+
+## Task 2: Resilient Provider Integration
+
+- [x] Modify `src/providers/resilient.ts` to import `outputLanguageDirective`.
+- [x] Add a private method that appends the directive to the system prompt only when the helper returns non-empty text.
+- [x] Keep `compress()` and `summarize()` default forwarding byte-for-byte identical when the env is unset.
+- [x] Add `test/resilient-provider.test.ts` covering:
+  - unset env forwards `systemPrompt` unchanged for `compress()`.
+  - unset env forwards `systemPrompt` unchanged for `summarize()`.
+  - configured env appends the directive for `compress()`.
+  - configured env appends the directive for `summarize()`.
+  - user prompt is not modified.
+- [x] Run `corepack pnpm exec vitest run test/output-language.test.ts test/resilient-provider.test.ts`.
+
+## Task 3: Documentation
+
+- [x] Add `.env.example` entry near provider/model generation options:
+  - `# AGENTMEMORY_OUTPUT_LANG=match`
+  - explain unset/empty default, `match`, known language codes, and custom names.
+- [x] Add README configuration text near `AGENTMEMORY_COMPRESS_MODEL`, because this is a generated-text/provider prompt setting.
+- [x] Search README and `.env.example` for stale/conflicting references.
+
+## Task 4: Review, Verification, And PR Prep
+
+- [x] Update `todo.md` with final explorer result, assumptions, and verification evidence.
+- [x] Run targeted tests.
+- [x] Run `corepack pnpm test` if dependencies are available and pnpm is not blocked.
+- [x] Run required security gates for code/config/doc changes: Semgrep; after staging, Gitleaks staged scan. OSV is not required unless dependency or lockfile surfaces changed.
+- [x] Do focused simplification pass over touched code.
+- [x] Commit task-owned files only.
+- [ ] Push branch to `origin`, create PR against `origin/main`, wait for checks, merge if checks pass, and update/close issue #498 with evidence.
diff --git a/docs/todos/2026-06-18-issue-498-output-lang/todo.md b/docs/todos/2026-06-18-issue-498-output-lang/todo.md
new file mode 100644
index 000000000..0ff24a004
--- /dev/null
+++ b/docs/todos/2026-06-18-issue-498-output-lang/todo.md
@@ -0,0 +1,81 @@
+# Issue 498 Output Language
+
+## Scope
+
+- Repository: `/Users/A1538552/.codex/worktrees/5da3/agentmemory`
+- Branch: `issue/498-output-lang`
+- Issue: GitHub issue #498, upstream PR 711 mirror
+- Target: `origin/main` only
+
+## Sprint Contract
+
+- Goal: add an opt-in `AGENTMEMORY_OUTPUT_LANG` setting that guides LLM-generated memory text without changing default behavior.
+- Scope: generated text routed through `MemoryProvider.compress()` and `MemoryProvider.summarize()`, docs for the new environment variable, and focused tests.
+- Non-goals: no provider rewrites, no schema or persisted-data migration, no MCP/REST surface change, no automatic language detection outside the LLM instruction, no upstream PR target.
+- Acceptance criteria:
+  - unset or blank `AGENTMEMORY_OUTPUT_LANG` leaves system prompts byte-for-byte unchanged.
+  - `match` asks generated text to match the user's input/observation language.
+  - known language codes map case-insensitively to readable names.
+  - unknown non-empty values are used verbatim as the language name.
+  - code, identifiers, file paths, XML tags, and schema names must be preserved verbatim.
+- Intended verification:
+  - targeted vitest coverage for output-language helper and `ResilientProvider`.
+  - targeted docs/search checks.
+  - project-native `corepack pnpm test` if dependencies are materialized; otherwise closest targeted checks with blocker recorded.
+  - required security gates before final handoff/commit.
+- Known boundaries: environment-only behavior; generated prompt content changes only when opt-in variable is set.
+- Stop conditions: evidence shows the feature already exists, implementation would require broad LLM architecture changes, or verification/security gates produce unresolved blocking findings.
+
+## Feature / Verification Matrix
+
+| Change | Verification method | Status | Evidence |
+| --- | --- | --- | --- |
+| Validate issue and generated-text surfaces | Local grep/code inspection plus read-only explorer | Done | Explorer found no runtime/docs implementation outside this task; primary generated surfaces route through provider `compress()` / `summarize()` |
+| Add env-to-directive helper | Unit tests for unset, blank, `match`, known code, unknown string | Done | `corepack pnpm exec vitest run test/output-language.test.ts test/resilient-provider.test.ts` passed: 2 files, 11 tests |
+| Apply directive centrally to provider prompts | `ResilientProvider` unit tests for compress/summarize prompt forwarding and default unchanged behavior | Done | Same targeted vitest run passed; default unset prompts remain unchanged |
+| Document env variable | `.env.example` and README search/readback | Done | README and `.env.example` updated; `corepack pnpm run skills:gen` updated generated config reference to 70 env vars |
+| Final verification and GitHub workflow | Targeted tests, full tests/security gates as available, commit/push/PR/merge | In progress | Local verification, final review, two base merges, staged Gitleaks, branch push, PR creation, and
initial remote checks passed; final merge pending |
+
+## Subagent Ledger
+
+| Workstream | Scope | Edits allowed | Expected output | Result | Residual risk |
+| --- | --- | --- | --- | --- | --- |
+| Read-only explorer | `src/`, `test/`, README/docs for generated-text and env language support | No | Determine whether feature exists, affected generated-text surfaces, minimal implementation recommendation | Done: issue valid; no existing `AGENTMEMORY_OUTPUT_LANG`; central `ResilientProvider` implementation recommended | Context labels/UI/i18n and `memory_compress_file` translation are out of scope |
+
+## Progress
+
+- 2026-06-18: Read AGENTS.md, `github-feature-loop`, `writing-plans`, `review-and-implement`, `subagent-driven-development`, `simple-code`, and `verification-before-completion`.
+- 2026-06-18: Confirmed worktree was detached at `a029b7e`; created branch `issue/498-output-lang` from current `origin/main` tracking state.
+- 2026-06-18: Fetched issue #498 with `gh issue view`; issue is open and requests opt-in `AGENTMEMORY_OUTPUT_LANG`.
+- 2026-06-18: Local search found no existing `AGENTMEMORY_OUTPUT_LANG` implementation.
+- 2026-06-18: Explorer independently validated issue as runtime-valid and recommended central `ResilientProvider` prompt injection.
+- 2026-06-18: Added `src/prompts/output-language.ts`, wired `src/providers/resilient.ts`, added focused tests, and documented the env var in `.env.example`, README, and generated config skill reference.
+- 2026-06-18: Initial targeted test command was blocked by pnpm ignored-build hardening during implicit install. Followed repo instruction with `corepack pnpm install --frozen-lockfile --ignore-scripts`, then reran targeted tests successfully.
+- 2026-06-18: `corepack pnpm exec vitest run test/output-language.test.ts test/resilient-provider.test.ts` passed with 2 test files and 11 tests.
+- 2026-06-18: `corepack pnpm test` passed with 200 test files and 2778 tests.
+- 2026-06-18: `corepack pnpm run lint` passed.
+- 2026-06-18: `corepack pnpm run build` passed. Build emitted existing plugin timing / dynamic import warnings only.
+- 2026-06-18: `corepack pnpm run skills:check` passed after regenerating `plugin/skills/agentmemory-config/REFERENCE.md`.
+- 2026-06-18: `semgrep scan --config p/default --error --metrics=off .` passed: 0 findings, 544 rules, 867 tracked targets.
+- 2026-06-18: `git diff --check` passed.
+- 2026-06-18: Final security reviewer returned ACCEPT.
+- 2026-06-18: Final test reviewer found a valid Medium test gap for configured `summarize()` prompt composition; fixed by asserting the exact appended prompt. Re-review returned ACCEPT.
+- 2026-06-18: Final maintainability reviewer found a valid Medium config-loading gap; fixed by resolving `AGENTMEMORY_OUTPUT_LANG` through `getEnvVar()` in `ResilientProvider` while keeping the helper pure. Re-review returned ACCEPT.
+- 2026-06-18: Post-review `corepack pnpm exec vitest run test/output-language.test.ts test/resilient-provider.test.ts` passed with 2 files and 11 tests.
+- 2026-06-18: Post-review `corepack pnpm run lint`, `corepack pnpm run build`, and `corepack pnpm test` passed. Full test suite remained at 200 files and 2778 tests. Build emitted existing plugin timing / dynamic import warnings only.
+- 2026-06-18: Post-review `semgrep scan --config p/default --error --metrics=off .` passed: 0 findings, 544 rules, 867 tracked targets.
+- 2026-06-18: Staged task-owned files only. `git diff --cached --check` passed. `gitleaks protect --staged --redact` scanned about 17.86 KB and found no leaks.
+- 2026-06-18: Created commit `e8c4b3c4` (`feat: add output language control`).
+- 2026-06-18: Fetched `origin`; `origin/main` had advanced to `ee72dba7`. Merged `origin/main` into `issue/498-output-lang`, producing merge commit `aebeda3b`.
+- 2026-06-18: Post-merge targeted tests passed: `corepack pnpm exec vitest run test/output-language.test.ts test/resilient-provider.test.ts` with 2 files and 11 tests.
+- 2026-06-18: Post-merge `corepack pnpm run lint`, `corepack pnpm run build`, `corepack pnpm run skills:check`, and `semgrep scan --config p/default --error --metrics=off .` passed. Semgrep scanned 869 tracked targets with 0 findings.
+- 2026-06-18: A post-merge full-suite run executed concurrently with build/semgrep/lint and timed out in six unrelated tests. Rerunning `corepack pnpm test` alone passed with 201 test files and 2779 tests.
+- 2026-06-18: Staged the post-merge verification record only. `gitleaks protect --staged --redact` scanned about 1.08 KB and found no leaks. Created commit `a75a0b63` (`docs: record output language verification`).
+- 2026-06-18: `gitleaks detect --source . --redact --log-opts=origin/main..HEAD` scanned the branch range and found no leaks.
+- 2026-06-18: Pushed `issue/498-output-lang` to `origin` and opened PR #1009 against `main`.
+- 2026-06-18: PR #1009 initial GitHub checks passed: `test (ubuntu-latest, 22)` and `test (macos-latest, 22)`.
+- 2026-06-18: Initial PR merge attempt was blocked because branch protection reported required checks expected while PR status had become `BEHIND`.
+- 2026-06-18: Fetched `origin` and merged the latest `origin/main` into `issue/498-output-lang` again, producing a second base-integration merge. The merge brought in unrelated main changes for issue #494 docs and `website/components/Features.tsx`.
+- 2026-06-18: After the second base merge, targeted tests passed: `corepack pnpm exec vitest run test/output-language.test.ts test/resilient-provider.test.ts` with 2 files and 11 tests.
+- 2026-06-18: After the second base merge, `corepack pnpm test` passed with 201 test files and 2779 tests.
+- 2026-06-18: After the second base merge, `corepack pnpm run lint` and `corepack pnpm run build` passed.
Build emitted existing plugin timing / dynamic import warnings only.
diff --git a/plugin/skills/agentmemory-config/REFERENCE.md b/plugin/skills/agentmemory-config/REFERENCE.md
index 60a45c057..ccb176686 100644
--- a/plugin/skills/agentmemory-config/REFERENCE.md
+++ b/plugin/skills/agentmemory-config/REFERENCE.md
@@ -3,7 +3,7 @@
 Generated by scanning `src/` for `AGENTMEMORY_*` usage. Do not edit the block below by hand; run `corepack pnpm run skills:gen` after adding or removing a variable. Internal markers ending in two underscores are excluded.
-Configuration is read from the environment and from `~/.agentmemory/.env` (no `export` prefix). 69 recognized variables:
+Configuration is read from the environment and from `~/.agentmemory/.env` (no `export` prefix). 70 recognized variables:
 - `AGENTMEMORY_AGENT_ID`
 - `AGENTMEMORY_AGENT_SCOPE`
@@ -50,6 +50,7 @@ Configuration is read from the environment and from `~/.agentmemory/.env` (no `e
 - `AGENTMEMORY_LIVEZ_TIMEOUT_MS`
 - `AGENTMEMORY_LLM_TIMEOUT_MS`
 - `AGENTMEMORY_MCP_BLOCK`
+- `AGENTMEMORY_OUTPUT_LANG`
 - `AGENTMEMORY_PREFER_CODEX_SDK`
 - `AGENTMEMORY_PROBE_TIMEOUT_MS`
 - `AGENTMEMORY_PROJECT_ID`
diff --git a/src/prompts/output-language.ts b/src/prompts/output-language.ts
new file mode 100644
index 000000000..91ef0cdea
--- /dev/null
+++ b/src/prompts/output-language.ts
@@ -0,0 +1,38 @@
+const LANGUAGE_NAMES: Record<string, string> = {
+  en: "English",
+  zh: "Chinese",
+  "zh-cn": "Simplified Chinese",
+  "zh-tw": "Traditional Chinese",
+  ja: "Japanese",
+  ko: "Korean",
+  de: "German",
+  fr: "French",
+  es: "Spanish",
+  pt: "Portuguese",
+  "pt-br": "Brazilian Portuguese",
+  it: "Italian",
+  ru: "Russian",
+  hi: "Hindi",
+  tr: "Turkish",
+  ar: "Arabic",
+  nl: "Dutch",
+  pl: "Polish",
+  uk: "Ukrainian",
+};
+
+export function outputLanguageDirective(value: string | undefined): string {
+  const configured = value?.trim();
+  if (!configured) return "";
+
+  const normalized = configured.toLowerCase();
+  const languageInstruction =
+    normalized
=== "match"
+      ? "Match the language of the user's input or observation."
+      : `Generate human-readable text in ${LANGUAGE_NAMES[normalized] ?? configured}.`;
+
+  return [
+    "Output language:",
+    `- ${languageInstruction}`,
+    "- Preserve code, identifiers, file paths, XML tags, and schema names verbatim.",
+  ].join("\n");
+}
diff --git a/src/providers/resilient.ts b/src/providers/resilient.ts
index 95ece40c9..91905ab12 100644
--- a/src/providers/resilient.ts
+++ b/src/providers/resilient.ts
@@ -1,4 +1,6 @@
 import type { MemoryProvider, CircuitBreakerState } from "../types.js";
+import { getEnvVar } from "../config.js";
+import { outputLanguageDirective } from "../prompts/output-language.js";
 import { CircuitBreaker } from "./circuit-breaker.js";

 export class ResilientProvider implements MemoryProvider {
@@ -23,12 +25,23 @@ export class ResilientProvider implements MemoryProvider {
     }
   }

+  private withOutputLanguage(systemPrompt: string): string {
+    const directive = outputLanguageDirective(
+      getEnvVar("AGENTMEMORY_OUTPUT_LANG"),
+    );
+    return directive ?
`${systemPrompt}\n\n${directive}` : systemPrompt;
+  }
+
   async compress(systemPrompt: string, userPrompt: string): Promise<string> {
-    return this.call(() => this.inner.compress(systemPrompt, userPrompt));
+    return this.call(() =>
+      this.inner.compress(this.withOutputLanguage(systemPrompt), userPrompt),
+    );
   }

   async summarize(systemPrompt: string, userPrompt: string): Promise<string> {
-    return this.call(() => this.inner.summarize(systemPrompt, userPrompt));
+    return this.call(() =>
+      this.inner.summarize(this.withOutputLanguage(systemPrompt), userPrompt),
+    );
   }

   get circuitState(): CircuitBreakerState {
diff --git a/test/output-language.test.ts b/test/output-language.test.ts
new file mode 100644
index 000000000..4a73a7ec1
--- /dev/null
+++ b/test/output-language.test.ts
@@ -0,0 +1,37 @@
+import { describe, expect, it } from "vitest";
+
+import { outputLanguageDirective } from "../src/prompts/output-language.js";
+
+describe("outputLanguageDirective", () => {
+  it("returns an empty directive when AGENTMEMORY_OUTPUT_LANG is unset", () => {
+    expect(outputLanguageDirective(undefined)).toBe("");
+  });
+
+  it("returns an empty directive when AGENTMEMORY_OUTPUT_LANG is blank", () => {
+    expect(outputLanguageDirective(" ")).toBe("");
+  });
+
+  it("asks the model to match the input language for match", () => {
+    expect(outputLanguageDirective("match")).toContain(
+      "Match the language of the user's input or observation.",
+    );
+  });
+
+  it("maps known language codes case-insensitively", () => {
+    expect(outputLanguageDirective("DE")).toContain("in German");
+  });
+
+  it("maps regional language codes", () => {
+    expect(outputLanguageDirective("pt-BR")).toContain("in Brazilian Portuguese");
+  });
+
+  it("uses unknown non-empty values verbatim", () => {
+    expect(outputLanguageDirective("Português")).toContain("in Português");
+  });
+
+  it("preserves structured and code-like output tokens", () => {
+    expect(outputLanguageDirective("ja")).toContain(
+      "Preserve code, identifiers, file paths,
XML tags, and schema names verbatim.",
+    );
+  });
+});
diff --git a/test/resilient-provider.test.ts b/test/resilient-provider.test.ts
new file mode 100644
index 000000000..231b03e4c
--- /dev/null
+++ b/test/resilient-provider.test.ts
@@ -0,0 +1,109 @@
+import { afterEach, describe, expect, it } from "vitest";
+
+import { ResilientProvider } from "../src/providers/resilient.js";
+import type { MemoryProvider } from "../src/types.js";
+
+function restoreOutputLang(value: string | undefined): void {
+  if (value === undefined) {
+    delete process.env["AGENTMEMORY_OUTPUT_LANG"];
+  } else {
+    process.env["AGENTMEMORY_OUTPUT_LANG"] = value;
+  }
+}
+
+function makeCapturingProvider(): MemoryProvider & {
+  calls: Array<{ method: "compress" | "summarize"; system: string; user: string }>;
+} {
+  const calls: Array<{
+    method: "compress" | "summarize";
+    system: string;
+    user: string;
+  }> = [];
+
+  return {
+    name: "capture",
+    calls,
+    async compress(system: string, user: string): Promise<string> {
+      calls.push({ method: "compress", system, user });
+      return "compressed";
+    },
+    async summarize(system: string, user: string): Promise<string> {
+      calls.push({ method: "summarize", system, user });
+      return "summarized";
+    },
+  };
+}
+
+describe("ResilientProvider output language", () => {
+  const savedOutputLang = process.env["AGENTMEMORY_OUTPUT_LANG"];
+
+  afterEach(() => {
+    restoreOutputLang(savedOutputLang);
+  });
+
+  it("forwards compress system prompts unchanged when output language is blank", async () => {
+    process.env["AGENTMEMORY_OUTPUT_LANG"] = "";
+    const inner = makeCapturingProvider();
+    const provider = new ResilientProvider(inner);
+
+    await provider.compress("system prompt", "user prompt");
+
+    expect(inner.calls[0]).toEqual({
+      method: "compress",
+      system: "system prompt",
+      user: "user prompt",
+    });
+  });
+
+  it("forwards summarize system prompts unchanged when output language is blank", async () => {
+    process.env["AGENTMEMORY_OUTPUT_LANG"] = "";
+    const inner =
makeCapturingProvider();
+    const provider = new ResilientProvider(inner);
+
+    await provider.summarize("system prompt", "user prompt");
+
+    expect(inner.calls[0]).toEqual({
+      method: "summarize",
+      system: "system prompt",
+      user: "user prompt",
+    });
+  });
+
+  it("appends the output language directive to compress system prompts", async () => {
+    process.env["AGENTMEMORY_OUTPUT_LANG"] = "de";
+    const inner = makeCapturingProvider();
+    const provider = new ResilientProvider(inner);
+
+    await provider.compress("system prompt", "user prompt");
+
+    expect(inner.calls[0].system).toBe(
+      [
+        "system prompt",
+        "",
+        "Output language:",
+        "- Generate human-readable text in German.",
+        "- Preserve code, identifiers, file paths, XML tags, and schema names verbatim.",
+      ].join("\n"),
+    );
+    expect(inner.calls[0].user).toBe("user prompt");
+  });
+
+  it("appends the output language directive to summarize system prompts", async () => {
+    process.env["AGENTMEMORY_OUTPUT_LANG"] = "match";
+    const inner = makeCapturingProvider();
+    const provider = new ResilientProvider(inner);
+
+    await provider.summarize("system prompt", "user prompt");
+
+    expect(inner.calls[0].system).toBe(
+      [
+        "system prompt",
+        "",
+        "Output language:",
+        "- Match the language of the user's input or observation.",
+        "- Preserve code, identifiers, file paths, XML tags, and schema names verbatim.",
+      ].join("\n"),
+    );
+    expect(inner.calls[0].user).toBe("user prompt");
+  });
+});