14 changes: 11 additions & 3 deletions .env.example
@@ -28,9 +28,17 @@ LLM_PROVIDER=
# ── Agent config ─────────────────────────────────────────────────────────────
AGENT_TIMEZONE=America/Chicago

# ── Optional: Bluesky credentials for post-to-site ───────────────────────────
# Leave empty to disable Bluesky posting end-to-end (post-to-site will respond
# "Credential '…' is not configured"). Use a Bluesky app password from
# ── Optional: Foragent embeddings (for skill + memory semantic retrieval) ────
# Separate from FORAGENT_LLM_* so embeddings can live on a different Azure
# Foundry subscription or deployment. If any are left empty, Foragent falls
# back to BM25-only retrieval (logged once at startup).
FORAGENT_EMBEDDINGS_ENDPOINT=
FORAGENT_EMBEDDINGS_MODEL_ID=text-embedding-3-small
FORAGENT_EMBEDDINGS_API_KEY=

# ── Optional: Bluesky credentials ────────────────────────────────────────────
# The generalist browser-task can use credentials by id (once a credentialed
# login tool lands in a later step). Use a Bluesky app password from
# https://bsky.app/settings/app-passwords, never your account password.
# Stored only in-process by Foragent's InMemoryCredentialBroker — never logged,
# never sent over A2A. For prod, swap in a k8s-secrets or vault broker.
27 changes: 22 additions & 5 deletions CLAUDE.md
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

## Status

Foragent is at **milestone 6 shipped, step 7 next**. Four capabilities are live (`browser-task`, `fetch-page-title`, `extract-structured-data`, `post-to-site`); the A2A loop is wired end-to-end against RockBot via the `docker-compose.yml` harness pinned to `rockylhotka/rockbot-agent:0.8.5`. Step 6 shipped the generalist `browser-task` planner (LLM-in-the-loop over ref-annotated aria snapshots + `aria-ref=eN` locator resolution, built on `Microsoft.Playwright` 1.59 — bumped from 1.50 for the Ai aria-snapshot mode; see Appendix A #16). Tiered chat clients are wired via `AddRockBotTieredChatClients` with one model aliased across Low/Balanced/High per spec §3.7. The governing spec is `docs/foragent-specification.md` **v0.2**. Step 7 wires `ISkillStore` + `ILongTermMemory` priming; `post-to-site` is removed from the advertised skill list once `browser-task` + the learned bsky skill cover it. Storage-state persistence, 2FA input-required flow, k8s-secrets broker, and per-tenant credential namespaces remain deferred — tracked in `docs/framework-feedback.md`. Framework-level observations from each milestone are captured in `docs/framework-feedback.md`.
Foragent is at **milestone 7 shipped, step 8 next**. Three capabilities are advertised (`browser-task`, `fetch-page-title`, `extract-structured-data`); the A2A loop is wired end-to-end against RockBot via the `docker-compose.yml` harness pinned to `rockylhotka/rockbot-agent:0.8.5`. Step 6 shipped the generalist `browser-task` planner (LLM-in-the-loop over ref-annotated aria snapshots + `aria-ref=eN` locator resolution, built on `Microsoft.Playwright` 1.59 — bumped from 1.50 for the Ai aria-snapshot mode; see Appendix A #16). Tiered chat clients are wired via `AddRockBotTieredChatClients` with one model aliased across Low/Balanced/High per spec §3.7. Step 7 wired the learning substrate: `ISkillStore` + `ILongTermMemory` via `WithSkills()` + `WithLongTermMemory()`, `BrowserTaskPriming` injects retrieved skill + memory content into the planner prompt, successful tasks write a learned skill at `sites/{host}/learned/{slug}`, and `BskySeedSkillService` seeds `sites/bsky.app/login` on first start (idempotent — only writes when absent). Embeddings are optional and configured separately under `ForagentEmbeddings` so they can live on a different Azure Foundry subscription than the chat model; missing embeddings downgrade retrieval to BM25-only with a single startup warning. The step-6 unaided benchmark (3/3) still passes after the priming wiring. `post-to-site` has been removed from both the advertised skill list and the codebase (greenfield deletion — `browser-task` + the learned bsky skill cover the use case). The governing spec is `docs/foragent-specification.md` **v0.2**. Storage-state persistence, 2FA input-required flow, k8s-secrets broker, and per-tenant credential namespaces remain deferred — tracked in `docs/framework-feedback.md`. Framework-level observations from each milestone are captured in `docs/framework-feedback.md`.

## Build / test

@@ -47,7 +47,7 @@ Four library/host projects with a strict layering:

```
Foragent.Agent (executable, A2A server host, DI composition root)
├─ Foragent.Capabilities (task-level verbs: fetch-page-content, post-to-site, …)
├─ Foragent.Capabilities (task-level verbs: browser-task, fetch-page-title, …)
│ └─ Foragent.Browser (Playwright wrapper; owns browser + per-task BrowserContext)
└─ Foragent.Credentials (ICredentialBroker abstraction + built-in brokers)
```
@@ -68,14 +68,15 @@ Key framework pieces Foragent uses today:
- `RockBot.Host.AddRockBotHost` + `AgentHostBuilder.AddA2A` — bus-side agent registration. Subscribes to `agent.task.{agentName}` on RabbitMQ.
- `RockBot.A2A.IAgentTaskHandler` — the single per-agent extension point. `ForagentTaskHandler` (in `Foragent.Capabilities`) implements this and dispatches on `request.Skill`.
- `RockBot.A2A.Gateway.AddA2AHttpGateway` + `MapA2AHttpGateway` — the in-process HTTP surface. Published as NuGet in RockBot 0.8.4 (see `docs/framework-feedback.md`).
- `RockBot.Host.AgentMemoryExtensions.WithSkills` / `WithLongTermMemory` — file-backed `ISkillStore` + `ILongTermMemory` (step 7). `ISkillStore.SearchAsync` takes an explicit `float[]? queryEmbedding`; callers compute the embedding. `Skill` record is lean (`Name, Summary, Content, CreatedAt, UpdatedAt?, LastUsedAt?, SeeAlso`) — no tags or importance field.

Foragent requires an LLM. Config lives under `ForagentLlm` — separate from any rockbot-side `LLM` config so the two agents can point at different models. Program.cs fails fast at startup if `ForagentLlm:Endpoint`/`ModelId`/`ApiKey` are missing. Starting step 6 the single configured model is wired via `AddRockBotTieredChatClients(low, balanced, high)` aliased to the same inner `IChatClient`; that one call registers both `IChatClient` (wrapped with `RockBotFunctionInvokingChatClient` for automatic tool invocation) and `TieredChatClientRegistry` (per spec §3.7). Don't also call `AddRockBotChatClient` — it would swap out the wrapped registration. Capabilities that want to escalate/de-escalate per request can resolve `TieredChatClientRegistry` and call `GetClient(ModelTier.Low|Balanced|High)`; none do today.

## Browser

`Foragent.Browser` wraps Playwright. `AddForagentBrowser()` in `Foragent.Agent/Program.cs` registers `PlaywrightBrowserHost` (`IHostedService` owning one shared Chromium per process) and `IBrowserSessionFactory` (hands out a fresh `IBrowserContext` per A2A task — isolation guarantee from spec §3.5). `IBrowserSession` exposes `FetchPageTitleAsync` / `CapturePageSnapshotAsync` for one-shot reads, `OpenPageAsync` → `IBrowserPage` (navigate / fill / click / wait / read) for multi-step flows like login + post, and `OpenAgentPageAsync` → `IBrowserAgentPage` for LLM-in-the-loop planners (ref-annotated aria snapshots + `aria-ref=eN` locator resolution). The snapshot uses Chromium's aria-snapshot (via `Locator.AriaSnapshotAsync`; `Mode = AriaSnapshotMode.Ai` gets the ref-annotated form) and falls back to `<body>` inner text when the tree is empty. Selectors passed to `IBrowserPage` use Playwright's string-selector dialect (CSS + `role=role[name="..."]`); **regex is not accepted in string form**, use exact attribute matches. `Foragent.Browser` has `InternalsVisibleTo("Foragent.Browser.Tests")` so tests drive the real `PlaywrightBrowserSessionFactory` without promoting its implementation types to public.

`CreateSessionAsync(Func<Uri,bool> allowedHost, ...)` is the step-6 entry point for allowlist-scoped sessions. The factory installs a context-wide `RouteAsync("**/*", ...)` that aborts off-list document/subframe navigations before Playwright issues the request (spec §7.1). The no-argument overload accepts any host and stays available for specialists that enforce narrower rules elsewhere (e.g. `post-to-site` where the site id selects the host).
`CreateSessionAsync(Func<Uri,bool> allowedHost, ...)` is the step-6 entry point for allowlist-scoped sessions. The factory installs a context-wide `RouteAsync("**/*", ...)` that aborts off-list document/subframe navigations before Playwright issues the request (spec §7.1). The no-argument overload accepts any host and stays available for specialists that enforce narrower rules elsewhere.
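
To make the allowlist rule concrete, here is a minimal Python sketch of the decision the route handler has to make (the project itself is C#; this is an illustration, not the shipped code). `make_allowed_host` and `should_abort` are hypothetical names, and the exact matching rule (case-insensitive exact host match) is an assumption: the text above only says the predicate receives a URI and that off-list document/subframe navigations are aborted.

```python
from typing import Callable, Iterable
from urllib.parse import urlsplit


def make_allowed_host(hosts: Iterable[str]) -> Callable[[str], bool]:
    """Build a predicate that accepts only URLs whose host is on the list."""
    allowed = {h.lower() for h in hosts}

    def allowed_host(url: str) -> bool:
        host = urlsplit(url).hostname  # lower-cased by urllib; None if absent
        return host is not None and host in allowed

    return allowed_host


def should_abort(is_navigation: bool, url: str,
                 allowed_host: Callable[[str], bool]) -> bool:
    """Abort only navigations (top-level or subframe) to off-list hosts.

    Assumption: non-navigation sub-resources (scripts, images) pass through.
    """
    return is_navigation and not allowed_host(url)
```

Under these assumptions, a session scoped to `["bsky.app"]` would abort a navigation to any other host while leaving ordinary sub-resource requests untouched.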

## Capabilities

@@ -84,10 +85,26 @@ Foragent requires an LLM. Config lives under `ForagentLlm` — separate from any
- Each capability implements `ICapability` — owns its own `AgentSkill` metadata (exposed as a static `SkillDefinition`) and its own `ExecuteAsync` logic.
- `ForagentTaskHandler` is a pure dispatcher that resolves `IEnumerable<ICapability>` from DI and routes on `SkillId`. **Do not add skill-specific logic to the handler.** New capabilities go in new `ICapability` classes.
- `ForagentCapabilities.Skills` (static array) is the single source of truth for advertised skills — both the bus-side `AgentCard.Skills` and the HTTP gateway's `opts.Skills` read from it.
- `CapabilityInput.Parse` is the shared URL + description shim used by `fetch-page-title` and `extract-structured-data`. Capabilities with different input shapes (e.g. `post-to-site` needing `site` / `credentialId` / `content`) parse their own input near the capability — see `PostToSiteInput` in `PostToSiteCapability.cs`. Don't overload `CapabilityInput` for unrelated shapes.
- `post-to-site` dispatches to an `ISitePoster` keyed on `Site` (in `SitePosting/`). `BlueskySitePoster` is the only implementation today; add new sites by registering another `ISitePoster` in `AddForagentCapabilities()`. The capability never echoes exception messages from posters back to callers — they may contain credential material; operators read the full exception in logs.
- `CapabilityInput.Parse` is the shared URL + description shim used by `fetch-page-title` and `extract-structured-data`. Capabilities with different input shapes parse their own input near the capability (e.g. `BrowserTaskInput` in `BrowserTask/`). Don't overload `CapabilityInput` for unrelated shapes.
- `browser-task` (in `BrowserTask/`) is the generalist planner (spec §5.2). `BrowserTaskInput` parses intent + mandatory `allowedHosts` + optional `url` / `credentialId` / `maxSteps` (default 60, ceiling 150) / `maxSeconds` (default 120, ceiling 600). `BrowserTaskTools` wraps `snapshot` / `navigate` / `click` / `type` / `wait_for` / `done` / `fail` as `AIFunction`s via `AIFunctionFactory.Create` and passes them in `ChatOptions.Tools`; the RockBot-wrapped function-invoking `IChatClient` runs the full model ↔ tool loop inside one `GetResponseAsync` call. Budget is enforced tool-side (each tool checks `BrowserTaskState.BudgetExhausted`) because Microsoft.Extensions.AI does not surface per-request iteration caps through `ChatOptions`; wall-clock is a linked `CancellationTokenSource`. **Never log tool arguments verbatim** — `type` carries user-supplied values that may be sensitive (log length only). Refs from a snapshot are valid only until the next mutating call; the system prompt and tool descriptions both state this, but don't code anything that assumes cross-snapshot ref stability.
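
The budget rules in the `browser-task` bullet can be sketched as follows. This is a hedged Python illustration, not the C# implementation: `resolve_budget`, `TaskState`, and `run_tool` are hypothetical names, and clamping out-of-range requests (rather than rejecting them) is an assumption. Only the defaults (60 steps, 120 seconds), the ceilings (150, 600), and the tool-side check come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_STEPS_DEFAULT, MAX_STEPS_CEILING = 60, 150
MAX_SECONDS_DEFAULT, MAX_SECONDS_CEILING = 120, 600


def resolve_budget(requested: Optional[int], default: int, ceiling: int) -> int:
    """Use the default when unset; otherwise clamp into [1, ceiling] (assumed)."""
    if requested is None:
        return default
    return max(1, min(requested, ceiling))


@dataclass
class TaskState:
    max_steps: int
    steps_used: int = 0

    @property
    def budget_exhausted(self) -> bool:
        return self.steps_used >= self.max_steps


def run_tool(state: TaskState, action: Callable[[], str]) -> str:
    """Tool-side enforcement: every tool checks the budget before acting."""
    if state.budget_exhausted:
        return "budget exhausted"
    state.steps_used += 1
    return action()
```

Because the check lives inside each tool, the model can keep asking for tool calls, but none of them performs any browser work once the budget is spent.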

## Learning substrate (step 7)

Two RockBot framework stores are wired into the host via `AgentHostBuilder`:

- `ISkillStore` (`agent.WithSkills(opts => opts.BasePath = …)`) — file-backed skill store for markdown site primers. Content root defaults to `ForagentMemory:SkillsPath` or `data/skills`.
- `ILongTermMemory` (`agent.WithLongTermMemory(opts => opts.BasePath = …)`) — file-backed memory for declarative observations. `ForagentMemory:MemoryPath` or `data/memory`.

`BrowserTaskPriming` (DI-scoped) runs before each `browser-task` planner call: it derives a query from intent + primary allowlist host, optionally computes an embedding via `IEmbeddingGenerator<string, Embedding<float>>`, and calls `ISkillStore.SearchAsync` + `ILongTermMemory.SearchAsync` in parallel. Retrieved content is injected as a "Known site knowledge" section in the user prompt. Fail-soft: either store throwing is logged at debug and skipped, so a broken priming path never fails a task.
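
The fail-soft, parallel retrieval described above might look like this in outline. It is a Python sketch under stated assumptions: `search_fail_soft` and `prime` are hypothetical names, the two searches are passed in as plain awaitables, and the prompt layout is invented for illustration. Only the parallel calls, the debug-level logging, and the "Known site knowledge" section come from the text.

```python
import asyncio
import logging
from typing import Awaitable, List

log = logging.getLogger("priming")


async def search_fail_soft(name: str, search: Awaitable[List[str]]) -> List[str]:
    """Await one store search; on any error, log at debug and return nothing."""
    try:
        return await search
    except Exception as exc:
        log.debug("%s search failed; skipping: %s", name, exc)
        return []


async def prime(skill_search: Awaitable[List[str]],
                memory_search: Awaitable[List[str]]) -> str:
    """Run both searches concurrently and build the prompt section."""
    skills, memories = await asyncio.gather(
        search_fail_soft("skills", skill_search),
        search_fail_soft("memory", memory_search),
    )
    hits = skills + memories
    if not hits:
        return ""
    return "Known site knowledge:\n" + "\n".join(f"- {h}" for h in hits)
```

Wrapping each search separately means one broken store cannot discard the other store's results.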

Embeddings are optional. `ForagentEmbeddings:Endpoint` / `ModelId` / `ApiKey` are all-or-nothing: if any one is missing, retrieval drops back to BM25-only with a single startup warning. The embeddings config is a separate section from `ForagentLlm` because the embeddings deployment can live on a different Azure Foundry subscription than the chat deployment; keep them split.
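
A small Python sketch of the all-or-nothing check (illustrative only; `resolve_embeddings` is a hypothetical name and the warning text is made up). Note that with the `.env.example` defaults the model id is set but the endpoint and key are empty, so this path would select BM25-only retrieval.

```python
import logging
from typing import Mapping, Optional

log = logging.getLogger("foragent.startup")
REQUIRED_KEYS = ("Endpoint", "ModelId", "ApiKey")


def resolve_embeddings(section: Mapping[str, str]) -> Optional[dict]:
    """Return the full config, or None (BM25-only) if any value is missing."""
    values = {k: (section.get(k) or "").strip() for k in REQUIRED_KEYS}
    missing = [k for k, v in values.items() if not v]
    if missing:
        # Logged once, at startup, rather than on every retrieval.
        log.warning("Embeddings disabled (missing: %s); using BM25-only retrieval",
                    ", ".join(missing))
        return None
    return values
```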

On successful completion (`state.IsDone`), `BrowserTaskCapability.TryWriteLearnedSkillAsync` runs one extra synthesizer LLM turn (same `IChatClient`, no tools) to author a reusable skill at `sites/{primaryHost}/learned/{intent-slug}`. The synthesizer prompt forbids including credential values or typed field contents. Writes are skipped when the task was trivial (≤1 navigation) or the primary host can't be determined. Errors are logged but never fail the completed task.

`BskySeedSkillService` (IHostedService) seeds `sites/bsky.app/login` on first start by calling `ISkillStore.GetAsync` and only writing if absent — docker volume recreation reseeds cleanly; operator edits to the skill through other channels are preserved.
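
The write-only-if-absent rule is what makes the seed idempotent. A minimal Python sketch, with `InMemorySkillStore` and `seed_if_absent` as hypothetical stand-ins (the real store is file-backed and asynchronous):

```python
from typing import Dict, Optional


class InMemorySkillStore:
    """Stand-in for the skill store, keyed by skill name."""

    def __init__(self) -> None:
        self._skills: Dict[str, str] = {}

    def get(self, name: str) -> Optional[str]:
        return self._skills.get(name)

    def save(self, name: str, content: str) -> None:
        self._skills[name] = content


def seed_if_absent(store: InMemorySkillStore, name: str, content: str) -> bool:
    """Write the seed only when no skill exists under that name."""
    if store.get(name) is not None:
        return False  # preserve whatever is already there, including edits
    store.save(name, content)
    return True
```

Running the seed again after an operator has edited the skill leaves the edit in place, while an empty store (for example after a volume is recreated) is reseeded.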

Skill naming follows spec §5.6: `sites/{host}/{intent}` for human-authored primers, `sites/{host}/learned/{slug}` for agent-generated. `Skill.SeeAlso` cross-references related skills to surface clusters rather than single entries. **Note:** `Skill` (from `RockBot.Host 0.8.5`) does not carry tags, metadata, or importance — the `agent-learned` distinction is encoded in the name prefix only.
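
Since the learned/human distinction lives only in the name, it helps to see how such names could be built and recognised. This Python sketch is an assumption-heavy illustration: the slug rule (lower-case, non-alphanumerics collapsed to hyphens, 60-character cap) is invented here, and `slugify`, `learned_skill_name`, and `is_agent_learned` are hypothetical names; only the `sites/{host}/learned/{slug}` shape comes from the text.

```python
import re


def slugify(intent: str, max_len: int = 60) -> str:
    """Lower-case the intent and collapse non-alphanumeric runs to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", intent.lower()).strip("-")
    return slug[:max_len].rstrip("-")


def learned_skill_name(primary_host: str, intent: str) -> str:
    return f"sites/{primary_host.lower()}/learned/{slugify(intent)}"


def is_agent_learned(name: str) -> bool:
    """Agent-generated skills are identified by the 'learned' path segment."""
    parts = name.split("/")
    return len(parts) >= 4 and parts[0] == "sites" and parts[2] == "learned"
```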

## Credentials

`Foragent.Credentials` ships `ICredentialBroker` + `CredentialReference(Id, Kind, Values)`. `AddForagentCredentials(configuration, "Credentials")` wires an `InMemoryCredentialBroker` bound to the config section — dev/test only per spec §6.3. Populate via user-secrets (`dotnet user-secrets set "Credentials:bluesky-rocky:Kind" username-password`, etc.), never appsettings.json. **Never log `CredentialReference.Values`**, never include them in A2A responses, never embed them in exception messages. `CredentialReference.ToString()` deliberately does not expose values. Missing credentials throw `CredentialNotFoundException` carrying only the id.
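
The "values never leak through string conversion" rule can be illustrated with a short Python sketch. The types below mirror the names in the text, but their shape is an assumption: the real `CredentialReference` is a C# type and `CredentialNotFoundError` stands in for `CredentialNotFoundException`.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class CredentialReference:
    id: str
    kind: str
    # repr=False keeps secret values out of the generated __repr__.
    values: Dict[str, str] = field(repr=False)

    def __str__(self) -> str:
        # Deliberately omits values so accidental logging stays safe.
        return f"CredentialReference(id={self.id!r}, kind={self.kind!r})"


class CredentialNotFoundError(KeyError):
    """Carries only the credential id, never any secret material."""

    def __init__(self, credential_id: str) -> None:
        super().__init__(credential_id)
        self.credential_id = credential_id
```

Making the safe rendering the default means a stray `log.info("%s", cred)` exposes the id and kind at most.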
2 changes: 1 addition & 1 deletion deploy/rockbot-seed/agent-trust.json
@@ -2,7 +2,7 @@
{
"agentId": "Foragent",
"level": 4,
"approvedSkills": ["browser-task", "fetch-page-title", "extract-structured-data", "post-to-site"],
"approvedSkills": ["browser-task", "fetch-page-title", "extract-structured-data"],
"firstSeen": "2026-04-21T00:00:00+00:00",
"lastInteraction": "2026-04-21T00:00:00+00:00",
"interactionCount": 0
5 changes: 0 additions & 5 deletions deploy/rockbot-seed/well-known-agents.json
@@ -22,11 +22,6 @@
"id": "extract-structured-data",
"name": "Extract Structured Data",
"description": "Navigate to a URL and extract data matching a natural-language description, returning JSON. Input the target URL and a description of what to extract."
},
{
"id": "post-to-site",
"name": "Post to Site",
"description": "Authenticate against a configured site (by credential identifier) and publish a post. Input JSON {\"site\":\"bluesky\",\"credentialId\":\"...\",\"content\":\"...\"} or message metadata fields site / credentialId / content. Credential values never cross the A2A boundary."
}
]
}
25 changes: 18 additions & 7 deletions docker-compose.yml
@@ -61,7 +61,7 @@ services:
RabbitMq__VirtualHost: /
Gateway__AgentName: Foragent
Gateway__InternalAgentName: Foragent
Gateway__Description: "Browser agent — browser-task (generalist), fetch-page-title, extract-structured-data, post-to-site"
Gateway__Description: "Browser agent — browser-task (generalist), fetch-page-title, extract-structured-data"
# RockBot will call Foragent with header X-Api-Key: rockbot-calls-foragent
ApiKeys__rockbot-calls-foragent__AgentId: RockBot
ApiKeys__rockbot-calls-foragent__DisplayName: RockBot
@@ -70,15 +70,25 @@ services:
ForagentLlm__Endpoint: ${FORAGENT_LLM_ENDPOINT:?FORAGENT_LLM_ENDPOINT is required}
ForagentLlm__ModelId: ${FORAGENT_LLM_MODEL_ID:?FORAGENT_LLM_MODEL_ID is required}
ForagentLlm__ApiKey: ${FORAGENT_LLM_API_KEY:?FORAGENT_LLM_API_KEY is required}
# Optional Bluesky credential for post-to-site. Callers invoke post-to-site
# with credentialId: "bluesky-rocky". Flat id (no slashes) because env-var
# keys use __ to separate config path segments — ids with slashes work via
# appsettings / user-secrets but not via env vars. Leave unset to disable;
# post-to-site will report "Credential '…' is not configured."
# For prod, replace InMemoryCredentialBroker with k8s-secrets.
# Optional embeddings. Empty values (default) → BM25-only skill + memory
# retrieval. If any one of Endpoint/ModelId/ApiKey is empty, the others
# are ignored (all-or-nothing at startup). Kept separate from ForagentLlm
# because embedding deployments often live on a different subscription.
ForagentEmbeddings__Endpoint: ${FORAGENT_EMBEDDINGS_ENDPOINT:-}
ForagentEmbeddings__ModelId: ${FORAGENT_EMBEDDINGS_MODEL_ID:-}
ForagentEmbeddings__ApiKey: ${FORAGENT_EMBEDDINGS_API_KEY:-}
# Step 7: skills + long-term memory (spec §5.6). Paths align with the
# mounted volume below so learned site knowledge survives restarts.
ForagentMemory__SkillsPath: /data/foragent/skills
ForagentMemory__MemoryPath: /data/foragent/memory
# Optional Bluesky credential used by future credentialed browser-task
# runs. Flat id (no slashes) because env-var keys use __ to separate
# config segments. Leave unset to disable.
Credentials__bluesky-rocky__Kind: username-password
Credentials__bluesky-rocky__Values__identifier: ${FORAGENT_BLUESKY_IDENTIFIER:-}
Credentials__bluesky-rocky__Values__password: ${FORAGENT_BLUESKY_APP_PASSWORD:-}
volumes:
- foragent-data:/data/foragent

rockbot-init:
image: rockylhotka/rockbot-agent:0.8.5
@@ -177,3 +187,4 @@ services:
volumes:
rockbot-data:
rockbot-shared:
foragent-data: