9 changes: 9 additions & 0 deletions .env.example
@@ -28,6 +28,15 @@ LLM_PROVIDER=
# ── Agent config ─────────────────────────────────────────────────────────────
AGENT_TIMEZONE=America/Chicago

# ── Optional: Bluesky credentials for post-to-site ───────────────────────────
# Leave empty to disable Bluesky posting end-to-end (post-to-site will respond
# "Credential '…' is not configured"). Use a Bluesky app password from
# https://bsky.app/settings/app-passwords, never your account password.
# Stored only in-process by Foragent's InMemoryCredentialBroker — never logged,
# never sent over A2A. For prod, swap in a k8s-secrets or vault broker.
FORAGENT_BLUESKY_IDENTIFIER=
FORAGENT_BLUESKY_APP_PASSWORD=

# ── Optional ─────────────────────────────────────────────────────────────────
# Only needed if RockBot is configured to use GitHub Copilot / GitHub Models.
GITHUB_TOKEN=
15 changes: 12 additions & 3 deletions CLAUDE.md
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

## Status

Foragent is at **milestone 3** (spec §9.1): two capabilities now — `fetch-page-title` (step 2, Playwright) and `extract-structured-data` (step 3, Playwright + LLM). Credentials are milestone 4. The authoritative design document is `docs/foragent-specification.md` — read it before making non-trivial changes. Framework-level observations from each milestone are captured in `docs/framework-feedback.md`.
Foragent is at **milestone 4** (spec §9.1): three capabilities now — `fetch-page-title` (step 2, Playwright), `extract-structured-data` (step 3, Playwright + LLM), and `post-to-site` (step 4, Playwright + credential broker). The credential broker + first `ISitePoster` (Bluesky) are in. Storage-state persistence, 2FA input-required flow, k8s-secrets broker, and per-tenant credential namespaces are all deferred — tracked in `docs/framework-feedback.md` step 4. The authoritative design document is `docs/foragent-specification.md` — read it before making non-trivial changes. Framework-level observations from each milestone are captured in `docs/framework-feedback.md`.

## Build / test

@@ -73,7 +73,7 @@ Foragent requires an LLM (for `extract-structured-data` and future capabilities)

## Browser

`Foragent.Browser` wraps Playwright. `AddForagentBrowser()` in `Foragent.Agent/Program.cs` registers `PlaywrightBrowserHost` (`IHostedService` owning one shared Chromium per process) and `IBrowserSessionFactory` (hands out a fresh `IBrowserContext` per A2A task — isolation guarantee from spec §3.5). `IBrowserSession` exposes `FetchPageTitleAsync` and `CapturePageSnapshotAsync`; the snapshot uses Chromium's aria-snapshot (via `Locator.AriaSnapshotAsync`) and falls back to `<body>` inner text when the tree is empty. `Foragent.Browser` has `InternalsVisibleTo("Foragent.Browser.Tests")` so tests drive the real `PlaywrightBrowserSessionFactory` without promoting its implementation types to public.
`Foragent.Browser` wraps Playwright. `AddForagentBrowser()` in `Foragent.Agent/Program.cs` registers `PlaywrightBrowserHost` (`IHostedService` owning one shared Chromium per process) and `IBrowserSessionFactory` (hands out a fresh `IBrowserContext` per A2A task — isolation guarantee from spec §3.5). `IBrowserSession` exposes `FetchPageTitleAsync` / `CapturePageSnapshotAsync` for one-shot reads, plus `OpenPageAsync` → `IBrowserPage` (navigate / fill / click / wait / read) for multi-step flows like login + post. The snapshot uses Chromium's aria-snapshot (via `Locator.AriaSnapshotAsync`) and falls back to `<body>` inner text when the tree is empty. Selectors passed to `IBrowserPage` use Playwright's string-selector dialect (CSS + `role=role[name="..."]`); **regex is not accepted in string form**, use exact attribute matches. `Foragent.Browser` has `InternalsVisibleTo("Foragent.Browser.Tests")` so tests drive the real `PlaywrightBrowserSessionFactory` without promoting its implementation types to public.

## Capabilities

@@ -82,7 +82,16 @@ Foragent requires an LLM (for `extract-structured-data` and future capabilities)
- Each capability implements `ICapability` — owns its own `AgentSkill` metadata (exposed as a static `SkillDefinition`) and its own `ExecuteAsync` logic.
- `ForagentTaskHandler` is a pure dispatcher that resolves `IEnumerable<ICapability>` from DI and routes on `SkillId`. **Do not add skill-specific logic to the handler.** New capabilities go in new `ICapability` classes.
- `ForagentCapabilities.Skills` (static array) is the single source of truth for advertised skills — both the bus-side `AgentCard.Skills` and the HTTP gateway's `opts.Skills` read from it.
- `CapabilityInput.Parse` is the input-parsing shim until rockbot#281 ships real metadata pass-through. Capabilities that need a URL + description accept a `{"url":"...","description":"..."}` JSON blob in the single text part today; capabilities that only need a URL also accept a bare URL string. When the framework change lands, swap this helper — capability contracts don't need to change.
- `CapabilityInput.Parse` is the shared URL + description shim used by `fetch-page-title` and `extract-structured-data`. Capabilities with different input shapes (e.g. `post-to-site` needing `site` / `credentialId` / `content`) parse their own input near the capability — see `PostToSiteInput` in `PostToSiteCapability.cs`. Don't overload `CapabilityInput` for unrelated shapes.
- `post-to-site` dispatches to an `ISitePoster` keyed on `Site` (in `SitePosting/`). `BlueskySitePoster` is the only implementation today; add new sites by registering another `ISitePoster` in `AddForagentCapabilities()`. The capability never echoes exception messages from posters back to callers — they may contain credential material; operators read the full exception in logs.

## Credentials

`Foragent.Credentials` ships `ICredentialBroker` + `CredentialReference(Id, Kind, Values)`. `AddForagentCredentials(configuration, "Credentials")` wires an `InMemoryCredentialBroker` bound to the config section — dev/test only per spec §6.3. Populate via user-secrets (`dotnet user-secrets set "Credentials:bluesky-rocky:Kind" username-password`, etc.), never appsettings.json. **Never log `CredentialReference.Values`**, never include them in A2A responses, never embed them in exception messages. `CredentialReference.ToString()` deliberately does not expose values. Missing credentials throw `CredentialNotFoundException` carrying only the id.

`CredentialReference.Values` is `IReadOnlyDictionary<string, ReadOnlyMemory<byte>>` — byte-shaped so backends like k8s Secrets (byte-native), cert stores, and storage-state blobs pass through without lossy text conversion. Text-origin credentials go in via `CredentialReference.FromText(id, kind, stringDict)` (UTF-8 encodes at the boundary); text-shaped fields come out via `cred.RequireText(key)` (UTF-8 decodes). Use `cred.Require(key)` for raw bytes. `InMemoryCredentialBroker`'s config binding stays text (user-secrets / env vars are string-native); UTF-8 encoding happens at the broker boundary, not at config time.

Credential ids are free-form via user-secrets/appsettings (slashes are fine — `rockbot/social/bluesky-rocky` matches spec §6.2's example). Via env vars / docker-compose, ids must be single-segment: `__` separates config-path segments, so `Credentials__rockbot__social__bluesky-rocky__Kind` becomes the config path `Credentials:rockbot:social:bluesky-rocky:Kind` and fails to bind as an id. Stick with flat ids (`bluesky-rocky`) in the compose harness.
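
The env-var limitation above can be illustrated with a minimal sketch. It only mimics the `__` to `:` key normalization that the .NET environment-variable configuration provider performs; `EnvKeySketch` and `ToConfigPath` are hypothetical names used for illustration, not part of Foragent or the framework.

```csharp
using System;

// Sketch of the key normalization only, not the real configuration provider.
// EnvKeySketch / ToConfigPath are hypothetical names for illustration.
public static class EnvKeySketch
{
    // The environment-variable provider treats "__" as the ":" path separator,
    // so every "__" in a variable name starts a new config-path segment.
    public static string[] ToConfigPath(string envVarName) =>
        envVarName.Replace("__", ":").Split(':');
}
```

With a flat id, `ToConfigPath("Credentials__bluesky-rocky__Kind")` yields three segments (`Credentials`, `bluesky-rocky`, `Kind`), so the id survives intact. With `Credentials__rockbot__social__bluesky-rocky__Kind` it yields five segments, and nothing in the resulting path marks `rockbot`, `social`, and `bluesky-rocky` as belonging to one id.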

## Conventions

3 changes: 3 additions & 0 deletions Directory.Packages.props
@@ -6,9 +6,12 @@
<ItemGroup>
<PackageVersion Include="Microsoft.Playwright" Version="1.50.0" />
<PackageVersion Include="Microsoft.Extensions.AI" Version="10.*" />
<PackageVersion Include="Microsoft.Extensions.Configuration.Abstractions" Version="10.0.*" />
<PackageVersion Include="Microsoft.Extensions.DependencyInjection.Abstractions" Version="10.0.*" />
<PackageVersion Include="Microsoft.Extensions.Hosting.Abstractions" Version="10.0.*" />
<PackageVersion Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.*" />
<PackageVersion Include="Microsoft.Extensions.Options" Version="10.0.*" />
<PackageVersion Include="Microsoft.Extensions.Options.ConfigurationExtensions" Version="10.0.*" />
<PackageVersion Include="Microsoft.Extensions.AI.Abstractions" Version="10.*" />
<PackageVersion Include="Microsoft.Extensions.AI.OpenAI" Version="10.*" />
<PackageVersion Include="OpenAI" Version="2.*" />
9 changes: 9 additions & 0 deletions docker-compose.yml
@@ -70,6 +70,15 @@ services:
ForagentLlm__Endpoint: ${FORAGENT_LLM_ENDPOINT:?FORAGENT_LLM_ENDPOINT is required}
ForagentLlm__ModelId: ${FORAGENT_LLM_MODEL_ID:?FORAGENT_LLM_MODEL_ID is required}
ForagentLlm__ApiKey: ${FORAGENT_LLM_API_KEY:?FORAGENT_LLM_API_KEY is required}
# Optional Bluesky credential for post-to-site. Callers invoke post-to-site
# with credentialId: "bluesky-rocky". Flat id (no slashes) because env-var
# keys use __ to separate config path segments — ids with slashes work via
# appsettings / user-secrets but not via env vars. Leave unset to disable;
# post-to-site will report "Credential '…' is not configured."
# For prod, replace InMemoryCredentialBroker with k8s-secrets.
Credentials__bluesky-rocky__Kind: username-password
Credentials__bluesky-rocky__Values__identifier: ${FORAGENT_BLUESKY_IDENTIFIER:-}
Credentials__bluesky-rocky__Values__password: ${FORAGENT_BLUESKY_APP_PASSWORD:-}

rockbot-init:
image: rockylhotka/rockbot-agent:latest
110 changes: 110 additions & 0 deletions docs/framework-feedback.md
@@ -54,6 +54,116 @@ feedback. Capture it."
beyond step 2 will make this more painful; the `switch (request.Skill)` in
`ForagentTaskHandler` is already starting to accumulate per-skill setup.

## Step 4 — Credentials + first credentialed capability (post-to-site / Bluesky)

### Framework observations

- **`ICredentialBroker` is Foragent-local, not framework.** We deliberately did
not propose this for RockBot yet — spec §6.2 treats the broker as a Foragent
concept, and no second consumer exists. If future agents (RockBot or
third-party) grow similar needs, consider lifting a broker abstraction
upstream with the same value-dictionary shape (see below).
- **`ISitePoster` dispatch is a repeat of the step-3 pattern.** We added a
small in-capability dispatcher (`PostToSiteCapability` → keyed
`IReadOnlyDictionary<string, ISitePoster>`) to route a single A2A skill to
a family of site-specific implementations. Together with the step-3
`ICapability` dispatcher, this is now the second hand-rolled skill-to-impl
dispatch inside Foragent. A framework helper (e.g. `AddRockBotCapability<T>`,
`AddRockBotCapabilityVariant<T>`) would fold both patterns down.
- **No framework hook for per-tenant broker scoping.** Spec §7.5 calls for
tenant identity from A2A caller, not request payload. The RockBot framework
exposes the caller identity on `AgentTaskContext.MessageContext.Agent`, but
there's no established pattern for a broker to receive it. Today Foragent's
broker ignores tenancy; see "Deferred" below.
- **Playwright string-selector dialect is regex-free.** The first cut of
`BlueskySitePoster` used `role=button[name=/sign in/i]`-style selectors;
Playwright's string parser does not accept regex. `GetByRole(…, new() { NameRegex = … })`
works on `IPage` but not in `WaitForSelectorAsync` string form. Switched to
exact attribute matches. Worth a note in a future `RockBot.Browser` helper if
one materialises, so consumers don't repeat the mistake.
- **`contenteditable` + Playwright `FillAsync` works for text but not rich
content.** Bluesky's real composer uses a ProseMirror editor that rejects
naive `FillAsync`. Our selector targets the contenteditable host, which the
test fake also uses. Real-world posting may require typing or scripting the
editor — when we exercise against real bsky.app we'll learn whether this
path holds. Flagged here so the next session doesn't chase it as a new bug.

### Credential abstraction — backend generality check

Before finalizing step 4 we sanity-checked whether
`ICredentialBroker.ResolveAsync(id) → CredentialReference(Id, Kind, Values)`
is general enough to back alternative secret stores beyond in-memory and
k8s. The shape adapts to each backend:

- **k8s Secrets** — Secret name → `Id`. `data` map (base64-decoded) → `Values`. Clean fit.
- **Azure Key Vault** — One vault secret per credential holding a JSON blob, deserialized into `Values`. Or naming convention (`bluesky-rocky-identifier`, `bluesky-rocky-password`); broker collates. Both work.
- **AWS Secrets Manager** — Native JSON `SecretString` maps directly to `Values`.
- **HashiCorp Vault (KV v2)** — `secret/data/<id>` → string map → `Values`. Direct fit.
- **File-based dev broker** — Gitignored JSON file, one-to-one with `Values`.

`Values` was switched from `IReadOnlyDictionary<string, string>` to
`IReadOnlyDictionary<string, ReadOnlyMemory<byte>>` pre-emptively. Several real
backends and payloads (k8s Secrets, cert stores, storage-state blobs) are byte-native;
text is the common case but not the *only* case. `CredentialReference.FromText`
and `RequireText` cover the UTF-8 path at the edges without forcing every
broker / consumer to care.
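
A minimal sketch of how a byte-shaped reference with text helpers at the edges might look. Only the names `Id`, `Kind`, `Values`, `FromText`, `Require`, and `RequireText` come from this PR's description; the member bodies, the exception type, and the `ToString` format below are assumptions, since the real `CredentialReference` source is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical reconstruction for illustration; bodies are assumed, not copied.
public sealed record CredentialReference(
    string Id,
    string Kind,
    IReadOnlyDictionary<string, ReadOnlyMemory<byte>> Values)
{
    // UTF-8 encode text-origin values once, at the boundary.
    public static CredentialReference FromText(
        string id, string kind, IReadOnlyDictionary<string, string> text) =>
        new(id, kind, text.ToDictionary(
            kv => kv.Key,
            kv => (ReadOnlyMemory<byte>)Encoding.UTF8.GetBytes(kv.Value)));

    // Raw bytes. The error names the id and key but never a value.
    public ReadOnlyMemory<byte> Require(string key) =>
        Values.TryGetValue(key, out var bytes)
            ? bytes
            : throw new KeyNotFoundException($"Credential '{Id}' has no value '{key}'.");

    // UTF-8 decode for text-shaped fields such as a username.
    public string RequireText(string key) =>
        Encoding.UTF8.GetString(Require(key).Span);

    // Records print every member by default; override so values never leak.
    public override string ToString() =>
        $"CredentialReference {{ Id = {Id}, Kind = {Kind} }}";
}
```

Under these assumptions a text-native broker would call `FromText` once when it loads config, while a byte-native backend could construct the record directly and consumers would pick `Require` or `RequireText` per field.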

### Known gaps in the credential interface (not yet fixed)

These are not blocking step 4 but will force changes as the spec is filled
in. Captured here so they aren't rediscovered:

1. **No catalog / list.** Spec §6.4 calls for advertising which credential
ids exist (without values) so a caller can say "I'd need a Bluesky
credential, none is configured." Today's interface is `Resolve` only.
Every non-toy backend supports listing. Will need
`IAsyncEnumerable<string> ListAsync(CancellationToken)` or equivalent.
2. **No write path for storage state (§6.5).** Storage-state-as-credential
requires the broker to *persist* post-login session bytes. Will need a
`Task WriteAsync(CredentialReference)` — and some backends are read-only
(Key Vault read role), so the interface should signal write capability
(either a feature flag or a separate `IWritableCredentialBroker`).
3. **Tenancy isn't on the interface.** `ResolveAsync(string id)` has no
tenant parameter. Production backends need to scope lookups to the A2A
caller's tenant id. Either `ResolveAsync(TenantId, string id)` or
per-tenant broker scoping. Blocked on the spec-level tenant-identity
decision (spec §12 open question 5).
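
Gaps 1 and 2 could be closed along the following lines. This is a sketch under stated assumptions, not the planned design: the `ResolveAsync` signature (return type and `CancellationToken` parameter) is guessed, `CredentialReference` is reduced to a stand-in so the block compiles on its own, and placing `ListAsync` directly on `ICredentialBroker` is just one of the options the list above leaves open. The tenancy gap is left out because it is blocked on the spec.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the real type so the sketch is self-contained.
public sealed record CredentialReference(string Id, string Kind);

// Assumed shape of the read contract, extended with a catalog (gap 1).
public interface ICredentialBroker
{
    Task<CredentialReference> ResolveAsync(string id, CancellationToken ct = default);

    // Advertises which ids exist; never returns values.
    IAsyncEnumerable<string> ListAsync(CancellationToken ct = default);
}

// Opt-in write path (gap 2). Read-only backends simply do not implement it.
public interface IWritableCredentialBroker : ICredentialBroker
{
    Task WriteAsync(CredentialReference credential, CancellationToken ct = default);
}
```

With a separate interface, a caller that wants to persist storage state could test `broker is IWritableCredentialBroker` and fall back to re-authenticating when the backend is read-only, instead of discovering the limitation through a runtime failure.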

### Deferred (tracked so we don't lose them)

All of these are on the step-4 line in spec §9.1 but intentionally punted to
later iterations to keep the PR reviewable. Each is wired into the current
design in a way that allows adding it without breaking changes:

- **Storage state as a credential (spec §6.5).** `BlueskySitePoster` re-auths
every post. The fix is to call `IBrowserContext.StorageStateAsync()` after
successful login, persist it back through the broker under a new `Kind`
(`storage-state`), and re-apply via `Browser.NewContextAsync(new() { StorageState = … })`
on subsequent runs. Requires either an `IBrowserSessionFactory.CreateSessionAsync(storageState)`
overload or a session-level "import" method. Because the broker
value-shape is `IReadOnlyDictionary<string, ReadOnlyMemory<byte>>`, storage state
(a JSON blob) just becomes the UTF-8 bytes stored under `Values["json"]`.
- **2FA via A2A `input-required` (spec §6.6).** RockBot's framework exposes
the `input-required` state on `AgentTaskContext`, but we haven't wired
BlueskySitePoster to detect a 2FA prompt and suspend. App passwords bypass
2FA for now, which is why spec §6.6 recommends them — but the input-required
path is what unlocks non-app-password sites.
- **Kubernetes secrets broker (spec §6.3).** Only `InMemoryCredentialBroker`
is implemented; prod deploy will need a `KubernetesCredentialBroker` reading
from a scoped service account. No deployment target exists yet (spec §9.2).
- **Per-tenant credential namespaces (spec §7.5).** `ICredentialBroker.ResolveAsync`
takes only the credential id. A production broker should also take a tenant
id derived from `AgentTaskContext.MessageContext.Agent`, and scope its
lookup. Foragent is currently single-tenant by omission.
- **Audit logging (spec §7.4).** We log capability invocation + credential id
via `ILogger`, but there's no dedicated audit sink separate from diagnostic
logging. Spec §7.4 calls for a per-tenant audit log with structured fields;
current logs are prose.
- **Domain allowlists (spec §7.1).** `post-to-site` hard-codes the Bluesky
login URL; no request-level or tenant-level allowlist. When we add a second
poster, promote the URL to config and add an allowlist check around
`IBrowserSession.OpenPageAsync`.

## Step 3 — Second capability (extract-structured-data)

- **A2A metadata pass-through.** Filed as [rockbot#281](https://github.com/MarimerLLC/rockbot/issues/281),
1 change: 1 addition & 0 deletions src/Foragent.Agent/Foragent.Agent.csproj
@@ -20,5 +20,6 @@
<ItemGroup>
<ProjectReference Include="..\Foragent.Browser\Foragent.Browser.csproj" />
<ProjectReference Include="..\Foragent.Capabilities\Foragent.Capabilities.csproj" />
<ProjectReference Include="..\Foragent.Credentials\Foragent.Credentials.csproj" />
</ItemGroup>
</Project>
7 changes: 7 additions & 0 deletions src/Foragent.Agent/Program.cs
@@ -1,6 +1,7 @@
using System.ClientModel;
using Foragent.Browser;
using Foragent.Capabilities;
using Foragent.Credentials;
using Microsoft.Extensions.AI;
using OpenAI;
using RockBot.A2A;
@@ -67,6 +68,12 @@

builder.Services.AddForagentBrowser();

// ── Credentials ─────────────────────────────────────────────────────────────
// In-memory broker bound to the "Credentials" config section (populated via
// user-secrets in dev). Production deployments should swap in a k8s-secrets /
// vault broker; tracked in docs/framework-feedback.md step 4.
builder.Services.AddForagentCredentials(builder.Configuration);

// ── HTTP A2A gateway (in-process) ────────────────────────────────────────────

builder.Services.Configure<Dictionary<string, ApiKeyEntry>>(
3 changes: 2 additions & 1 deletion src/Foragent.Agent/appsettings.json
@@ -26,5 +26,6 @@
"ModelId": "",
"ApiKey": ""
},
"ApiKeys": {}
"ApiKeys": {},
"Credentials": {}
}