LayerLens · m-peko · May 12, 2026 · Apr 27, 2026
diff --git a/docs/adapters/frameworks-browser_use.md b/docs/adapters/frameworks-browser_use.md
@@ -0,0 +1,222 @@
+# browser_use framework adapter
+
+`layerlens.instrument.adapters.frameworks.browser_use.BrowserUseAdapter`
+instruments [browser-use](https://github.com/browser-use/browser-use) —
+the LLM-driven Playwright agent that performs autonomous web
+navigation, form filling, and content extraction. The adapter wraps
+`Agent.run()` (and `Agent.run_sync()` when present), threads per-step
+browser / action / screenshot / DOM / model events through the
+LayerLens pipeline, and applies the field-specific truncation policy
+so multi-megabyte screenshot / DOM payloads cannot blow past the
+ingestion sink limits.
+
+## Install
+
+```bash
+pip install 'layerlens[browser-use]'
+```
+
+Pulls `browser-use>=0.1.0,<2`. Requires Python 3.11+ (browser_use's
+own constraint) and Playwright (the runtime SDK pulls it transitively
+and runs `playwright install chromium` on first use).
+
+## Quick start
+
+```python
+import asyncio
+
+from browser_use import Agent
+from langchain_openai import ChatOpenAI
+
+from layerlens.instrument.adapters.frameworks.browser_use import (
+    BrowserUseAdapter,
+    instrument_agent,
+)
+from layerlens.instrument.transport.sink_http import HttpEventSink
+
+sink = HttpEventSink(adapter_name="browser_use")
+
+agent = Agent(
+    task="find the price of a Logitech MX Master 3S on a demo store",
+    llm=ChatOpenAI(model="gpt-4o-mini"),
+)
+
+# One-liner: construct adapter, connect, wrap agent, return adapter.
+adapter = instrument_agent(agent, org_id="org_acme")
+adapter.add_sink(sink)
+
+result = asyncio.run(agent.run())
+
+adapter.disconnect()
+sink.close()
+```
+
+For an offline reproduction (no `browser-use` install required) see
+`samples/instrument/browser_use/`.
+
+## What's wrapped
+
+`adapter.instrument_agent(agent)` patches the following on each Agent:
+
+- `run` — async entry point. Emits the full session lifecycle plus
+  per-step browser / action / screenshot / model events.
+- `run_sync` — sync entry point (when present in the browser_use
+  build). Same semantics.
+
+`disconnect()` restores all originals and clears wrapping state.
+
+## Capabilities
+
+| Capability                         | Declared |
+|------------------------------------|----------|
+| `AdapterCapability.TRACE_TOOLS`    | Yes      |
+| `AdapterCapability.TRACE_MODELS`   | Yes      |
+| `AdapterCapability.TRACE_STATE`    | Yes      |
+| `AdapterCapability.STREAMING`      | Yes      |
+| `AdapterCapability.REPLAY`         | Yes      |
+| `AdapterCapability.TRACE_HANDOFFS` | No (browser_use is single-agent) |
+
+## Events emitted
+
+| Event                    | Layer         | When                                                              |
+|--------------------------|---------------|-------------------------------------------------------------------|
+| `environment.config`     | L4a           | First time an agent is registered. Captures model, browser, task. |
+| `browser.session.start`  | L1            | Beginning of every `run`. Includes a generated `session_id`.      |
+| `agent.input`            | L1            | Same boundary as `browser.session.start`. Carries the task.       |
+| `browser.navigate`       | L5a           | Per page-load (URL change).                                       |
+| `browser.action`         | L5a           | Per click / type / select / scroll. Mirrored as `tool.call`.      |
+| `tool.call`              | L5a           | Mirror of `browser.action` for unified analytics.                 |
+| `browser.screenshot`     | L5c           | Per screenshot. **Bytes DROPPED** to a SHA-256 reference.         |
+| `browser.dom.extract`    | L5c           | Per DOM snapshot. HTML capped at 16 KiB.                          |
+| `model.invoke`           | L3            | Per LLM call inside the reasoning loop.                           |
+| `cost.record`            | cross-cutting | Per LLM call (when token counts are available).                   |
+| `agent.output`           | L1            | End of every `run`. Includes `duration_ns` and any `error`.       |
+| `agent.state.change`     | cross-cutting | After `agent.output` — `session_complete` or `session_failed`.    |
+| `agent.error`            | L1            | When `run` raises. Emitted BEFORE the exception propagates.       |
+| `tool.error`             | L5a           | When a browser action raises. Paired with the failed `tool.call`. |
+| `model.error`            | L3            | When the LLM call raises. Paired with the failed `model.invoke`.  |
+
+## Truncation policy (CRITICAL)
+
+browser_use payloads are uniquely susceptible to unbounded data — a
+single navigation step can produce multi-megabyte base64 PNG
+screenshots, DOM HTML over 100 KB, and verbose page content. The
+adapter wires `DEFAULT_POLICY` from
+`layerlens.instrument.adapters._base.truncation` from day one with
+the following per-field caps:
+
+| Field                                                   | Cap                  |
+|---------------------------------------------------------|----------------------|
+| `screenshot`, `image_data`, `image_b64`, `binary_data`  | DROPPED → SHA-256 ref |
+| `html`, `dom`, `page_content`                           | 16 KiB               |
+| `prompt`, `completion`, `messages`, `output`, `input`   | 4 KiB                |
+| `tool_input`, `tool_output`, `arguments`                | 2 KiB                |
+| `state`, `context`, `history`                           | 8 KiB                |
+| `traceback`, `stacktrace`                               | 8 frames             |
+
+Truncations are NEVER silent — every clipped field appears in the
+`_truncated_fields` audit list attached to the emitted payload.
+Customers who need full-fidelity screenshots should ship them through
+a separate object store (S3 / R2) and embed only the storage
+reference in events.
+
+## Multi-tenancy
+
+The adapter binds an `org_id` at construction (`org_id` kwarg or
+resolved from `stratix.org_id` / `stratix.organization_id`) and
+stamps it onto every emitted payload. Caller-supplied `org_id` values
+are overwritten defensively to prevent cross-tenant leaks via misuse.
+
+```python
+adapter = BrowserUseAdapter(stratix=client, org_id="org_acme")
+# Every event payload carries org_id="org_acme".
+```
+
+## Resilience
+
+Every public hook is wrapped in `try / except` so an exception in our
+observability code can NEVER crash the customer's browser_use agent.
+Failures bump the per-callback resilience counter:
+
+```python
+adapter.resilience_snapshot()
+# {
+#   "resilience_failures_total": 0,
+#   "resilience_failures_by_callback": {},
+#   "resilience_last_error": None,
+# }
+```
+
+Operators surface this through the adapter health endpoint to detect
+silent observability degradation early.
+
+## Error-aware emission
+
+When the wrapped agent raises (rate limit, page-load timeout, LLM
+outage, malformed prompt), the adapter emits a structured `agent.error`
+event BEFORE re-raising the exception. Dashboards always see a complete
+`agent.input` → `agent.error` → `agent.output` triple — never a hung
+"start" with no matching "end".
+
+The same contract applies to `tool.error` (action failures) and
+`model.error` (LLM call failures).
+
+## Capture config
+
+```python
+from layerlens.instrument.adapters._base import CaptureConfig
+
+# Recommended.
+adapter = BrowserUseAdapter(capture_config=CaptureConfig.standard())
+
+# Heavy: include screenshot + DOM extract events. They still respect
+# the truncation policy — DROP for screenshots, 16 KiB for HTML.
+adapter = BrowserUseAdapter(
+    capture_config=CaptureConfig(
+        l1_agent_io=True,
+        l3_model_metadata=True,
+        l4a_environment_config=True,
+        l5a_tool_calls=True,
+        l5c_tool_environment=True,  # screenshot + DOM events
+    ),
+)
+```
+
+## browser_use specifics
+
+- **Async-only by default.** browser_use's `Agent.run` is async. The
+  adapter exposes both sync (`_create_traced_run_sync`) and async
+  (`_create_traced_run_async`) wrappers; instrumentation auto-detects
+  which methods are present on the agent.
+- **History walk fallback.** browser_use returns an
+  `AgentHistoryList` from `Agent.run`. The adapter walks the history
+  at the end of every run to backfill per-step events in case the
+  customer constructed an agent before the per-step hooks existed.
+- **Pydantic v2.** browser_use uses Pydantic v2 internally. The
+  adapter declares `requires_pydantic = PydanticCompat.V2_ONLY` so
+  the catalog UI warns customers pinning v1.
+- **No native callback bus.** browser_use does not expose a callback
+  registration API today — the adapter uses the wrapper pattern
+  (preserve-then-restore on `disconnect`). When upstream adds a
+  callback bus the adapter will switch to it without breaking the
+  public surface.
+
+## BYOK
+
+browser_use's LLM client (LangChain `ChatOpenAI`, `ChatAnthropic`,
+etc.) reads its own credentials. The adapter does not own them.
+For platform-managed BYOK see `docs/adapters/byok.md` (atlas-app M1.B).
+
+## Replay
+
+The adapter implements `serialize_for_replay()` and declares
+`AdapterCapability.REPLAY`. The serialized trace contains every
+emitted event (with truncation already applied — replays do not pay
+the bytes cost twice) plus the bound `org_id` and framework version.
+
+```python
+trace = adapter.serialize_for_replay()
+# trace.events  -> list of {"event_type", "payload", "timestamp_ns"}
+# trace.config  -> {"capture_config", "org_id", "framework_version"}
+# trace.metadata -> {"resilience_failures": {...}}
+```
diff --git a/pyproject.toml b/pyproject.toml
@@ -49,6 +49,7 @@ google-adk = ["google-adk>=0.1,<1.0; python_version >= '3.10'"]
 bedrock-agents = ["boto3>=1.34"]
 embedding = []  # vector store hooks; deps come from the underlying store
 benchmark-import = []  # replay-based; no extra deps
+browser-use = ["browser-use>=0.1.0,<2; python_version >= '3.11'"]
 
 [project.urls]
 Homepage = "https://github.com/LayerLens/stratix-python"

diff --git a/samples/instrument/browser_use/README.md b/samples/instrument/browser_use/README.md
@@ -0,0 +1,89 @@
+# browser_use instrumentation sample
+
+End-to-end demo of `BrowserUseAdapter` — runs **offline** with no
+`browser-use` install, no Playwright, no OpenAI key, no network calls.
+It uses a duck-typed `_FakeAgent` so the wrapper, lifecycle hooks,
+truncation policy, and event emission can be exercised on any
+developer laptop.
+
+## Run
+
+```bash
+# Happy path — three-step navigation.
+python -m samples.instrument.browser_use.main
+
+# Failure path — exercises agent.error emission before re-raise.
+python -m samples.instrument.browser_use.main --fail
+```
+
+Expected output for the happy path (event count and order are
+deterministic):
+
+```text
+Agent finished. 3 step(s) executed.
+
+Emitted 14 event(s):
+  -        environment.config  org=org_demo  agent=demo-bot model=gpt-4o-mini
+  -    browser.session.start  org=org_demo  session=...
+  -              agent.input  org=org_demo  task=find the price of a Logitech mouse...
+  -          browser.navigate  org=org_demo  url=https://store.example.com/
+  -            browser.action  org=org_demo  action=navigate
+  ...
+  -             agent.output  org=org_demo  duration_ns=...
+  -      agent.state.change  org=org_demo
+```
+
+Notice the screenshot lines render as
+`<dropped:screenshot:sha256:...>` rather than the multi-megabyte
+PNG bytes — the truncation policy refuses to embed binary blobs in
+events.
+
+## What the sample exercises
+
+| Component | What it proves |
+|---|---|
+| `BrowserUseAdapter.connect()` | Adapter reaches `HEALTHY` even when `browser-use` is not installed. |
+| `BrowserUseAdapter.instrument_agent(agent)` | `agent.run` is wrapped with the traced async shim. |
+| Lifecycle hooks | `browser.session.start`, `agent.input`, `browser.navigate`, `browser.action`, `browser.screenshot`, `model.invoke`, `cost.record`, `agent.output`, `agent.state.change` all emitted in order. |
+| Truncation policy | The 50 KB screenshot blob is replaced by a SHA-256 reference; the same blob across steps produces the same hash (replay correlation). |
+| Multi-tenant org_id | Every emit carries `org_id="org_demo"`, demonstrating the PR #118 contract. |
+| PR #115 error path (`--fail`) | When the agent raises, `agent.error` is emitted BEFORE the exception propagates so the dashboard sees a complete pair. |
+| `BrowserUseAdapter.disconnect()` | `agent.run` is restored to the original. |
+| `resilience_snapshot()` | Per-callback failure counters surface for operators. |
+
+## Going to a real run
+
+Swap `_FakeAgent` for a real browser_use Agent and route events to
+the LayerLens dashboard via `HttpEventSink`:
+
+```python
+from browser_use import Agent
+from langchain_openai import ChatOpenAI
+
+from layerlens.instrument.transport.sink_http import HttpEventSink
+from layerlens.instrument.adapters.frameworks.browser_use import (
+    BrowserUseAdapter,
+    instrument_agent,
+)
+
+sink = HttpEventSink(adapter_name="browser_use")
+
+agent = Agent(
+    task="find the price of a Logitech MX Master 3S on a demo store",
+    llm=ChatOpenAI(model="gpt-4o-mini"),
+)
+
+adapter = instrument_agent(agent, org_id="org_acme")
+adapter.add_sink(sink)
+
+result = await agent.run()
+
+adapter.disconnect()
+sink.close()
+```
+
+Required env for the live path: `LAYERLENS_STRATIX_API_KEY`,
+`LAYERLENS_STRATIX_BASE_URL`, plus whatever credentials your LLM
+provider needs (e.g. `OPENAI_API_KEY`).
+
+Install with: `pip install 'layerlens[browser-use]'`.
diff --git a/samples/instrument/browser_use/__init__.py b/samples/instrument/browser_use/__init__.py