browser-use · Alezander9 · Apr 30, 2026 · Apr 30, 2026 · cubic-dev-ai · Apr 30, 2026
diff --git a/UPSTREAM.md b/UPSTREAM.md
@@ -87,6 +87,7 @@ Each upstream has its own append-only table. Add a row every time you pull.
 | 2026-04-28 | `fefca43` | `04f7716` | bcode | 7 upstream commits. Windows fixes (PRs #232, #240) + skill rename (PR #242). Files: `src/browser_harness/_ipc.py` (BH_TMP_DIR override for sock/port/pid/log/screenshot dir; drop DETACHED_PROCESS to suppress empty Windows console window), `src/browser_harness/admin.py` (route `ensure_daemon` warm probe through `ipc.connect` so Windows TCP loopback works; new `_open_inspect=False` flag on `ensure_daemon` used by `run_setup` to prevent chrome://inspect tab flooding; drop unused `_paths()` helper), `src/browser_harness/helpers.py` (`capture_screenshot` and click-debug overlay route through `ipc._TMP` instead of `tempfile.gettempdir()` so BH_TMP_DIR covers them too), `SKILL.md` (`name: browser-harness` → `name: browser`), `install.md` (`name: browser-harness-install` → `name: browser-install`). All in protected `src/browser_harness/*.py` zone — taken verbatim. SKILL/install frontmatter rename only affects how end-users invoke the skill (`/browser` vs `/browser-harness`); our `browser-execute.txt` references SKILL.md by file path, so no integration code changes. Divergences touched: none. PR #240 e2e tested separately on Linux against headless Chrome before sync. |
 | 2026-04-28 | `04f7716` | `2125cea` | bcode | 1 upstream commit (PR #243). `src/browser_harness/_ipc.py`: `_TMP.mkdir(parents=True, exist_ok=True)` at module load so a caller-supplied `BH_TMP_DIR` pointing at a non-existent directory no longer fails the first sock/port/pid/log/screenshot write. Prerequisite for browsercode's per-session scratch-dir use case. Protected zone — taken verbatim. Divergences touched: none. |
 | 2026-04-29 | `2125cea` | `997ee45` | bcode | 6 upstream commits (PRs #241, #244, #245). `src/browser_harness/_ipc.py`: when `BH_TMP_DIR` is set, drop the `bu-<NAME>` filename prefix (caller-isolated dir means no shared-tmpdir disambiguation needed); without `BH_TMP_DIR` the original `bu-<NAME>` scheme is unchanged. `src/browser_harness/admin.py`: `_daemon_endpoint_names` short-circuits to the local NAME when `BH_TMP_DIR` is set (no glob); plus catch `SystemError` from `os.kill` on Windows during `restart_daemon`. `src/browser_harness/daemon.py`: discover DevToolsActivePort in Comet and Arc profiles on macOS. `tests/unit/test_admin.py`: 2 new tests for the `BH_TMP_DIR` discovery path. All in protected `src/browser_harness/*.py` + tests — taken verbatim. Smoke test + 12 admin unit tests pass. The `_ipc` filename change pairs with our recent per-session BH_TMP_DIR work (browsercode PR #22) — caller isolation now extends to filenames as well as the dir. Divergences touched: none. |
+| 2026-04-30 | `997ee45` | `660827d` | bcode | 11 upstream commits (PRs #246, #247, #251, #254, #256, #260). `src/browser_harness/daemon.py`: resolve WS via `/json/version` to avoid stale `DevToolsActivePort` path (PR #260) + report `cdp_disconnected` on stale CDP probe in `connection_status` (PR #254) + cleanup remote browser when daemon startup fails (PR #251). `src/browser_harness/admin.py`: companion changes for the daemon fixes. `tests/unit/test_admin.py`: 7 new tests. New domain skills: `agent-workspace/domain-skills/xiaohongshu/scraping.md` (PR #246), and a top-level `domain-skills/shopify-admin/` tree (PR #247: README, embedded-apps, knowledge-base, polaris-inputs). Note: PR #247 added skills at the top-level `domain-skills/` path, not under `agent-workspace/domain-skills/` as the post-#229 layout would suggest — vendored verbatim to match upstream layout. Doc updates: README operator framing (PR #255), install.md heredoc → `-c` flag (PR #256), profile-sync.md same. All files outside divergences — taken verbatim. Smoke test + 19 admin unit tests pass. Divergences touched: none. |
 
 ---
 

diff --git a/packages/bcode-browser/harness/README.md b/packages/bcode-browser/harness/README.md
@@ -2,9 +2,9 @@
 
 # Browser Harness ♞
 
-The simplest, thinnest, **self-healing** harness that gives LLM **complete freedom** to complete any browser task. Built directly on CDP.
+Connect an LLM directly to your real browser with a thin, editable CDP harness. For browser tasks where you need **complete freedom**.
 
-The agent writes what's missing, mid-task, inside `agent-workspace/`. No framework, no recipes, no rails. One websocket to Chrome, nothing between.
+One websocket to Chrome, nothing between. The agent writes what's missing during execution. The harness improves itself every run.
 
 ```
   ● agent: wants to upload a file

diff --git a/...ges/bcode-browser/harness/agent-workspace/domain-skills/xiaohongshu/scraping.md b/...ges/bcode-browser/harness/agent-workspace/domain-skills/xiaohongshu/scraping.md
@@ -0,0 +1,84 @@
+# Xiaohongshu — Search and Sort
+
+URL patterns:
+- Home / discovery: `https://www.xiaohongshu.com/explore`
+- Search results: `https://www.xiaohongshu.com/search_result?keyword=...`
+
+## Search flow
+
+- Prefer direct navigation to the desktop search results page over automating the home-page search box.
+- Reliable primary path: `https://www.xiaohongshu.com/search_result?keyword=<url-encoded keyword>&source=web_explore_feed`
+- This route loads the normal desktop results page and avoids home-page input flakiness.
+- The search results page can also appear with variants such as `type=51` or other `source` values after in-app navigation; do not treat those as suspicious if the rendered results are correct.
+- The top search box on `explore` can work, and searching from the home page has transitioned to `search_result` without a login wall in some sessions.
+- The page exposes duplicate search inputs in the DOM with the same placeholder `搜索小红书`.
+- The home-page search input can behave like a tightly controlled app field: direct DOM value assignment may be cleared immediately, and harness `type_text()` may fail to populate it even when the input is focused.
+- Treat the home-page input as best-effort only. Use it when a human-like interactive flow matters, but for automation default to constructing the `search_result` URL directly.
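The direct-navigation route above can be sketched as a small URL builder. This is a minimal sketch, not part of the harness: `build_search_url` is a hypothetical helper name, and the `source=web_explore_feed` default is simply the value quoted in the notes.

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://www.xiaohongshu.com/search_result"


def build_search_url(keyword: str, source: str = "web_explore_feed") -> str:
    """Return the desktop search-results URL for a keyword (hypothetical helper)."""
    # urlencode percent-encodes non-ASCII keywords as UTF-8 and spaces as '+'.
    return f"{SEARCH_BASE}?{urlencode({'keyword': keyword, 'source': source})}"
```

Navigating straight to the returned URL sidesteps the controlled home-page input entirely.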
+
+## Sort behavior
+
+- On the current desktop results layout, `最新` is **not** a top-level tab beside `综合`.
+- Open the `筛选` control in the upper-right of the results header to access sort options.
+- Inside `筛选`, `排序依据` contains:
+  - `综合`
+  - `最新`
+  - `最多点赞`
+  - `最多评论`
+  - `最多收藏`
+- The `排序依据` row can render duplicate DOM nodes for the same pill text, including non-interactive clones.
+- Raw global text search for `最新` can hit the wrong node first. Scope to the `排序依据` section and then choose the visible interactive `.tags` node.
+- Prefer semantic filtering such as `aria-hidden != "true"` or section-scoped visible `.tags` selection over style-specific checks.
+- When `最新` is active, the `筛选` trigger changes to `已筛选`.
+- The rendered feed and the `已筛选` / active-pill UI are more reliable than `window.__INITIAL_STATE__.search.searchContext.sort` for confirming latest sort.
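The section-scoped targeting described above could look roughly like the following. It assumes the `筛选` panel is already open, that the `.filters` / `.tags` class names and the `aria-hidden` check from these notes still hold, and that `startsWith` on the block's text is a fair stand-in for "first label is `排序依据`". `click_latest` and `run_js` are hypothetical names; `run_js` is any callable that evaluates JavaScript in the page and returns the result.

```python
from typing import Any, Callable

# JavaScript evaluated in the page. Selectors come from the notes above and
# are assumptions that may break when the layout changes.
CLICK_LATEST_JS = """
(() => {
  const section = Array.from(document.querySelectorAll('.filters'))
    .find(el => el.textContent.trim().startsWith('排序依据'));
  if (!section) return {clicked: false, reason: 'sort section not found'};
  const pill = Array.from(section.querySelectorAll('.tags')).find(el =>
    el.textContent.trim() === '最新' &&
    el.getAttribute('aria-hidden') !== 'true');
  if (!pill) return {clicked: false, reason: 'no visible pill'};
  pill.click();
  return {clicked: true};
})()
"""


def click_latest(run_js: Callable[[str], Any]) -> bool:
    """Run the scoped click through any JS evaluator and report success."""
    result = run_js(CLICK_LATEST_JS) or {}
    return bool(result.get("clicked"))
```

A `True` return only means `.click()` was dispatched; confirm the sort took effect by checking that the trigger now reads `已筛选`.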
+
+## Stable cues
+
+- Search channel tabs near the top: `全部`, `图文`, `视频`, `用户`
+- Sort panel labels: `筛选`, `排序依据`, `最新`
+- Filter sections also visible in the panel: `笔记类型`, `发布时间`, `搜索范围`, `位置距离`
+
+## Interaction notes
+
+- DOM `.click()` opened the `筛选` panel reliably.
+- DOM `.click()` on the visible `最新` pill inside the open `排序依据` section reliably activated latest sort.
+- The reliable DOM pattern was:
+  - find the `排序依据` section / `.filters` block
+  - search within that block for `.tags`
+  - choose the one whose text is `最新` and which is the visible interactive node
+  - call `.click()` on that visible node
+- Example selector strategy:
+  - find `.filters` whose first label is `排序依据`
+  - inside it, pick `.tags` where `textContent.trim() === "最新"` and `el.getAttribute("aria-hidden") !== "true"`
+- `getClientRects().length > 0` alone may be insufficient to distinguish the working node from a duplicate.
+- A broad `document.querySelectorAll("*")` text match for `最新` is not reliable on this page because it may click the hidden duplicate instead of the visible control.
+- Coordinate click on the visible `最新` pill also worked and remains a valid fallback if DOM targeting gets confused by future UI changes.
+- After selecting `最新`, the grid briefly showed skeleton placeholders before the refreshed results appeared.
+- The search page stores the currently rendered note cards in `window.__INITIAL_STATE__.search.feeds._value` as an array of feed entries. For ordinary note cards, the useful fields were:
+  - `id`
+  - `xsecToken`
+  - `noteCard.displayTitle`
+  - `noteCard.user.nickname`
+- The feed array can contain non-note inserts such as hot-query modules. Filter for entries with `noteCard` before treating an item as a note result.
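Filtering the feed array for real note cards can be sketched in plain Python. The sketch assumes the array from `window.__INITIAL_STATE__.search.feeds._value` has already been read out of the page as a list of dicts; `extract_notes` and its output keys are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List


def extract_notes(feeds: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only entries that carry a noteCard and pull out the useful fields."""
    notes = []
    for entry in feeds:
        card = entry.get("noteCard")
        if not card:
            continue  # skip hot-query modules and other non-note inserts
        notes.append({
            "id": entry.get("id"),
            "xsec_token": entry.get("xsecToken"),
            "title": card.get("displayTitle"),
            "author": (card.get("user") or {}).get("nickname"),
        })
    return notes
```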
+
+## Post opening
+
+- Do **not** assume a raw results link like `https://www.xiaohongshu.com/explore/<id>` is directly openable.
+- Opening that raw `/explore/<id>` URL in a fresh tab can redirect to the web `404` / app-only gate even when the same post is openable from search results.
+- To open a post from search results, click the visible card image / card in-page first.
+- That click navigation can land on a tokenized URL like `https://www.xiaohongshu.com/explore/<id>?xsec_token=...&xsec_source=pc_search`, which is a more reliable note URL than the raw `/explore/<id>` form.
+- Once the tokenized URL is obtained from the click flow, it can be revisited in-session for extraction.
+- If the search results state is already loaded, you can reconstruct the tokenized note URL directly from a feed item without re-clicking:
+  - `https://www.xiaohongshu.com/explore/<id>?xsec_token=<xsecToken>&xsec_source=pc_search`
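Reconstructing the tokenized URL from a feed item might look like this. `note_url` is a hypothetical helper; it percent-encodes the token, on the assumption that the site decodes query values in the usual way.

```python
from urllib.parse import urlencode


def note_url(note_id: str, xsec_token: str, xsec_source: str = "pc_search") -> str:
    """Build the tokenized note URL from a feed item's id and xsecToken."""
    if not note_id or not xsec_token:
        # The raw /explore/<id> form is unreliable, so refuse to build it.
        raise ValueError("both note_id and xsec_token are required")
    query = urlencode({"xsec_token": xsec_token, "xsec_source": xsec_source})
    return f"https://www.xiaohongshu.com/explore/{note_id}?{query}"
```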
+
+## Post extraction
+
+- On tokenized post pages opened via `pc_search`, `document.body.innerText` can be a useful first-pass extraction source because it often includes the rendered note text, hashtags, timestamp, engagement counts, and visible comments.
+- Verify that the note content actually rendered before trusting `document.body.innerText`, because the page can also include substantial navigation, footer, and comment noise.
+- Prefer `document.body.innerText` as a fallback or initial probe before writing fragile per-element selectors for post content.
+
+## Gotchas
+
+- Do not assume `Enter` alone finished the workflow until you verify the URL changed to `search_result` or the result grid appeared.
+- Do not assume the visible `综合` tab controls all sorting; on this layout, time ordering is hidden inside `筛选`.
+- Do not assume the first DOM node whose text is `最新` is the clickable one; this panel duplicates pills and the hidden clone can absorb naive text-based targeting without changing state.
+- Do not assume a successfully opened post can be reproduced by stripping query params; preserve the `xsec_token` when reopening results-derived post URLs.
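Two of the checks above reduce to inspecting the current URL, which can be sketched without touching the page. Both function names are hypothetical, and how the current URL is obtained is left to the caller.

```python
from urllib.parse import parse_qs, urlparse


def is_search_results(url: str) -> bool:
    """True when the URL is the search_result route with a non-empty keyword."""
    parts = urlparse(url)
    return parts.path.rstrip("/") == "/search_result" and "keyword" in parse_qs(parts.query)


def has_xsec_token(url: str) -> bool:
    """True when a note URL still carries a non-empty xsec_token parameter."""
    return bool(parse_qs(urlparse(url).query).get("xsec_token"))
```

The URL check complements, rather than replaces, confirming that the result grid actually rendered.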
diff --git a/packages/bcode-browser/harness/domain-skills/shopify-admin/README.md b/packages/bcode-browser/harness/domain-skills/shopify-admin/README.md
@@ -0,0 +1,36 @@
+# shopify-admin
+
+Browser-harness patterns for `admin.shopify.com` and embedded Shopify apps.
+
+## Files in this folder
+
+- `embedded-apps.md` — every Shopify app runs in an iframe; how to target it
+- `polaris-inputs.md` — Polaris React inputs reject synthetic value setters; use CDP type_text
+- `knowledge-base.md` — automating the Shopify Knowledge Base App for FAQ entries
+
+## When to use these
+
+You're driving Shopify admin and need to add / edit / configure something. The Shopify admin UI is large and many surfaces are embedded apps — first check whether what you need is in an embedded app (most apps under `admin.shopify.com/store/<store>/apps/<app-slug>/...` are).
+
+## When to skip
+
+- If the operation is read-only product / inventory data → use the **Storefront API** (HTTP) instead; it is much faster
+- If the store has a custom admin app with an API token provisioned → use the **Admin API** (GraphQL or REST) instead, no UI scraping
+- If you're editing theme code → use the **Shopify CLI** (`shopify theme push`) — don't touch the theme editor UI
+
+The browser is the right tool only when:
+- The setting / app exposes no API
+- The change is one-time or rare enough not to justify scripting
+- You're discovering / exploring the admin (e.g., finding selectors for a future automation)
+
+## Authentication
+
+Mike (or the human owner) must be logged into `admin.shopify.com` in the Chrome session that browser-harness attaches to. The harness does NOT log in — it inherits the human's session.
+
+If you hit `accounts.shopify.com` redirect, stop and ask the human to log in. Don't type credentials.
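The stop-and-ask rule can be enforced with a small guard on the current URL. A minimal sketch: `needs_human_login` is a hypothetical name, and obtaining the current URL is left to the caller.

```python
from urllib.parse import urlparse


def needs_human_login(current_url: str) -> bool:
    """True when the browser has been redirected to the Shopify login host."""
    host = (urlparse(current_url).hostname or "").lower()
    return host == "accounts.shopify.com"
```

When this returns `True`, the automation should halt and hand control back to the human instead of interacting with the login form.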
+
+## Polaris is in transition (Jan 2026 onward)
+
+Shopify is migrating its design system from React-based Polaris to Web-Components-based Polaris. Most legacy admin surfaces are still React. Newer surfaces (Catalog Mapping, parts of Settings) may be web components.
+
+Screenshot first, then inspect the DOM. If it contains `<s-text-field>` or `<s-button>` web component tags → use the web component pattern. If it contains `[class*="Polaris-"]` React class names → use the CDP keystrokes pattern in `polaris-inputs.md`.
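The tag-versus-class decision can be sketched as a check over a markup string. This assumes the surface's HTML has already been read out of the page (for example its `outerHTML`); `polaris_flavor` and its return labels are hypothetical, and the two patterns only cover the cues named above.

```python
import re

# Cues taken from the paragraph above; other web-component tags are not covered.
WEB_COMPONENT_TAG = re.compile(r"<s-(?:text-field|button)\b", re.IGNORECASE)
REACT_POLARIS_CLASS = re.compile(r"""class\s*=\s*["'][^"']*Polaris-""")


def polaris_flavor(html: str) -> str:
    """Classify a surface as 'web-components', 'react', or 'unknown'."""
    if WEB_COMPONENT_TAG.search(html):
        return "web-components"
    if REACT_POLARIS_CLASS.search(html):
        return "react"
    return "unknown"
```

Web components are checked first because a surface mid-migration may contain both kinds of markup.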
diff --git a/packages/bcode-browser/harness/domain-skills/shopify-admin/embedded-apps.md b/packages/bcode-browser/harness/domain-skills/shopify-admin/embedded-apps.md
@@ -0,0 +1,72 @@
+# Shopify embedded apps run in iframes
+
+Every Shopify app surfaced in the admin (first-party like Knowledge Base, third-party like Okendo) renders inside a sandboxed iframe. Your top-level `document` queries find the Shopify chrome (sidebar, header, search bar) but **none of the app's UI**.
+
+## How to target the iframe
+
+```python
+from helpers import iframe_target, js, type_text
+
+# 1. Find the iframe by URL substring
+tid = iframe_target("qa-pairs-app")  # Knowledge Base App
+
+# 2. Run JS inside the iframe by passing target_id
+result = js("""
+(() => {
+  const button = Array.from(document.querySelectorAll('button')).find(b => b.textContent.trim() === 'Add FAQ');
+  if (button) { button.click(); return {clicked: true}; }
+  return {clicked: false};
+})()
+""", target_id=tid)
+```
+
+## Finding the URL substring
+
+The iframe's URL contains the app slug. Run:
+
+```python
+from helpers import cdp  # assumed to be exported alongside iframe_target and js
+for t in cdp("Target.getTargets")["targetInfos"]:
+    if t["type"] == "iframe" and "shopify" in t.get("url", "").lower():
+        print(t["url"])
+```
+
+Then pick a substring unique to your target app.
+
+## Known Shopify app iframe slugs
+
+| App | iframe URL substring |
+|---|---|
+| Shopify Knowledge Base (qa-pairs-app) | `qa-pairs-app` |
+| Shopify Online Store editor | `online-store-web.shopifyapps.com` |
+| Shopify Hydrogen Storefront | `hydrogen-storefronts` (or similar — verify) |
+
+Add to this table when you discover new ones.
+
+## Why iframes
+
+Shopify uses App Bridge to embed third-party apps with isolation. Your top-level page CAN'T directly access app DOM for security reasons — you need iframe targeting (which the harness does via CDP `Target.attachToTarget`).
+
+## Coordinate clicks vs JS clicks
+
+Coordinate clicks (`click(x, y)`) pass through iframes at the compositor level — they work. But JS clicks scoped to the iframe target are more reliable for routine button taps because:
+
+- Element text content is stable across UI redesigns
+- DPR scaling on retina is automatic
+- React event handlers are guaranteed to fire (vs. CDP mouse events which sometimes hit a transparent layer above the button)
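For the coordinate-click path, the DPR point matters when coordinates are read off a screenshot. The sketch below assumes the screenshot is captured in device pixels and that the click helper expects CSS pixels; whether the harness already normalizes this is not stated here, so verify before relying on it. `to_css_point` is a hypothetical name.

```python
from typing import Tuple


def to_css_point(x_px: float, y_px: float, dpr: float) -> Tuple[float, float]:
    """Convert screenshot (device-pixel) coordinates to CSS pixels."""
    if dpr <= 0:
        raise ValueError("device pixel ratio must be positive")
    return (x_px / dpr, y_px / dpr)
```

On a retina display with a ratio of 2, a button seen at (800, 600) in the screenshot would be clicked at (400, 300).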
+
+## Gotcha — multiple iframes from same app
+
+The Online Store editor renders the storefront preview AND the editor toolbar in two separate iframes. Pick the right one by URL substring; don't assume the first match is correct.
+
+```python
+# WRONG — picks first match
+tid = iframe_target("online-store-web")
+
+# RIGHT — disambiguate
+tid = None  # stays None if no editor iframe is found
+for t in cdp("Target.getTargets")["targetInfos"]:
+    url = t.get("url", "")
+    if "online-store-web" in url and "editor" in url:
+        tid = t["targetId"]
+        break
+```
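One way to make the disambiguation fail loudly is a pure function over the `targetInfos` list that insists on exactly one match. This is a sketch with a hypothetical name (`pick_iframe_target`); it relies only on the `type`, `url`, and `targetId` fields used in the examples above.

```python
from typing import Any, Dict, List


def pick_iframe_target(target_infos: List[Dict[str, Any]], *needles: str) -> str:
    """Return the targetId of the single iframe whose URL contains every needle."""
    matches = [
        t for t in target_infos
        if t.get("type") == "iframe" and all(n in t.get("url", "") for n in needles)
    ]
    if len(matches) != 1:
        urls = [t.get("url", "") for t in matches]
        raise LookupError(
            f"expected exactly one iframe for {needles}, found {len(matches)}: {urls}"
        )
    return matches[0]["targetId"]
```

With the harness it would be fed the live list, for example `pick_iframe_target(cdp("Target.getTargets")["targetInfos"], "online-store-web", "editor")`.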
diff --git a/packages/bcode-browser/harness/domain-skills/shopify-admin/knowledge-base.md b/packages/bcode-browser/harness/domain-skills/shopify-admin/knowledge-base.md
@@ -0,0 +1,109 @@
+# Shopify Knowledge Base App — automating FAQ entries
+
+The Knowledge Base App (Shopify Winter '26 Edition) lets merchants control how AI agents (ChatGPT, Perplexity, Claude, Copilot, Gemini) answer questions about their brand. Each entry is a Question / Answer pair. The app currently has no public API and is English-only as of Winter '26 — browser automation is the canonical path.
+
+## URL pattern
+
+```
+https://admin.shopify.com/store/<store-handle>/apps/shopify-knowledge-base/app
+```
+
+Sub-routes:
+- `/app` — overview (FAQ list, top unanswered questions, query log)
+- `/app/new` — Add FAQ form
+- `/app/pairs/<id>` — entry detail / edit
+
+## Iframe slug
+
+The app runs in an iframe whose URL contains `qa-pairs-app`:
+
+```python
+tid = iframe_target("qa-pairs-app")
+```
+
+## Adding a single FAQ
+
+See `polaris-inputs.md` for the full canonical pattern. Quick version:
+
+```python
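+def add_faq(question, answer):
+    """Add one FAQ entry and return (ok, info), as the batching loop expects."""
+    tid = iframe_target("qa-pairs-app")
+    # focus question input via JS, type via CDP, focus answer, type, click Save
+    # poll URL for /pairs/<id> success signal
+    raise NotImplementedError("fill in the steps above from polaris-inputs.md")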
+```
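The "poll URL for `/pairs/<id>`" step can be sketched as a standalone helper. It assumes a caller-supplied `get_url` callable that returns whichever URL reflects the app route; this file does not say which URL that is or which harness call returns it. `wait_for_pair_id` is a hypothetical name.

```python
import re
import time
from typing import Callable, Optional

PAIR_ROUTE = re.compile(r"/pairs/([^/?#]+)")


def wait_for_pair_id(get_url: Callable[[], str], timeout: float = 10.0,
                     interval: float = 0.25) -> Optional[str]:
    """Poll get_url() until it contains /pairs/<id>; return the id or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        match = PAIR_ROUTE.search(get_url())
        if match:
            return match.group(1)
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A `None` result is a natural trigger for returning `(False, "save not confirmed")` from `add_faq`, so the batching loop stops.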
+
+## Batching multiple FAQs
+
+After saving an entry, the success page shows a "FAQ created. Add another FAQ" link. Click it via JS to skip navigating back to the overview:
+
+```python
+def click_add_another():
+    tid = iframe_target("qa-pairs-app")
+    js("""
+    (() => {
+      const link = Array.from(document.querySelectorAll('a, button'))
+        .find(x => x.textContent.trim() === 'Add another FAQ');
+      if (link) link.click();
+    })()
+    """, target_id=tid)
+```
+
+Loop:
+
+```python
+import time
+
+ENTRIES = [(q1, a1), (q2, a2), ...]  # placeholder: replace with real (question, answer) pairs
+for q, a in ENTRIES:
+    click_add_another()
+    time.sleep(1.5)  # wait for form to render
+    ok, info = add_faq(q, a)
+    print(f"{q[:40]} -> {ok} ({info})")
+    if not ok: break
+```
+
+## Brand voice — what to put in answers
+
+This is application-specific (depends on the merchant). For JING the rule was Aesop founder-letter tone — sentence case, no exclamation points, "JING" not "we", specific over generic.
+
+The Shopify guidance "Provide a brief answer in 1 or 2 sentences" is a soft hint. The textarea accepts longer text and AI agents prefer specific multi-sentence answers. Aim for 2-4 short sentences with concrete details.
+
+## What to put in the Knowledge Base
+
+Categories that materially shape AI agent answers about your brand:
+
+1. **Brand voice / DNA** — "What is your brand?" / "What's your tone?"
+2. **Specs** — exact materials, dimensions, weights, sizes (NOT marketing prose)
+3. **Comparisons** — "How does X compare to <competitor>?" with concrete differences
+4. **Policies** — returns, shipping, care, warranty, contact (in brand voice)
+5. **Origin** — founder, where made, why brand exists
+6. **Limitations** — what you DON'T do (V1 scope, US-only, etc.) — agents that hallucinate availability hurt conversion
+
+Skip: anything marketing-speak. The Knowledge Base is for **truth, in voice**, not pitch copy.
+
+## Top unanswered questions
+
+The overview shows up to 7 "Top unanswered questions" Shopify auto-detected from query logs. **Answer these first** — they're real shopper queries hitting your store right now. Once answered, the section empties.
+
+## Query log
+
+`/admin/apps/shopify-knowledge-base/app/queries` (or "Query log" in app sidebar) shows what shoppers actually asked AI agents about your brand. Read weekly. New patterns become new FAQ entries.
-`/admin/apps/shopify-knowledge-base/app/queries` (or "Query log" in app sidebar) shows what shoppers actually asked AI agents about your brand. Read weekly. New patterns become new FAQ entries.
+`/store/<store-handle>/apps/shopify-knowledge-base/app/queries` (or "Query log" in app sidebar) shows what shoppers actually asked AI agents about your brand. Read weekly. New patterns become new FAQ entries.
+
+## Verifying entries surface in AI
+
+After adding an entry, allow 24 hours for AI provider indexing, then test:
+
+- ChatGPT: "Tell me about <your brand>'s return policy" → check if your exact wording surfaces
+- Perplexity: same
+- Claude: "Compare <your brand> vs <competitor>" → see if your comparison framing appears
+
+If the answer doesn't surface, the entry might be too long, too vague, or contradicted by another source (your homepage, an outdated blog post). Tighten the answer.
+
+## Limits
+
+As of Winter '26 Edition:
+- English-only
+- No bulk import / CSV upload
+- No API for read or write
+- Each entry is capped at roughly 500 words (soft cap; the UI guidance says "1 or 2 sentences")
+- No version history visible to the merchant
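Because there is no bulk import, it can help to validate entries locally before driving the UI. The sketch below checks only what the limits above state; the 500-word figure is the approximate soft cap quoted here, not a documented hard limit, and `check_entry` is a hypothetical name.

```python
from typing import List


def check_entry(question: str, answer: str, max_words: int = 500) -> List[str]:
    """Return a list of problems with a (question, answer) pair; empty means OK."""
    problems = []
    if not question.strip():
        problems.append("question is empty")
    if not answer.strip():
        problems.append("answer is empty")
    words = len(answer.split())
    if words > max_words:
        problems.append(f"answer has {words} words, over the ~{max_words}-word soft cap")
    return problems
```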
+
+Watch Shopify changelogs for API exposure — likely in Spring '26 or Summer '26 Edition. When it ships, switch to API-driven population.