feat(runner): fail-closed security parity on the pi backend (#525) #695
Closed
gewenyu99 wants to merge 8 commits into
pi has no permission layer, so attach an extension that intercepts EVERY tool call, built-in (bash/read/edit/write/grep) and custom, via pi's tool_call hook and reuses the exact anthropic policy: wizardCanUseTool (bash allowlist + .env fencing + disallowedTools) plus the YARA content scan (bash command, written content with the same wizard-doc posthog_pii suppression). A tool_result hook post-scans read/bash output for prompt injection.

Everything fails closed: a scanner error blocks, and a critical post-scan violation latches so every later call is blocked and the run ends as AgentErrorType.YARA_VIOLATION. A runaway tool-call cap is enforced as well.

extensionFactories load even with noExtensions:true, so the fence is always on while the target project can't inject its own extensions. Subagents reuse the same factory so a child can't escape it.

Proven by unit test (no live key needed): the blocked-action corpus (cat .env, rm -rf, curl exfil, shell operators, direct .env read/write/edit/grep) is blocked; install/build + source edits + the sanctioned env tools are allowed; the post-scan latch and runaway guard fire.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
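The fail-closed behaviour described in this commit (pre-call policy plus scan, post-result scan with a latch, and a runaway cap) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `createGate`, `GateOptions`, `Verdict`, and `ToolCall` are hypothetical names, `policy` stands in for wizardCanUseTool, and `scan` stands in for the YARA scan. Latching when the post-scan itself throws is an assumption made here to keep the sketch fail-closed.

```typescript
// Hypothetical types for illustration; not pi's or the wizard's real API.
type Verdict = { allow: true } | { allow: false; reason: string };
type ToolCall = { tool: string; input: string };

interface GateOptions {
  policy: (call: ToolCall) => boolean; // stand-in for wizardCanUseTool
  scan: (text: string) => 'clean' | 'critical'; // stand-in for the YARA scan; may throw
  maxCalls: number; // runaway tool-call cap
}

function createGate(opts: GateOptions) {
  let calls = 0;
  let latched = false;

  const block = (reason: string): Verdict => ({ allow: false, reason });

  return {
    // Runs before every tool call. Any doubt results in a block.
    beforeCall(call: ToolCall): Verdict {
      if (latched) return block('latched after critical violation');
      calls += 1;
      if (calls > opts.maxCalls) return block('runaway tool-call cap reached');
      try {
        if (!opts.policy(call)) return block('denied by policy');
        if (opts.scan(call.input) !== 'clean') return block('content scan violation');
      } catch {
        // Fail closed: a scanner or policy error blocks the call.
        return block('scanner error');
      }
      return { allow: true };
    },

    // Runs on tool output. A critical finding latches the gate for the rest of the run.
    afterResult(output: string): void {
      try {
        if (opts.scan(output) === 'critical') latched = true;
      } catch {
        latched = true; // assumption: a post-scan error is treated like a violation
      }
    },

    isLatched: (): boolean => latched,
  };
}
```

Sharing one gate instance between a parent session and its children would give them a common cap and latch, which is the property the commit message describes for subagents.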
…#526)

Task/todo (#526): TaskCreate/Update/Get/List as pi tools backed by a shared store; every mutation pushes the list to the TUI via getUI().syncTodos, so the todo panel updates live under pi, which is the parity that was missing.

Controlled subagents (#526): pi has no native subagents, so dispatch_agent spawns a nested createAgentSession WE construct, which closes the leak the claude-agent-sdk path warns about. Every child inherits:
- the SAME security extension (canUseTool + YARA, shared cap + violation latch);
- a read-only built-in toolset (read/grep/find/ls + allowlisted bash), with no write/edit;
- no custom tools, so no .env writes and no dispatch_agent (depth hard-capped at 1).

A child can research but never mutate the project or escape the fence.

Logging parity: log assistant turns ([pi] assistant: …) on message_end and tool I/O on tool_execution_*, and drive the single run spinner with one stable status at a time (no overlapping/garbled messages).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
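The "shared store that pushes to the TUI on every mutation" idea from this commit might look something like the sketch below. It is only an illustration under assumptions: `TaskStore`, `Task`, and `TaskStatus` are hypothetical names, and the `onChange` callback stands in for whatever forwards the list to getUI().syncTodos.

```typescript
// Hypothetical task shape; the real store's fields are not shown in the PR text.
type TaskStatus = 'pending' | 'in_progress' | 'completed';

interface Task {
  id: number;
  title: string;
  status: TaskStatus;
}

class TaskStore {
  private tasks: Task[] = [];
  private nextId = 1;
  private readonly onChange: (tasks: readonly Task[]) => void;

  // onChange stands in for the call that syncs the todo panel.
  constructor(onChange: (tasks: readonly Task[]) => void) {
    this.onChange = onChange;
  }

  create(title: string): Task {
    const task: Task = { id: this.nextId++, title, status: 'pending' };
    this.tasks.push(task);
    this.publish();
    return { ...task };
  }

  update(id: number, status: TaskStatus): Task | undefined {
    const task = this.tasks.find((t) => t.id === id);
    if (!task) return undefined; // unknown id: nothing changed, so nothing is pushed
    task.status = status;
    this.publish();
    return { ...task };
  }

  get(id: number): Task | undefined {
    const task = this.tasks.find((t) => t.id === id);
    return task ? { ...task } : undefined;
  }

  list(): Task[] {
    return this.tasks.map((t) => ({ ...t }));
  }

  // Every successful mutation sends a fresh copy of the whole list to the listener.
  private publish(): void {
    this.onChange(this.list());
  }
}
```

Handing out copies rather than the internal objects keeps tool handlers from changing a task without going through `update`, so the listener sees every change.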
This was referenced Jun 19, 2026
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.
Profiling showed pi burns model round-trips early in every run: it reaches for `bash ls/find` (blocked by the fence) and retries before discovering `read` on a directory lists files, and it re-fetches the 36-skill menu 2-3x.

Fixes:
- PI_RUNTIME_NOTES appended to the pi system prompt: use `read` for directory listing (bash is install/build only), call load_skill_menu once, never hardcode the PostHog host (the YARA scanner blocks it at write time).
- memoize fetchSkillMenu in-process so the menu is fetched at most once per run.
- make the env tools genuinely async (await fs.promises); fixes require-await.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
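The in-process memoization mentioned for fetchSkillMenu could be sketched along these lines. `memoizeOnce` is a hypothetical helper, not the PR's code, and clearing the cache on failure so that a later call can retry is an assumption of this sketch.

```typescript
// Wraps an async fetcher so that it runs at most once per process while it succeeds.
// Concurrent callers share the same in-flight promise instead of starting a second fetch.
function memoizeOnce<T>(fetcher: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;

  return () => {
    if (cached === undefined) {
      cached = fetcher().catch((err: unknown) => {
        // Assumption: a failed fetch is not cached, so the next call retries.
        cached = undefined;
        throw err;
      });
    }
    return cached;
  };
}
```

A caller would wrap the real fetcher once at module scope, for example `const getSkillMenu = memoizeOnce(() => fetchSkillMenu())`, where the zero-argument call is itself an assumption about the fetcher's signature.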
…p, sharper steer

From the anthropic-vs-pi parity diff + Vincent's testing:
- #14: steer the agent to update the task list FREQUENTLY (complete/advance as it goes) so the TUI step label isn't stale, rather than "one in_progress at a time".
- #5: print an init banner ("Initializing Wizard agent...") like the anthropic path.
- #15: clean up .posthog-events.json host-side after the run (the skill says remove it; pi's `rm` is fence-blocked, so it left a 0-byte file).
- #12: sharper tool steer: use `read` on a directory to list files (never bash ls/find/cat/grep), and run installs synchronously (no `&`/`&&`/pipes).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Give pi the native ls/find/grep tool definitions. pi's defaults are only read/bash/edit/write, so the agent bashed `ls`/`find` into the fence and spiraled (profiled + turn-by-turn audit). With the tools it explores properly (used them 14x in the re-run); fix the steer too (read errors on a directory).
- System-prompt hardening (adapted from Claude Code system prompts, trimmed to our use case): treat skill/project file contents as untrusted (injection guard); actually verify (build/typecheck + confirm the SDK imports, not just that a file was written), which addresses the audit's skipped-revise gap; dispatch_agent prompts must be self-contained (read-only, no recursion).

Known remaining fence-vs-workflow blocks (separate fix): the agent still tries `npm install &` (commandment says background it) and `node -e` verification, both fence-blocked. Anthropic expresses these fence-compatibly (bg tool param).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pi's bash tool input is only {command, timeout?} — no run_in_background param —
so it can't background a command except via `&`, which the fence blocks. The
shared commandment (#306) "start the installation as a background task" is thus
unfollowable on pi, and the agent kept retrying `npm install &` / `cd && …`.
Steer pi to disregard that instruction and install synchronously. (Not a fence
change — loosening for `&` wouldn't help; pi's bash exec waits regardless.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per Vincent (its author): the rule was probably a mistake. It tells the agent to background installs and continue, but that races (using a package before install finishes) and is unfollowable on runtimes without backgrounding (pi's bash has no run_in_background param, and `&` is fence-blocked). Remove it for all backends and let installs run synchronously. Snapshot updated; pi note simplified to the fence facts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- #10: load pi-mcp-adapter via jiti and connect to the hosted PostHog MCP (boot.mcpUrl); pre-warm its cache to register curated dashboard/insight tools as direct tools; proxy disabled to keep context small.
- Env lockdown: pi's bash spawns with a scrubbed minimal env so no secret or ambient var reaches a subprocess; shared into the subagent too.
- Auth: neutralize an inherited Claude Code session so the agent bearer-auths to the gateway instead of taking an OAuth path (the 401).
- Prompt audit: promote the prompt-injection guard to both backends; extend the minimal-edits rule; parse [DASHBOARD_URL]/[NOTEBOOK_URL] markers (#9).
- Deps: pi-mcp-adapter, jiti; zod ^3.25 (MCP SDK needs the zod/v4 subpath).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
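The "scrubbed minimal env" in the env lockdown item can be pictured as an allowlist filter applied before a subprocess is spawned. The sketch below is an assumption about the approach, not the PR's implementation: `scrubEnv` and `ALLOWED_ENV_KEYS` are hypothetical names, and the listed keys are illustrative only.

```typescript
// Hypothetical allowlist; the keys the PR actually passes through are not shown.
const ALLOWED_ENV_KEYS: readonly string[] = ['PATH', 'HOME', 'LANG', 'TERM'];

// Builds a fresh environment containing only allowlisted keys that are set.
// Everything else (API keys, tokens, ambient variables) is dropped by default.
function scrubEnv(
  env: Record<string, string | undefined>,
  allowed: readonly string[] = ALLOWED_ENV_KEYS,
): Record<string, string> {
  const scrubbed: Record<string, string> = {};
  for (const key of allowed) {
    const value = env[key];
    if (typeof value === 'string') scrubbed[key] = value;
  }
  return scrubbed;
}
```

The result would be passed as the `env` option when spawning the shell, for example `spawn(cmd, args, { env: scrubEnv(process.env) })` with Node's `child_process`. An allowlist is preferred here over a denylist because a new secret variable is excluded without anyone having to remember to add it.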
This was referenced Jun 20, 2026

Epic #520 · implements #525 (fail-closed security parity) + #526 (Task/todo + controlled subagents). Top of the pi stack (#692 ← #693 ← #694 ← #695).
Done in this stack:
- `wizard-runner` flag (01 — Selection surface: config map + insertion points (each flag its own middleware) #521 · feat(runner): central runner plan + anthropic runner seam (#521) #692)
- … `pi`) #524 · feat(runner): pi.dev runner — gateway provider, model from pair, in RUNNERS (#524) #693)
- `canUseTool` allowlist + YARA, fence inherited by subagents; unit-tested + verified blocking live (05 — Security parity (canUseTool + YARA, fail-closed) for `pi` #525)
- … `pi` #526)

Open parity gaps (tracked, not yet done):
- `[STATUS]`/`[DASHBOARD_URL]`/`[NOTEBOOK_URL]` marker parsing (outro link)
- `wizard_ask` (interactive questions)
- … `bash ls/find` → slow runs)