
feat(rules): sandbox disable detection #4

Open
MiguelHzBz wants to merge 2 commits into leogr:main from MiguelHzBz:add-sandbox-disable-rules

@MiguelHzBz

Summary

  • Adds rules/user/sandbox-disable.yaml with two rules targeting prompt injection attacks that direct agents to remove their own OS-level sandbox isolation
  • Rule A (CRITICAL/deny): blocks Write/Edit to .claude/settings.json, .codex/config.toml, or .gemini/settings.json when content disables sandbox (sandbox.enabled: false, danger-full-access, toolSandboxing: false, allowUnsandboxedCommands: true)
  • Rule B (WARNING/ask): requires confirmation before Claude Code invokes Bash with dangerouslyDisableSandbox — this is the only detection layer for this event since it is a tool call parameter invisible to Falco at the syscall level
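
As a rough illustration of the Rule A decision described above, here is a minimal Python sketch. It is not the Falco rule itself, which lives in the YAML file. The function names, the normalization step, and the substring patterns are assumptions made for illustration. Only the file paths and the disable markers come from the summary.

```python
import re

# Paths named in the summary (matched as suffixes; an assumption of this sketch).
SANDBOX_CONFIG_SUFFIXES = (
    ".claude/settings.json",
    ".codex/config.toml",
    ".gemini/settings.json",
)


def normalize(content: str) -> str:
    """Lowercase and drop whitespace and quotes so JSON and TOML spellings compare alike."""
    return re.sub(r"[\s\"']", "", content.lower())


def disables_sandbox(content: str) -> bool:
    """Return True if the content carries one of the disable markers from the summary."""
    text = normalize(content)
    if "danger-full-access" in text:
        return True
    if "toolsandboxing:false" in text:
        return True
    if "allowunsandboxedcommands:true" in text:
        return True
    # sandbox.enabled: false, written as a nested JSON object
    return "sandbox" in text and "enabled:false" in text


def rule_a_verdict(tool: str, file_path: str, content: str) -> str:
    """Hypothetical verdict helper: 'deny' or 'allow'."""
    if tool not in ("Write", "Edit"):
        return "allow"
    if not file_path.endswith(SANDBOX_CONFIG_SUFFIXES):
        return "allow"
    return "deny" if disables_sandbox(content) else "allow"
```

The sketch only denies when both the path and the content match, so an ordinary edit to a settings file that leaves the sandbox enabled still passes.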

Test plan

  • bash tests/test_sandbox_rules.sh — 15 test cases covering deny, ask, and allow paths for all three agents (Claude Code, Codex, Gemini CLI)
  • Requires Falco 0.43+ and the built plugin/interceptor. To replicate: provision an EC2 Ubuntu 22.04 instance, follow the build instructions in the README, then run the test script directly.

Adds rules/default/sandbox-disable.yaml with 5 rules covering all
known bypass techniques against the original 2-rule design:

Rule A (CRITICAL/deny): blocks Write/Edit to agent sandbox config files
  with content that disables sandbox. Hardened against:
  - Edit value-only diff: "enabled":true→false without "sandbox" in diff
    (is_sandbox_disable_value_change catches "enabled"+"false" on config files)
  - Numeric zero: "enabled":0 — JS falsy, not the string "false"
    (is_sandbox_disable_value_zero catches "enabled"+":0"/": 0")
  - settings.local.json: same schema as settings.json, not in original path list
  - Gemini "none"/"disabled" string values: valid disable values, not boolean false
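
The first three hardening cases above can be sketched in Python as follows. This is an illustration under assumptions, not the content of the is_sandbox_disable_value_change or is_sandbox_disable_value_zero macros: the regular expressions and the helper names are hypothetical, and the Gemini string values are left out because the commit message does not say which key carries them.

```python
import re

# Original path list plus the settings.local.json variant named above.
CONFIG_PATH = re.compile(
    r"(?:\.claude/settings(?:\.local)?\.json|\.gemini/settings\.json|\.codex/config\.toml)$"
)

# Value-only diff: "enabled" flipped to false, with no "sandbox" key in the diff.
ENABLED_FALSE = re.compile(r'"enabled"\s*:\s*false\b')
# Numeric zero: "enabled":0 or "enabled": 0, but not 0.5 or 01.
ENABLED_ZERO = re.compile(r'"enabled"\s*:\s*0(?![\d.])')


def is_config_path(path: str) -> bool:
    """True if the path ends in one of the agent sandbox config files."""
    return CONFIG_PATH.search(path) is not None


def edit_disables_sandbox(path: str, new_string: str) -> bool:
    """Hypothetical check for an Edit whose replacement text turns 'enabled' off."""
    if not is_config_path(path):
        return False
    return bool(ENABLED_FALSE.search(new_string) or ENABLED_ZERO.search(new_string))
```

Because the check is scoped to the config paths, matching on "enabled" alone does not fire on unrelated JSON files that happen to use the same key.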

Rule B (WARNING/ask): requires confirmation for Claude Code per-command
  dangerouslyDisableSandbox escape — the only detection layer for this event.

Rule C (CRITICAL/deny): Bash command writing sandbox-disable content to
  an agent settings file — covers echo >, tee, python3 open(), etc.
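
A minimal Python sketch of the Rule C idea follows: flag a Bash command when it both names an agent settings file and carries sandbox-disable content. The patterns and the function name are assumptions for illustration and do not reproduce the actual Falco condition.

```python
import re

# Agent settings files; same set as Rule A (an assumption of this sketch).
SETTINGS_PATH = re.compile(
    r"\.claude/settings(?:\.local)?\.json|\.gemini/settings\.json|\.codex/config\.toml"
)

# Disable content, tolerating up to four separator characters (quotes, colon,
# backslash, space) between key and value. Case-insensitive matching also
# accepts a Python-style capital-F False.
DISABLE_MARKER = re.compile(
    r"enabled\W{1,4}false"
    r"|danger-full-access"
    r"|toolSandboxing\W{1,4}false"
    r"|allowUnsandboxedCommands\W{1,4}true",
    re.IGNORECASE,
)


def bash_writes_sandbox_disable(command: str) -> bool:
    """Hypothetical check: command names a settings file and carries disable content."""
    return bool(SETTINGS_PATH.search(command) and DISABLE_MARKER.search(command))
```

This sketch does not distinguish reads from writes, so a grep for a disable marker inside a settings file would also match. A real rule would need to decide whether that trade-off is acceptable.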

Rule D (CRITICAL/deny): Codex CLI sandbox bypass flags —
  --dangerously-bypass-approvals-and-sandbox, --sandbox danger-full-access
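
The flag check in Rule D could look roughly like the Python sketch below. The two flags are the ones named above. The function name, the tokenization with shlex, and the handling of a `--sandbox=danger-full-access` spelling are assumptions of this sketch, not confirmed behavior of the rule or of the Codex CLI.

```python
import shlex


def codex_bypasses_sandbox(command: str) -> bool:
    """Hypothetical check for Codex CLI invocations that disable the sandbox."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = command.split()

    # Only consider commands that actually invoke a binary named codex.
    if "codex" not in [tok.rsplit("/", 1)[-1] for tok in tokens]:
        return False

    for i, tok in enumerate(tokens):
        if tok == "--dangerously-bypass-approvals-and-sandbox":
            return True
        if tok == "--sandbox=danger-full-access":  # assumed '=' spelling
            return True
        if tok == "--sandbox" and i + 1 < len(tokens) and tokens[i + 1] == "danger-full-access":
            return True
    return False
```

Tokenizing first avoids matching the flag text when it only appears inside an unrelated command, such as an echo of the flag name.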

Rule E (CRITICAL/deny): GEMINI_SANDBOX env var set to a disabling value —
  GEMINI_SANDBOX=none/false/disabled disables Docker isolation without any
  file write, invisible to all previous rules.
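
For Rule E, a minimal Python sketch of matching the environment variable assignment in a command string is shown below. The regular expression and the function name are assumptions for illustration. Only the variable name and the three values come from the commit message.

```python
import re

# Matches GEMINI_SANDBOX=<value> where the value is none, false, or disabled,
# optionally quoted, and followed by whitespace, a shell separator, or end of string.
GEMINI_SANDBOX_OFF = re.compile(
    r"""\bGEMINI_SANDBOX=["']?(?:none|false|disabled)["']?(?=[\s;&|]|$)"""
)


def gemini_env_disables_sandbox(command: str) -> bool:
    """Hypothetical check for a command that sets GEMINI_SANDBOX to a disabling value."""
    return GEMINI_SANDBOX_OFF.search(command) is not None
```

The trailing lookahead keeps values such as `nonexistent` from matching on their `none` prefix, and the leading word boundary keeps a differently named variable such as `MY_GEMINI_SANDBOX` from matching.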

Includes:
- test_sandbox_rules.sh: 15 tests for core rule behavior
- test_sandbox_bypass.sh: 30 tests proving bypasses are now caught
@MiguelHzBz force-pushed the add-sandbox-disable-rules branch from d61124a to 48ae1f5 on April 10, 2026 10:56
…ive bypass test

- Add 6 detection rules (A-F) closing 17 bypass surfaces:
  B1: Edit diff missing sandbox key
  B2: Numeric zero ('enabled':0)
  B3: settings.local.json path variant
  B4: Gemini 'none'/'disabled' string values
  B5: Null value ('enabled':null)
  B6: allowUnsandboxedCommands:1 numeric truthy
  B7: Bash echo/tee shell redirection
  B8: Python capital-F False
  B9: Bash null in command string
  B10: Bash enabled:0 (no 'false' keyword)
  B11: Bash allowUnsandboxedCommands:1 in command
  B12: sed -i targeting settings files
  B13: cp pre-crafted file to settings path
  B14: mv pre-crafted file to settings path
  B15: Codex underscore flag variant
  B16: GEMINI_SANDBOX=0
  B17: dangerouslyDisableSandbox:false false-positive fix

- Add test_sandbox_bypass.sh covering all 17 sections (~45 test cases)
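
The B17 false-positive fix can be illustrated with a small Python sketch of the Rule B decision: the verdict depends on the value of dangerouslyDisableSandbox, not on the mere presence of the key. The function name and the shape of the tool input (a dict with the parameter as a boolean) are assumptions for illustration, not the plugin's actual event format.

```python
def rule_b_verdict(tool_name: str, tool_input: dict) -> str:
    """Hypothetical verdict helper: 'ask' only when the escape flag is explicitly true."""
    if tool_name != "Bash":
        return "allow"
    # A bare substring match on the key would also fire on ...: false (the B17 case),
    # so inspect the parsed value instead.
    if tool_input.get("dangerouslyDisableSandbox") is True:
        return "ask"
    return "allow"
```

For example, `rule_b_verdict("Bash", {"command": "ls", "dangerouslyDisableSandbox": False})` returns "allow", while the same call with `True` returns "ask".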
