Merged
2 changes: 1 addition & 1 deletion README.md
@@ -183,7 +183,7 @@ Crust inspects tool calls at multiple layers:
2. **Layer 1 (Response Scan)**: Scans tool calls in the LLM's response before they execute — blocks new dangerous actions in real-time.
3. **Stdio Proxy** ([MCP](docs/mcp.md) / [ACP](docs/acp.md)): Wraps MCP servers or ACP agents as a stdio proxy, intercepting security-relevant JSON-RPC messages in both directions — including DLP scanning of server responses for leaked secrets.

All modes apply a [10-step evaluation pipeline](docs/how-it-works.md) — input sanitization, Unicode normalization, obfuscation detection, DLP secret scanning, path-based rules, and fallback content matching — each step in microseconds.
All modes apply a [16-step evaluation pipeline](docs/how-it-works.md) — input sanitization, Unicode normalization, obfuscation detection, DLP secret scanning, path-based rules, and fallback content matching — each step in microseconds.

All activity is logged locally to encrypted storage.

66 changes: 54 additions & 12 deletions docs/how-it-works.md
@@ -11,17 +11,23 @@ Agent Request ──▶ [Layer 0: History Scan] ──▶ LLM ──▶ [Layer 1
(14-30μs) (14-30μs)
"Bad agent detected" "Action blocked"

Layer 1 Rule Evaluation Order:
1. Sanitize tool name → strip null bytes and control chars
2. Extract paths, commands, content from tool arguments
3. Normalize Unicode → NFKC, strip invisible chars and confusables (all text fields)
4. Block null bytes in write content
5. Detect obfuscation (base64, hex, IFS) and shell evasion
6. Self-protection → block management API/socket access
7. DLP Secret Detection → blocks real API keys/tokens (hardcoded + gitleaks)
8. Path normalization → expand ~, env vars, globs, resolve symlinks
9. Operation-based Rules → path/command/host matching for known tools
10. Fallback Rules (content-only) → raw JSON matching, works for ANY tool
Layer 1 Rule Evaluation (16 steps):
1. Sanitize tool name → strip null bytes, control chars
2. Extract paths, commands, content from tool arguments
3. Normalize Unicode → NFKC, strip invisible chars and confusables
4. Block null bytes in write content
5. Detect encoding obfuscation (base64, hex)
6. Block evasive commands (fork bombs, unparseable shell)
7. Self-protection → block management API access (hardcoded)
8. Block management API via Unix socket / named pipe
9. DLP Secret Detection → block real API keys/tokens
10. Filter bare shell globs (not real paths)
11. Normalize paths → expand ~, env vars
12. Expand globs against real filesystem
13. Block /proc access (hardcoded)
14. Resolve symlinks → match both original and resolved
15. Operation-based rules → path/command/host matching
16. Fallback rules (content-only) → raw JSON matching for ANY tool
```
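
The ordering above behaves like a short-circuiting chain: each step either blocks the call or hands it to the next step. Below is a minimal Go sketch of that control flow. The `call`, `step`, `evaluate`, and `demoSteps` names are hypothetical and not Crust's actual API, and only steps 1, 5, and 13 are modeled, with plain substring checks standing in for the real detectors.

```go
package main

import (
	"fmt"
	"strings"
)

// call is a hypothetical, simplified stand-in for a tool call.
type call struct {
	Tool    string
	Command string
}

// step is a hypothetical pipeline stage. It may rewrite the call in place
// and returns a non-empty reason when the call must be blocked.
type step struct {
	name string
	run  func(c *call) string
}

// evaluate runs the steps in order and stops at the first one that blocks.
func evaluate(c call, steps []step) (blocked bool, by string) {
	for _, s := range steps {
		if reason := s.run(&c); reason != "" {
			return true, s.name + ": " + reason
		}
	}
	return false, ""
}

// demoSteps models steps 1, 5, and 13 from the list above, in that order.
func demoSteps() []step {
	return []step{
		{"sanitize tool name", func(c *call) string {
			c.Tool = strings.Map(func(r rune) rune {
				if r < 0x20 || r == 0x7f {
					return -1 // drop null bytes and control chars
				}
				return r
			}, c.Tool)
			return ""
		}},
		{"encoding obfuscation", func(c *call) string {
			if strings.Contains(c.Command, "base64 -d") {
				return "base64 decode"
			}
			return ""
		}},
		{"/proc access", func(c *call) string {
			if strings.Contains(c.Command, "/proc/") {
				return "/proc path"
			}
			return ""
		}},
	}
}

func main() {
	blocked, by := evaluate(call{Tool: "Bash", Command: "cat /proc/1/environ"}, demoSteps())
	fmt.Println(blocked, by)
	blocked, by = evaluate(call{Tool: "Bash", Command: "ls /tmp"}, demoSteps())
	fmt.Println(blocked, by)
}
```

Because the first blocking step wins, cheap checks that need no filesystem access (sanitization, pre-filter) can sit ahead of the more expensive path expansion and symlink resolution.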

**Layer 0 (Request History):** Scans tool_calls in conversation history. Catches "bad agent" patterns where malicious actions already occurred in past turns.
@@ -81,6 +87,10 @@ Layer 1 Rule Evaluation Order:
| LLM generates `cat .env` | - | ✅ Blocked | - | - |
| LLM generates `rm -rf /etc` | - | ✅ Blocked | - | - |
| `$(cat .env)` obfuscation | - | ✅ Blocked | - | - |
| `eval "cat .env"` wrapping | - | ✅ Blocked (recursive parse) | - | - |
| Fork bomb `f(){ f|f& }; f` | - | ✅ Blocked (AST) | - | - |
| `echo payload \| base64 -d \| sh` | - | ✅ Blocked (pre-filter) | - | - |
| Hex-encoded command `$'\x63\x61\x74'` | - | ✅ Blocked (pre-filter) | - | - |
| Symlink bypass | - | ✅ Blocked (composite) | - | - |
| Leaking real API keys/tokens | - | ✅ Blocked (DLP) | ✅ Blocked (DLP) | ✅ Blocked (DLP) |
| MCP client reads `.env` | - | - | ✅ Blocked (inbound) | - |
@@ -94,9 +104,41 @@ Layer 1 Rule Evaluation Order:

---

## Shell Command Analysis

The rule engine uses a hybrid interpreter+AST approach to extract paths and operations from shell commands (Bash tool calls, `sh -c` wrappers, etc.).

**Interpreter mode:** A sandboxed shell interpreter expands variables, command substitutions, and tilde/glob patterns in dry-run mode. This produces fully expanded paths — `DIR=/tmp; ls $DIR` yields `/tmp`, not `$DIR`.

**AST fallback:** When a statement contains constructs unsafe for interpretation (process substitution `<()`, background `&`, heredocs, coprocs, fd redirects), the parser falls back to AST extraction which reads literal text from the syntax tree.

**Hybrid mode:** When a script mixes safe and unsafe statements, the engine runs the interpreter on safe statements (preserving variable expansion) and uses AST fallback only for unsafe ones. Inner commands of process substitutions and coprocs are recursively interpreted when possible.

```text
DIR=/tmp; diff <(ls $DIR) <(ls $DIR/sub)

Without hybrid: diff, ls (literal — $DIR unexpanded)
With hybrid: diff, ls /tmp, ls /tmp/sub (fully expanded)
```

### Evasion Detection

The rule engine detects several evasion techniques, either in the pre-filter or at the AST level:

| Technique | Detection |
|-----------|-----------|
| **Fork bombs** | AST walk detects self-recursive `FuncDecl` (e.g., `bomb(){ bomb\|bomb& }; bomb`) |
| **Eval wrapping** | `eval` args are joined and recursively parsed as shell code (like `sh -c`) |
| **Base64 encoding** | Pre-filter regex catches `base64 -d` / `base64 --decode` patterns |
| **Hex encoding** | Pre-filter catches 3+ consecutive `\xNN` escape sequences |

The pre-filter runs before the shell parser (step 5) and catches encoding-based obfuscation where the actual command is hidden in encoded form — invisible to the parser at parse time. Other evasion techniques (fork bombs, eval) are detected at the AST level (step 6) after parsing.
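
As an illustration of the two pre-filter rows in the table, here is a minimal Go sketch using the standard `regexp` package. The patterns and the `preFilter` function are assumptions written for this example; Crust's actual expressions may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns; the real pre-filter rules may differ.
var (
	// base64 followed by -d or --decode.
	base64Decode = regexp.MustCompile(`\bbase64\s+(-d|--decode)\b`)
	// Three or more consecutive \xNN escape sequences.
	hexRun = regexp.MustCompile(`(\\x[0-9a-fA-F]{2}){3,}`)
)

// preFilter returns a short reason when the raw command text looks
// encoded, or an empty string when nothing matched.
func preFilter(cmd string) string {
	switch {
	case base64Decode.MatchString(cmd):
		return "base64 decode"
	case hexRun.MatchString(cmd):
		return "hex escapes"
	default:
		return ""
	}
}

func main() {
	for _, cmd := range []string{
		"echo payload | base64 -d | sh",
		`$'\x63\x61\x74' .env`,
		"ls -la /tmp",
	} {
		fmt.Printf("%q -> %q\n", cmd, preFilter(cmd))
	}
}
```

Both checks operate on the raw command string, which is why they can run before any parsing: the decoded payload does not exist yet, but the decoding idiom itself is visible.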

---

## DLP Secret Detection

Step 7 of the evaluation pipeline runs hardcoded DLP (Data Loss Prevention) patterns against all operations. These patterns detect real API keys and tokens by their format, regardless of file path or tool name.
Step 9 of the evaluation pipeline runs hardcoded DLP (Data Loss Prevention) patterns against all operations. These patterns detect real API keys and tokens by their format, regardless of file path or tool name.

In stdio proxy modes (MCP Gateway, ACP Wrap, Auto-detect), DLP also scans **server/agent responses** before they reach the client. This catches secrets leaked by the subprocess — for example, an MCP server returning file content that contains an AWS access key. The response is replaced with a JSON-RPC error so the secret never reaches the client.

23 changes: 4 additions & 19 deletions internal/acpwrap/convert.go
@@ -5,10 +5,9 @@ package acpwrap
import (
"encoding/json"
"fmt"
"strings"

"github.com/BakeLens/crust/internal/rules"
"mvdan.cc/sh/v3/syntax"
"github.com/BakeLens/crust/internal/shellutil"
)

// ACP parameter types
@@ -32,16 +31,6 @@ type terminalCreateParams struct {
Cwd string `json:"cwd,omitempty"`
}

// shellQuote quotes a shell argument using the shell parser's own Quote function.
// Falls back to single-quoting on error (e.g., null bytes).
func shellQuote(s string) string {
q, err := syntax.Quote(s, syntax.LangBash)
if err != nil {
return "'" + strings.ReplaceAll(s, "'", "'\"'\"'") + "'"
}
return q
}

// ACPMethodToToolCall converts an ACP JSON-RPC method + params into a rules.ToolCall.
//
// Returns:
@@ -91,13 +80,9 @@ func ACPMethodToToolCall(method string, params json.RawMessage) (*rules.ToolCall
if err := json.Unmarshal(params, &p); err != nil {
return nil, fmt.Errorf("malformed %s params: %w", method, err)
}
fullCmd := p.Command
if len(p.Args) > 0 {
quoted := make([]string, len(p.Args))
for i, a := range p.Args {
quoted[i] = shellQuote(a)
}
fullCmd += " " + strings.Join(quoted, " ")
fullCmd, err := shellutil.Command(append([]string{p.Command}, p.Args...)...)
if err != nil {
return nil, fmt.Errorf("cannot construct command: %w", err)
}
args, err := json.Marshal(map[string]string{"command": fullCmd})
if err != nil {
18 changes: 6 additions & 12 deletions internal/proxy/sse_buffer.go
@@ -6,15 +6,14 @@ import (
"errors"
"fmt"
"net/http"
"strings"
"sync"
"time"

"github.com/BakeLens/crust/internal/rules"
"github.com/BakeLens/crust/internal/security"
"github.com/BakeLens/crust/internal/shellutil"
"github.com/BakeLens/crust/internal/telemetry"
"github.com/BakeLens/crust/internal/types"
"mvdan.cc/sh/v3/syntax"
)

const blockedToolSuffix = " Please inform the user and try a different approach."
@@ -93,24 +92,19 @@ func NewBufferedSSEWriter(w http.ResponseWriter, maxSize int, timeout time.Durat
// shellToolNames lists tool names that can execute shell commands (in priority order)
var shellToolNames = []string{"Bash", "bash", "Shell", "shell", "Execute", "execute", "Exec", "exec", "RunCommand", "run_command", "Terminal", "terminal", "Cmd", "cmd"}

// shellQuote quotes a string for shell using the shell parser's own Quote function.
func shellQuote(s string) string {
q, err := syntax.Quote(s, syntax.LangBash)
if err != nil {
return "'" + strings.ReplaceAll(s, "'", "'\\''") + "'"
}
return q
}

// buildBlockedReplacement constructs the replacement command input for a blocked tool call.
func buildBlockedReplacement(toolName string, matchResult rules.MatchResult) map[string]string {
msg := fmt.Sprintf("[Crust] Tool '%s' blocked.", toolName)
if matchResult.Message != "" {
msg = fmt.Sprintf("[Crust] Tool '%s' blocked: %s.", toolName, matchResult.Message)
}
msg += blockedToolSuffix
cmd, err := shellutil.Command("echo", msg)
if err != nil {
cmd = "echo '[Crust] Tool blocked.'"
}
return map[string]string{
"command": "echo " + shellQuote(msg),
"command": cmd,
"description": "Security: blocked tool call",
}
}
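
Both removed `shellQuote` helpers built a command string by quoting arguments, and both call sites now delegate to `shellutil.Command`, whose implementation is not shown in this diff. The sketch below is a stdlib-only guess at what such a helper could look like: `joinQuoted` and `quote` are hypothetical names, and the choice to single-quote every argument and reject null bytes is an assumption, not the library's confirmed behavior.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// quote wraps s in single quotes, closing and reopening the quotes
// around each embedded single quote ('\'').
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// joinQuoted is a hypothetical stand-in for a command builder: it quotes
// every argument and joins them with spaces, or returns an error when an
// argument cannot be represented safely.
func joinQuoted(args ...string) (string, error) {
	if len(args) == 0 {
		return "", errors.New("empty command")
	}
	quoted := make([]string, len(args))
	for i, a := range args {
		if strings.ContainsRune(a, 0) {
			return "", errors.New("null byte in argument")
		}
		quoted[i] = quote(a)
	}
	return strings.Join(quoted, " "), nil
}

func main() {
	cmd, err := joinQuoted("echo", "[Crust] Tool 'Bash' blocked.")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cmd)
}
```

Returning an error instead of silently falling back is what lets callers such as `buildBlockedReplacement` choose their own safe default.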
2 changes: 1 addition & 1 deletion internal/rules/engine.go
@@ -365,7 +365,7 @@ func (e *Engine) Evaluate(call ToolCall) MatchResult {
}
}

// Step 5: PreFilter — detect obfuscation (base64, hex, IFS, curl/nc).
// Step 5: PreFilter — detect obfuscation (base64, hex encoding).
if info.Command != "" {
if match := e.preFilter.Check(info.Command); match != nil {
return MatchResult{
2 changes: 1 addition & 1 deletion internal/rules/evasive_check_test.go
@@ -37,7 +37,7 @@ func TestEvasiveMarkingOnSecondaryFields(t *testing.T) {
name: "text starting with shell keyword if",
toolName: "helper",
args: map[string]any{"script": "if you want to delete, use rm"},
wantEvasive: true, // incomplete if/then block fails parser (expected)
wantEvasive: false, // unparseable commands are NOT evasive; OS sandboxing is the enforcement layer
},
{
name: "safe command + text starting with if keyword",