Merged
18 changes: 16 additions & 2 deletions .github/workflows/ci.yml
@@ -43,8 +43,8 @@ jobs:

- name: Verify no node imports in dist
run: |
if grep -rE "(from|require\()\s*['\"]node:" dist/ --include='*.mjs' --include='*.cjs' --include='*.js' 2>/dev/null; then
echo "ERROR: found node: imports in dist/"
if grep -rE "(from|require\()\s*['\"]node:" dist/ --include='*.mjs' --include='*.cjs' --include='*.js' --exclude-dir=overlay 2>/dev/null; then
echo "ERROR: found node: imports in dist/ (excluding overlay)"
exit 1
fi

@@ -59,3 +59,17 @@ jobs:
if [ "$KB" -gt 500 ]; then
echo "WARNING: Bundle size exceeds 500KB target (${KB}KB)"
fi

  test-bun:
    name: Test (Bun)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - run: bun run vitest run
5 changes: 3 additions & 2 deletions AGENTS.md
@@ -19,7 +19,7 @@ Virtual bash interpreter for AI agents. Pure ECMAScript, zero runtime dependenci
- `tests/comparison/commands/` - one file per command
- `tests/comparison/jq/` - jq processor tests
- **Validation gate:** `pnpm test:all` runs unit + comparison + lint + typecheck
- **CI:** macOS + Linux + Windows
- **CI:** macOS + Linux + Windows + Bun

## Docs

@@ -29,8 +29,9 @@ Design docs for AI agents in `docs/`. Read on-demand, not required.
- [`docs/design/parser.md`](docs/design/parser.md) - Lexer, AST types, recursive descent parser
- [`docs/design/interpreter.md`](docs/design/interpreter.md) - Execution, pipes, expansion phases, control flow signals
- [`docs/design/commands.md`](docs/design/commands.md) - Registry, adding commands, custom command API
- [`docs/design/filesystem.md`](docs/design/filesystem.md) - InMemoryFs, lazy files, virtual devices, symlinks
- [`docs/design/filesystem.md`](docs/design/filesystem.md) - InMemoryFs, OverlayFs, lazy files, virtual devices, symlinks
- [`docs/design/security.md`](docs/design/security.md) - Execution limits, regex guardrails, threat model
- [`docs/design/jq.md`](docs/design/jq.md) - Generator evaluator, builtins, format strings
- [`docs/3rd-party/testing-with-smokepod.md`](docs/3rd-party/testing-with-smokepod.md) - Comparison test workflow
- [`THREAT_MODEL.md`](THREAT_MODEL.md) - Security model, protections, threat analysis, non-goals

57 changes: 54 additions & 3 deletions README.md
@@ -1,11 +1,12 @@
# @mylocalgpt/shell

Virtual bash interpreter for AI agents. Pure TypeScript, zero runtime dependencies. Runs in any JavaScript runtime - browsers, Node.js, Deno, Bun, and Cloudflare Workers. Ships with 60+ commands, a full jq implementation, and an under 40KB gzipped entry point.
Virtual bash interpreter for AI agents. Pure TypeScript, zero runtime dependencies. Runs in any JavaScript runtime - browsers, Node.js, Deno, Bun, and Cloudflare Workers. Ships with 65+ commands, a full jq implementation, and an under 40KB gzipped entry point.

- Pure JS, under 40KB gzipped, zero dependencies, runs anywhere
- 60+ commands including grep, sed, awk, find, xargs, and a full jq implementation
- 65+ commands including grep, sed, awk, find, xargs, curl, and a full jq implementation
- Pipes, redirections, variables, control flow, functions, arithmetic
- Configurable execution limits, regex guardrails, no eval
- OverlayFs: read-through overlay on real directories with change tracking

## Install

@@ -36,7 +37,8 @@ const shell = new Shell(options?: ShellOptions);

| Option | Type | Description |
|--------|------|-------------|
| `files` | `Record<string, string \| (() => string \| Promise<string>)>` | Initial filesystem contents. Values can be strings or lazy-loaded functions. |
| `fs` | `FileSystem` | Custom filesystem implementation. When provided, `files` is ignored. |
| `files` | `Record<string, string \| (() => string \| Promise<string>)>` | Initial filesystem contents. Values can be strings or lazy-loaded functions. Ignored when `fs` is provided. |
| `env` | `Record<string, string>` | Environment variables. Merged with defaults (HOME, USER, PATH, SHELL). |
| `limits` | `Partial<ExecutionLimits>` | Execution limits. Merged with safe defaults. |
| `commands` | `Record<string, CommandHandler>` | Custom commands to register. |
@@ -45,6 +47,9 @@ const shell = new Shell(options?: ShellOptions);
| `hostname` | `string` | Virtual hostname (used by `hostname` command). |
| `username` | `string` | Virtual username (used by `whoami` command). |
| `enabledCommands` | `string[]` | Restrict available commands to this allowlist. |
| `network` | `NetworkConfig` | Network handler for curl. See [Network Config](#network-config). |
| `onBeforeCommand` | `(cmd, args) => boolean \| void` | Hook before each command (return false to block). |
| `onCommandResult` | `(cmd, result) => CommandResult` | Hook after each command (can modify result). |

### shell.exec(command, options?)

@@ -174,6 +179,10 @@ Clear environment and functions, reset working directory. Filesystem is kept int
| `which` | Locate a command |
| `tee` | Duplicate stdin to file and stdout |
| `sleep` | Pause execution |
| `yes` | Repeat a string (output-capped) |
| `timeout` | Run command with time limit |
| `xxd` | Hex dump (-l, -s) |
| `curl` | HTTP requests via network handler |
| `jq` | JSON processor (full implementation) |

## jq Support
@@ -251,8 +260,50 @@ const shell = new Shell({
});
```

## OverlayFs

Read-through overlay that reads from a real host directory and writes to memory. The host filesystem is never modified.

```typescript
import { Shell } from '@mylocalgpt/shell';
import { OverlayFs } from '@mylocalgpt/shell/overlay';

const overlay = new OverlayFs('/path/to/project', {
  denyPaths: ['*.env', '*.key', 'node_modules/**'],
});
const shell = new Shell({ fs: overlay });

await shell.exec('cat src/index.ts | wc -l');
await shell.exec('echo "new file" > output.txt');

const changes = overlay.getChanges();
// { created: [{ path: '/output.txt', content: 'new file\n' }], modified: [], deleted: [] }
```

Available as a separate entry point at `@mylocalgpt/shell/overlay`. Requires Node.js (uses `node:fs`).

## Network Config

curl delegates all HTTP requests to a consumer-provided handler. The shell never makes real network requests.

```typescript
const shell = new Shell({
  network: {
    handler: async (url, opts) => {
      const res = await fetch(url, { method: opts.method, headers: opts.headers, body: opts.body });
      return { status: res.status, body: await res.text(), headers: {} };
    },
    allowlist: ['api.example.com', '*.internal.corp'],
  },
});

await shell.exec('curl -s https://api.example.com/data | jq .results');
```
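
The allowlist check might look roughly like the sketch below. It is not the library's code: `hostMatches` and `isAllowed` are hypothetical names, and the assumption is that `*` matches any run of characters within the hostname and that comparison is case-insensitive.

```typescript
// Hypothetical helper: match a hostname against one allowlist entry.
// Assumption: `*` is the only wildcard and matches any run of characters.
function hostMatches(pattern: string, host: string): boolean {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`, 'i').test(host);
}

// Hypothetical gate: only URLs whose hostname matches an entry pass.
function isAllowed(url: string, allowlist: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return false; // unparseable URLs are rejected outright
  }
  return allowlist.some((pattern) => hostMatches(pattern, host));
}
```

Anchoring the pattern at both ends matters: without it, `api.example.com.evil.com` would pass an `api.example.com` entry.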

## Security Model

See [THREAT_MODEL.md](THREAT_MODEL.md) for the full security model, threat analysis, and explicit non-goals.

**What we do:**

- All user-provided regex goes through pattern complexity checks and input-length caps
65 changes: 65 additions & 0 deletions THREAT_MODEL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# Security Model

@mylocalgpt/shell is a virtual bash interpreter designed for AI agents. The primary threat is untrusted or buggy agent-generated scripts causing resource exhaustion, ReDoS, or state corruption. Defense is architectural: no eval, no `node:` imports in core, Map-based state, and configurable execution limits.

## Protections

### Regex Guardrails

User-provided regex patterns (in grep, sed, awk, expr, find, jq) are analyzed before execution:

- **Nested quantifier detection** - identifies patterns like `(a+)+`, `(a*)*`, `(.+)+` that cause catastrophic backtracking
- **Backreference in quantified group** - catches quantified groups whose body contains a backreference (`\1`-`\9`)
- **Input caps** - patterns are limited to 1,000 characters, subjects to 100,000 characters

All detection is hand-written with no dependencies. Properly handles escaped characters and character class internals.

Validated in: `tests/security.test.ts`
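
A minimal sketch of nested quantifier detection, assuming a single left-to-right scan with a group stack. `hasNestedQuantifier` is a hypothetical name, and the project's detector may differ. The sketch only treats `+` and `*` as inner quantifiers and skips escapes and character classes.

```typescript
/** Returns true when a quantified group itself contains `+` or `*`, e.g. `(a+)+`. */
function hasNestedQuantifier(pattern: string): boolean {
  // One entry per open group: has a quantifier been seen inside it?
  const groupHasQuantifier: boolean[] = [];
  let inClass = false;

  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    if (ch === '\\') {
      i++; // skip the escaped character
      continue;
    }
    if (inClass) {
      if (ch === ']') inClass = false; // everything inside [...] is literal
      continue;
    }
    if (ch === '[') {
      inClass = true;
    } else if (ch === '(') {
      groupHasQuantifier.push(false);
    } else if (ch === ')') {
      const inner = groupHasQuantifier.pop() ?? false;
      const next = pattern[i + 1];
      const quantified = next === '+' || next === '*' || next === '{';
      if (inner && quantified) return true;
      // An inner quantifier also counts for the enclosing group.
      if (inner && groupHasQuantifier.length > 0) {
        groupHasQuantifier[groupHasQuantifier.length - 1] = true;
      }
    } else if (ch === '+' || ch === '*') {
      if (groupHasQuantifier.length > 0) {
        groupHasQuantifier[groupHasQuantifier.length - 1] = true;
      }
    }
  }
  return false;
}
```

The check is deliberately conservative: it also flags `(a+){3}`, which is bounded. That fits the heuristic framing of the guardrails.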

### Execution Limits

Seven configurable limits prevent runaway scripts. All are checked at execution points (loop iteration, function call, command dispatch). Exceeding a limit throws a descriptive error, not a silent truncation.

| Limit | Default | Prevents |
|-------|---------|----------|
| maxLoopIterations | 10,000 | Infinite loops (for, while, until) |
| maxCallDepth | 100 | Stack overflow from recursive functions |
| maxCommandCount | 10,000 | Runaway scripts executing endless commands |
| maxStringLength | 10,000,000 | Memory exhaustion from string concatenation |
| maxArraySize | 100,000 | Memory exhaustion from array growth |
| maxOutputSize | 10,000,000 | Unbounded stdout/stderr accumulation |
| maxPipelineDepth | 100 | Deeply nested pipeline structures |

Limits are per-exec call. Each `Shell.exec()` call resets counters.

Validated in: `tests/security.test.ts`
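
As an illustration of per-exec counters, the sketch below tracks two of the seven limits. `LimitTracker` and its method names are hypothetical; only the limit names and defaults come from the table above.

```typescript
// Illustrative subset of the limits table; the class itself is hypothetical.
interface ExecutionLimits {
  maxLoopIterations: number;
  maxCommandCount: number;
}

const DEFAULT_LIMITS: ExecutionLimits = {
  maxLoopIterations: 10000,
  maxCommandCount: 10000,
};

class LimitTracker {
  private readonly limits: ExecutionLimits;
  private loopIterations = 0;
  private commandCount = 0;

  constructor(overrides: Partial<ExecutionLimits> = {}) {
    // Caller-supplied values are merged over the defaults.
    this.limits = { ...DEFAULT_LIMITS, ...overrides };
  }

  /** Called at the start of each exec so counters never carry over. */
  reset(): void {
    this.loopIterations = 0;
    this.commandCount = 0;
  }

  /** Called once per loop iteration. */
  tickLoop(): void {
    this.loopIterations += 1;
    if (this.loopIterations > this.limits.maxLoopIterations) {
      throw new Error(`maxLoopIterations exceeded (${this.limits.maxLoopIterations})`);
    }
  }

  /** Called once per command dispatch. */
  tickCommand(): void {
    this.commandCount += 1;
    if (this.commandCount > this.limits.maxCommandCount) {
      throw new Error(`maxCommandCount exceeded (${this.limits.maxCommandCount})`);
    }
  }
}
```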

### Map-based Environment Variables

Environment variables are stored in a `Map<string, string>`, not a plain object. This prevents prototype pollution via keys like `__proto__`, `constructor`, or `toString`.

Validated in: `tests/security.test.ts`
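
The difference is easy to see in isolation. This snippet uses standard JavaScript only and does not touch the shell's internals:

```typescript
// Plain-object storage: inherited members look like variables that are set.
const objectEnv: Record<string, unknown> = {};
const inheritedLooksSet = objectEnv['toString'] !== undefined;

// Assigning a string to `__proto__` goes through the prototype setter,
// which ignores primitives, so the value is silently dropped.
objectEnv['__proto__'] = 'value';
const protoStoredOnObject = Object.keys(objectEnv).includes('__proto__');

// Map storage: keys are plain data with no inherited entries.
const mapEnv = new Map<string, string>();
const mapLooksSet = mapEnv.has('toString');
mapEnv.set('__proto__', 'value');
const protoStoredOnMap = mapEnv.get('__proto__');
```

With the object, an unset `toString` appears defined and a `__proto__` assignment is lost. With the Map, both behave like ordinary variable names.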

### No eval or Function

The codebase contains zero `eval()` or `new Function()` code paths. Shell script execution is done by walking the AST with a recursive descent interpreter. This eliminates code injection vectors entirely.

### Path Normalization

All filesystem paths are normalized to absolute paths with `..` segments resolved in-memory. Scripts cannot escape the virtual filesystem root. The default in-memory filesystem has no connection to the host filesystem; the opt-in OverlayFs reads from a host directory but never writes to it.
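
A sketch of this kind of normalization, assuming POSIX-style separators. `normalizePath` is a hypothetical name, not necessarily the project's function.

```typescript
/** Resolve `path` against `cwd` into an absolute path that never climbs above `/`. */
function normalizePath(path: string, cwd: string = '/'): string {
  const absolute = path.startsWith('/') ? path : `${cwd}/${path}`;
  const segments: string[] = [];
  for (const segment of absolute.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') {
      segments.pop(); // popping an empty stack is a no-op, so root is the floor
      continue;
    }
    segments.push(segment);
  }
  return '/' + segments.join('/');
}
```

Because `..` at the root is simply discarded, a path like `../../etc/passwd` stays inside the virtual tree.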

### Error Sanitization

Internal errors are caught and returned as `{ stdout, stderr, exitCode }` results. `Shell.exec()` never throws to the caller. Stack traces and internal state are not leaked in error messages.
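
One way to get this behavior is a catch-all wrapper at the exec boundary. `execSafely` is a hypothetical name, and the exit code of 1 and the `shell:` prefix are assumptions rather than documented values.

```typescript
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

/** Run `run` and convert any thrown value into a failed result. */
async function execSafely(run: () => Promise<CommandResult>): Promise<CommandResult> {
  try {
    return await run();
  } catch (err) {
    // Only the message is surfaced; the stack trace is dropped.
    const message = err instanceof Error ? err.message : String(err);
    return { stdout: '', stderr: `shell: ${message}\n`, exitCode: 1 };
  }
}
```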

## Explicit Non-Goals

- **OS-level sandboxing.** The shell executes within your JavaScript runtime's security context. It does not provide process isolation.
- **Network isolation.** Custom commands have full access to the JavaScript environment. Network restrictions are the caller's responsibility.
- **Multi-tenancy.** Each Shell instance is single-tenant. There is no isolation between exec() calls on the same instance.
- **Permission enforcement.** `chmod` stores mode bits but does not enforce them. Read/write access is unrestricted within the virtual filesystem.
- **Comprehensive ReDoS prevention.** Regex guardrails are heuristic. They catch common patterns but cannot detect all possible exponential-time regexes. The input caps provide a hard backstop.

## Recommendation

For running untrusted scripts, combine @mylocalgpt/shell with OS-level isolation (containers, V8 isolates, or similar). The shell's built-in limits protect against accidental resource exhaustion but are not a substitute for a security sandbox.
12 changes: 12 additions & 0 deletions biome.json
@@ -45,6 +45,18 @@
"semicolons": "always"
}
},
"overrides": [
{
"include": ["src/overlay/**", "tests/overlay/**"],
"linter": {
"rules": {
"nursery": {
"noRestrictedImports": "off"
}
}
}
}
],
"files": {
"ignore": ["dist/", "node_modules/", "_docs/", "scripts/", "*.tsbuildinfo"]
}
13 changes: 10 additions & 3 deletions docs/design.md
@@ -9,8 +9,9 @@ Virtual bash interpreter for AI agents. Hand-written recursive descent parser, s
| Parser | Lexer, AST, recursive descent | `src/parser/{ast,lexer,parser}.ts` | 3,200 |
| Interpreter | Execution, pipes, expansion, control flow | `src/interpreter/{interpreter,expansion,builtins}.ts` | 3,800 |
| Filesystem | In-memory virtual FS, lazy files | `src/fs/{types,memory}.ts` | 760 |
| Commands | One-file-per-command, lazy registry | `src/commands/*.ts` (61 registered) | ~8,000 |
| Commands | One-file-per-command, lazy registry | `src/commands/*.ts` (65 registered) | ~8,500 |
| Security | Execution limits, regex guardrails | `src/security/{limits,regex}.ts` | 275 |
| OverlayFs | Read-through overlay for host dirs | `src/overlay/{index,types}.ts` | ~400 |
| jq | Full jq processor, generator-based | `src/jq/*.ts` | 5,500 |
| Utils | Glob, diff, printf (hand-written) | `src/utils/{glob,diff,printf}.ts` | 1,300 |

@@ -19,7 +20,7 @@ Virtual bash interpreter for AI agents. Hand-written recursive descent parser, s
```
input string -> parse() -> AST -> execute()
-> expand words (7 phases)
-> resolve builtins (27) or commands (61)
-> resolve builtins (27) or commands (65)
-> pipe stdout as string to next command
-> CommandResult { stdout, stderr, exitCode }
```
@@ -41,13 +42,17 @@ Flat `Map<string, FileNode>` keyed by normalized paths. Lazy file content (sync
→ [design/filesystem.md](design/filesystem.md)

### Commands
One file per command, lazy-loaded on first use. Dual-track registry (definitions Map + cache Map). 61 default commands, 27 builtins. Custom commands via `ShellOptions.commands` or `defineCommand()`.
One file per command, lazy-loaded on first use. Dual-track registry (definitions Map + cache Map). 65 default commands, 27 builtins. Custom commands via `ShellOptions.commands` or `defineCommand()`.
→ [design/commands.md](design/commands.md)
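
The dual-track idea can be sketched as two Maps, one holding loaders and one holding resolved handlers. `LazyRegistry`, `Loader`, and the simplified `CommandHandler` signature here are hypothetical. In the real registry a loader would presumably wrap a dynamic `import()`.

```typescript
// Simplified, hypothetical handler shape for illustration only.
type CommandHandler = (args: string[]) => string;
type Loader = () => Promise<CommandHandler>;

class LazyRegistry {
  private readonly definitions = new Map<string, Loader>();
  private readonly cache = new Map<string, CommandHandler>();

  register(name: string, load: Loader): void {
    this.definitions.set(name, load);
    this.cache.delete(name); // re-registering invalidates a stale handler
  }

  /** Load on first use, then serve from the cache. */
  async resolve(name: string): Promise<CommandHandler | undefined> {
    const cached = this.cache.get(name);
    if (cached) return cached;
    const load = this.definitions.get(name);
    if (!load) return undefined;
    const handler = await load();
    this.cache.set(name, handler);
    return handler;
  }
}
```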

### Security
Prevents resource exhaustion and ReDoS from untrusted scripts. 7 execution limits with configurable caps. Regex guardrails detect nested quantifiers and backreferences in quantified groups before executing patterns.
→ [design/security.md](design/security.md)

### OverlayFs
Read-through filesystem that overlays a host directory. Reads from host via sync `node:fs`, writes to an in-memory Map. Host is never modified. `getChanges()` returns created/modified/deleted changeset. Separate entry point at `@mylocalgpt/shell/overlay`.
→ [design/filesystem.md](design/filesystem.md)

### jq
Independent module with generator-based evaluator. 31 AST node types, 12-level precedence parser, 80+ builtins. Separate `JqLimits` with higher defaults. Full format string support.
→ [design/jq.md](design/jq.md)
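
To illustrate why generators fit jq's zero-or-more output model, here is a toy evaluator with three filters. None of these names come from `src/jq`, and error cases (such as `.name` on a non-object) are simplified to produce no output.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type Filter = (input: Json) => Generator<Json>;

// `.[]`: emit every element of an array or every value of an object.
const iterate: Filter = function* (input) {
  if (Array.isArray(input)) yield* input;
  else if (input !== null && typeof input === 'object') yield* Object.values(input);
};

// `.name`: emit one field, or null when it is missing.
const field = (name: string): Filter =>
  function* (input) {
    if (input !== null && typeof input === 'object' && !Array.isArray(input)) {
      yield input[name] ?? null;
    }
  };

// `select(pred)`: emit the input only when the predicate holds.
const select = (pred: (value: Json) => boolean): Filter =>
  function* (input) {
    if (pred(input)) yield input;
  };

// `left | right`: feed each output of `left` into `right`.
const pipe = (left: Filter, right: Filter): Filter =>
  function* (input) {
    for (const mid of left(input)) yield* right(mid);
  };
```

`pipe(iterate, field('name'))` corresponds to `.[] | .name`. Each stage may yield zero, one, or many values without any special casing.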
@@ -61,3 +66,5 @@ Independent module with generator-based evaluator. 31 AST node types, 12-level p
- **No `node:` imports in core** - pure ECMAScript for portability. Node APIs only in test harness and build scripts.
- **Generator-based jq** - `yield*` composes multiple outputs naturally. Matches jq's semantics where filters produce zero or more values.
- **Lazy command loading** - commands imported on first use via dynamic `import()`. Reduces startup cost for scripts that use few commands.
- **Read-through OverlayFs** - overlays a host directory in memory. Uses sync `node:fs` APIs because the FileSystem interface allows `string | Promise<string>` returns and sync is simpler for a read-through layer. Host is never written to.
- **Network delegation** - curl never makes real HTTP requests. All network access is delegated to a consumer-provided handler function via `ShellOptions.network`.
17 changes: 13 additions & 4 deletions docs/design/commands.md
@@ -1,14 +1,14 @@
# Commands

One file per command, lazy-loaded on first use. 61 registered default commands + 27 shell builtins.
One file per command, lazy-loaded on first use. 65 registered default commands + 27 shell builtins.

## Files

| File | Role |
|------|------|
| `src/commands/types.ts` | Core types: Command, CommandContext, CommandResult, LazyCommandDef |
| `src/commands/registry.ts` | Dual-track registry with lazy loading and caching |
| `src/commands/defaults.ts` | 61 default command registrations |
| `src/commands/defaults.ts` | 65 default command registrations |
| `src/commands/<name>.ts` | One implementation file per command |

## Key Types
@@ -116,9 +116,18 @@ const shell = new Shell({

Custom commands participate fully in pipes, redirections, and all shell features.

## Default Commands (61)
## Default Commands (65)

awk, base64, basename, cat, chmod, column, comm, cp, cut, date, diff, dirname, du, echo, env, expand, expr, file, find, fold, grep, head, hostname, join, jq, ln, ls, md5sum, mkdir, mv, nl, od, paste, printenv, printf, pwd, readlink, realpath, rev, rm, rmdir, sed, seq, sha1sum, sha256sum, sleep, sort, stat, strings, tac, tail, tee, touch, tr, tree, unexpand, uniq, wc, which, whoami, xargs
awk, base64, basename, cat, chmod, column, comm, cp, curl, cut, date, diff, dirname, du, echo, env, expand, expr, file, find, fold, grep, head, hostname, join, jq, ln, ls, md5sum, mkdir, mv, nl, od, paste, printenv, printf, pwd, readlink, realpath, rev, rm, rmdir, sed, seq, sha1sum, sha256sum, sleep, sort, stat, strings, tac, tail, tee, timeout, touch, tr, tree, unexpand, uniq, wc, which, whoami, xargs, xxd, yes

## Commands with Non-obvious Behavior

| Command | Behavior |
|---------|----------|
| `curl` | Delegates to `ShellOptions.network.handler` callback; core stays network-free. Flags: `-X`, `-H`, `-d`, `-o`, `-O`, `-s`, `-L`, `-f`, `-w`. Hostname allowlist via glob. Exit 7 on rejection |
| `timeout` | `Promise.race` between `ctx.exec()` and `setTimeout`. Exit 124 on expiry. Duration 0 means no timeout. Virtual `sleep` returns instantly, so `timeout 5 sleep 100` completes immediately rather than timing out |
| `yes` | Output capped by `SHELL_MAX_OUTPUT` env var (default 10MB) to prevent unbounded string growth |
| `xxd` | Basic hex dump only (no `-r` reverse). `-l` limit, `-s` offset |
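
The `timeout` row can be illustrated with a small race helper. `runWithTimeout` is a hypothetical name and `run` stands in for `ctx.exec()`. Only the exit code 124 and the "0 means no timeout" rule are taken from the table.

```typescript
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

/** Race a command against a timer; exit 124 if the timer wins. */
async function runWithTimeout(
  run: () => Promise<CommandResult>,
  seconds: number,
): Promise<CommandResult> {
  if (seconds === 0) return run(); // duration 0 disables the timeout

  let timer: ReturnType<typeof setTimeout> | undefined;
  const expired = new Promise<CommandResult>((resolve) => {
    timer = setTimeout(
      () => resolve({ stdout: '', stderr: '', exitCode: 124 }),
      seconds * 1000,
    );
  });
  try {
    return await Promise.race([run(), expired]);
  } finally {
    // Clear the timer so a fast command does not keep the event loop alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Note that `Promise.race` does not cancel the losing command. It only stops waiting for it.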

## Gotchas

44 changes: 43 additions & 1 deletion docs/design/filesystem.md
@@ -1,13 +1,15 @@
# Filesystem

In-memory virtual filesystem. Flat Map storage, lazy file content, no real OS interaction.
Two filesystem implementations: InMemoryFs (default, pure ECMAScript) and OverlayFs (read-through overlay on a host directory, uses `node:fs`).

## Files

| File | Lines | Role |
|------|-------|------|
| `src/fs/types.ts` | 172 | FileSystem interface, FsError, LazyFileContent type |
| `src/fs/memory.ts` | 592 | InMemoryFs implementation |
| `src/overlay/index.ts` | ~350 | OverlayFs implementation |
| `src/overlay/types.ts` | 20 | OverlayFsOptions, ChangeSet, FileChange |

## Storage Model

@@ -91,3 +93,43 @@ All virtual devices have mode `0o666`.
- **chmod is informational only.** Mode bits are stored but never enforced. `cat` reads any file regardless of permissions. This is intentional - permission enforcement adds complexity without real security value in a virtual FS.
- **No hard links.** Only symlinks are supported.
- **Directory listing is O(n).** Scans all map keys with matching prefix. Fine for typical AI agent scripts, but not for filesystems with millions of entries.

## OverlayFs

Read-through overlay that combines a real host directory with an in-memory write layer. Available as `@mylocalgpt/shell/overlay`.

### Two-Layer Architecture

```
Read: memory Map -> host FS (read-only, via node:fs)
Write: always to memory Map
Delete: adds to deletedPaths Set, shadows host files
```

The host filesystem is never modified. All mutations stay in memory.
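
A minimal sketch of the lookup order. To stay runnable anywhere, the host layer is modeled as a read-only Map instead of `node:fs`. `TwoLayerFs` and its method names are hypothetical.

```typescript
class TwoLayerFs {
  private readonly memory = new Map<string, string>();
  private readonly deletedPaths = new Set<string>();
  // Stand-in for the host directory; the real overlay reads via node:fs.
  private readonly host: ReadonlyMap<string, string>;

  constructor(host: ReadonlyMap<string, string>) {
    this.host = host;
  }

  readFile(path: string): string {
    // A deletion marker shadows the host copy.
    if (this.deletedPaths.has(path)) throw new Error(`ENOENT: ${path}`);
    const inMemory = this.memory.get(path);
    if (inMemory !== undefined) return inMemory;
    const onHost = this.host.get(path);
    if (onHost === undefined) throw new Error(`ENOENT: ${path}`);
    return onHost;
  }

  writeFile(path: string, content: string): void {
    this.deletedPaths.delete(path); // writing revives a deleted path
    this.memory.set(path, content);
  }

  remove(path: string): void {
    this.memory.delete(path);
    this.deletedPaths.add(path);
  }
}
```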

### getChanges()

Returns a `ChangeSet` with three arrays:
- `created`: files written to memory that did not exist on host at first-write time
- `modified`: files written to memory that did exist on host at first-write time
- `deleted`: paths marked as deleted (shadowing host files)

Host existence is checked at write time (not construction time) to handle files created on host after overlay initialization.
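
The created/modified split could be tracked as follows. `ChangeTracker` is hypothetical, `hostHas` stands in for a host existence check, and the shape of `deleted` (plain path strings) is an assumption.

```typescript
interface FileChange {
  path: string;
  content: string;
}

interface ChangeSet {
  created: FileChange[];
  modified: FileChange[];
  deleted: string[]; // assumption: plain paths
}

class ChangeTracker {
  private readonly memory = new Map<string, string>();
  private readonly existedOnHost = new Map<string, boolean>();
  private readonly deletedPaths = new Set<string>();
  private readonly hostHas: (path: string) => boolean;

  constructor(hostHas: (path: string) => boolean) {
    this.hostHas = hostHas;
  }

  write(path: string, content: string): void {
    // Host existence is sampled once, at the first write to this path.
    if (!this.existedOnHost.has(path)) {
      this.existedOnHost.set(path, this.hostHas(path));
    }
    this.deletedPaths.delete(path);
    this.memory.set(path, content);
  }

  remove(path: string): void {
    this.memory.delete(path);
    // Only host-backed paths need a marker to shadow them.
    if (this.hostHas(path)) this.deletedPaths.add(path);
  }

  getChanges(): ChangeSet {
    const changes: ChangeSet = { created: [], modified: [], deleted: [...this.deletedPaths] };
    for (const [path, content] of this.memory) {
      const bucket = this.existedOnHost.get(path) ? changes.modified : changes.created;
      bucket.push({ path, content });
    }
    return changes;
  }
}
```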

### Path Filtering

`allowPaths` and `denyPaths` options use glob patterns to control which host paths are readable:
- `denyPaths`: matching paths return ENOENT even if they exist on host
- `allowPaths`: only matching paths are readable; everything else returns ENOENT
- Neither set: all paths readable
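
The decision logic can be sketched independently of glob semantics by injecting the matcher. `isHostPathReadable` and `GlobMatcher` are hypothetical names. The behavior when both options are set is not specified above, so this sketch assumes deny takes precedence.

```typescript
interface PathFilterOptions {
  allowPaths?: string[];
  denyPaths?: string[];
}

// Injected so the sketch does not commit to particular glob rules.
type GlobMatcher = (pattern: string, path: string) => boolean;

/** False means the overlay should report ENOENT for this host path. */
function isHostPathReadable(
  path: string,
  options: PathFilterOptions,
  matches: GlobMatcher,
): boolean {
  const { allowPaths, denyPaths } = options;
  // Assumption: deny is checked first, so it wins when both lists match.
  if (denyPaths?.some((pattern) => matches(pattern, path))) return false;
  if (allowPaths !== undefined) {
    return allowPaths.some((pattern) => matches(pattern, path));
  }
  return true; // neither option set: everything is readable
}
```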

### Sync node:fs APIs

OverlayFs uses `readFileSync`, `statSync`, `readdirSync` because the FileSystem interface allows sync string returns and sync is simpler for a read-through layer. This is the only part of the shipped library that imports `node:` modules.

### Security Properties

- Host writes are architecturally impossible (no `writeFileSync` calls)
- `realpath` rejects paths that resolve outside the root directory (prevents symlink escape)
- Path filtering via allowPaths/denyPaths blocks unauthorized reads
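
The containment check behind the `realpath` rule might resemble the following. `isInsideRoot` is a hypothetical helper that takes an already-resolved absolute path, so the symlink resolution itself is out of scope here.

```typescript
/** True when `resolved` is the root itself or lies beneath it. */
function isInsideRoot(root: string, resolved: string): boolean {
  const base = root.endsWith('/') ? root.slice(0, -1) : root;
  // Compare against `base + '/'` so `/srv/project-secrets` does not
  // pass as a child of `/srv/project`.
  return resolved === base || resolved.startsWith(base + '/');
}
```

A bare `startsWith(root)` would accept sibling directories that share a prefix, which is why the separator is part of the comparison.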