v1.0.0: Live monitoring + circuit breakers

## Problem

All current `agent-strace` commands are post-hoc: you analyze a session only after it finishes. But the most expensive failures are the ones you could have stopped mid-session:

- Agent retrying the same failing command 15 times
- Cost spiraling past $20 on a simple task
- Agent writing to production config files
- Infinite loop: read → edit → test → fail → read → edit → test → fail

You need real-time monitoring with the ability to kill a runaway session.

## Proposal

### `agent-strace watch`

Real-time session monitoring with configurable circuit breakers. Watches the active session's event stream and triggers alerts or kills the session when thresholds are exceeded.

**Built-in watchers:**

| Watcher | Trigger | Default |
|---------|---------|---------|
| RetryWatcher | Same command fails N times | 5 retries |
| CostWatcher | Estimated cost exceeds threshold | $10 |
| ScopeWatcher | Agent touches file outside `.agent-scope.json` policy | any violation |
| DurationWatcher | Session exceeds time limit | 30 minutes |
| LoopWatcher | Same sequence of 3+ events repeats N times | 3 repetitions |
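To make the LoopWatcher trigger concrete, here is a minimal sketch of one way the repeat check could work. The class shape, the `observe` method, and the use of plain strings as event kinds are assumptions for illustration, not the final design. It also simplifies the "3+ events" rule to a single fixed `sequence_length`.

```python
from collections import deque


class LoopWatcher:
    """Flags a session when the trailing events repeat as a fixed-length cycle."""

    def __init__(self, sequence_length=3, max_repeats=3):
        self.sequence_length = sequence_length
        self.max_repeats = max_repeats
        # Keep only as much history as one full detection window needs.
        self.history = deque(maxlen=sequence_length * max_repeats)

    def observe(self, event_kind):
        """Record one event kind; return True if the window is a pure loop."""
        self.history.append(event_kind)
        if len(self.history) < self.history.maxlen:
            return False
        window = list(self.history)
        pattern = window[: self.sequence_length]
        # The window is a loop if every chunk equals the first chunk.
        return all(
            window[i : i + self.sequence_length] == pattern
            for i in range(0, len(window), self.sequence_length)
        )
```

Because the history is a bounded deque, memory stays constant however long the session runs. A fuller version would probably also try longer sequence lengths.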

**Alert actions:**
- `terminal` — print warning to stderr
- `file` — append to `.agent-traces/alerts.log`
- `webhook` — POST JSON to a URL (Slack, PagerDuty, etc.)
- `kill` — send SIGTERM to the agent process
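The first three alert actions could be dispatched roughly as follows, using only the stdlib. This is a sketch under assumptions: the function names, the `[watch]` prefix on terminal output, and the `{"text": ...}` webhook payload are illustrative choices, not a settled interface. The `kill` action is left out here because it needs the agent PID.

```python
import json
import os
import sys
import urllib.request


def build_webhook_request(url, payload):
    """Build (but do not send) a JSON POST request for a webhook alert."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def dispatch_alert(action, message, *, stream=None,
                   log_path=".agent-traces/alerts.log", webhook_url=None):
    """Run one alert action; names and payload shape are hypothetical."""
    if action == "terminal":
        out = stream if stream is not None else sys.stderr
        print(f"[watch] {message}", file=out)
    elif action == "file":
        parent = os.path.dirname(log_path)
        if parent:
            os.makedirs(parent, exist_ok=True)
        # Append so earlier alerts in the same log are preserved.
        with open(log_path, "a", encoding="utf-8") as fh:
            fh.write(message + "\n")
    elif action == "webhook":
        request = build_webhook_request(webhook_url, {"text": message})
        with urllib.request.urlopen(request, timeout=5) as response:
            response.read()
    else:
        raise ValueError(f"unsupported alert action: {action!r}")
```

Building the request separately from sending it keeps the webhook path testable without network access.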

### CLI

```bash
# Quick start with defaults
agent-strace watch

# Custom thresholds
agent-strace watch --max-retries 3 --max-cost 5 --max-duration 10m --alert terminal

# Kill on violation
agent-strace watch --max-retries 3 --on-violation kill

# Config file
agent-strace watch --config .agent-watch.json
```

### Config file format (`.agent-watch.json`)

```json
{
  "watchers": {
    "retry": { "max": 3, "alert": "terminal" },
    "cost": { "max_dollars": 5.0, "alert": "terminal" },
    "scope": { "policy": ".agent-scope.json", "alert": "kill" },
    "duration": { "max_minutes": 15, "alert": "terminal" },
    "loop": { "sequence_length": 3, "max_repeats": 3, "alert": "file" }
  },
  "webhook": {
    "url": "https://hooks.slack.com/services/...",
    "events": ["cost", "scope"]
  }
}
```
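Loading this file could look like the sketch below, which overlays user settings on the built-in defaults from the watcher table. The function name `load_watch_config`, the default `"terminal"` alert for every watcher, and rejecting unknown watcher names are all assumptions, not decided behavior.

```python
import json

# Thresholds come from the built-in watcher table; the default alert
# action ("terminal") is an assumption made for this sketch.
DEFAULTS = {
    "retry": {"max": 5, "alert": "terminal"},
    "cost": {"max_dollars": 10.0, "alert": "terminal"},
    "scope": {"policy": ".agent-scope.json", "alert": "terminal"},
    "duration": {"max_minutes": 30, "alert": "terminal"},
    "loop": {"sequence_length": 3, "max_repeats": 3, "alert": "terminal"},
}


def load_watch_config(text):
    """Parse config JSON and fill in defaults for anything not specified."""
    raw = json.loads(text)
    user_watchers = raw.get("watchers", {})
    unknown = set(user_watchers) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown watchers: {sorted(unknown)}")
    watchers = {
        name: {**defaults, **user_watchers.get(name, {})}
        for name, defaults in DEFAULTS.items()
    }
    return {"watchers": watchers, "webhook": raw.get("webhook")}
```

With this approach a config that overrides only `retry.max` still gets the documented defaults for every other watcher.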

### How it works

1. Tail the active session's `events.ndjson` file (or hook into the event callback pipeline in `hooks.py`).
2. Pass each new event through all registered watchers.
3. Let each watcher maintain its own state (retry count, running cost, event history for loop detection).
4. When a watcher triggers, execute the configured alert action.
5. For `kill`, find the agent PID in the session metadata and send SIGTERM.
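The steps above can be sketched as a small pipeline that feeds each NDJSON line to every watcher. The event fields used here (`type`, `command`, `exit_code`) are hypothetical, since the real event schema is not specified in this issue. The `observe` method and the `on_alert` callback are likewise illustrative names.

```python
import json


class RetryWatcher:
    """Counts consecutive failures per command (event fields are assumed)."""

    def __init__(self, max_retries=5):
        self.max_retries = max_retries
        self.failures = {}

    def observe(self, event):
        """Return an alert message when the threshold is hit, else None."""
        if event.get("type") != "command":
            return None
        command = event.get("command")
        if event.get("exit_code", 0) == 0:
            # A success resets the streak for that command.
            self.failures.pop(command, None)
            return None
        self.failures[command] = self.failures.get(command, 0) + 1
        if self.failures[command] >= self.max_retries:
            return f'RetryWatcher: "{command}" failed {self.failures[command]} times'
        return None


def run_pipeline(lines, watchers, on_alert):
    """Pass every event through every watcher; forward any alert messages."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        for watcher in watchers:
            message = watcher.observe(event)
            if message is not None:
                on_alert(message)
```

Watchers only return a message, while `on_alert` decides what to do with it. That keeps detection separate from the alert actions.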

### Terminal output (live)

```
[watch] Monitoring session abc123...
[watch] +0:30  ✅ 5 events, $0.12, 0 retries
[watch] +1:05  ⚠️  RetryWatcher: "python -m pytest" failed 3 times
[watch] +2:30  ⚠️  CostWatcher: $5.12 (threshold: $5.00)
[watch] +3:00  ❌ LoopWatcher: detected loop (read→edit→test) × 4 — killing session
```

## Implementation

- New file: `src/agent_trace/watch.py`
- Event tailing: poll `events.ndjson` with `os.stat` for size changes, read new lines
- Webhook: stdlib `urllib.request` (no requests library)
- Process kill: `os.kill(pid, signal.SIGTERM)`
- New tests: `tests/test_watch.py` (~9 tests)
  - test_retry_detection
  - test_cost_threshold
  - test_scope_violation
  - test_duration_limit
  - test_loop_detection
  - test_alert_terminal
  - test_alert_webhook
  - test_config_file_loading
  - test_session_kill
- CLI: add `watch` subcommand to `cli.py`
- Depends on: `cost.py` (#3) for cost estimation, `audit.py` (#7) for scope checking
- Zero new dependencies
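The event-tailing bullet could be implemented along these lines: compare the size from `os.stat` with the last read offset and return only complete lines. The class name `NdjsonTailer` and the truncation handling are assumptions for illustration. The caller is expected to invoke `poll()` on its own timer.

```python
import os


class NdjsonTailer:
    """Polls a growing NDJSON file and yields only newly completed lines."""

    def __init__(self, path):
        self.path = path
        self.offset = 0      # bytes already consumed from the file
        self.partial = b""   # trailing bytes of a line not yet terminated

    def poll(self):
        """Return complete lines appended since the last poll."""
        try:
            size = os.stat(self.path).st_size
        except FileNotFoundError:
            return []
        if size < self.offset:
            # File was truncated or replaced; start over from the top.
            self.offset = 0
            self.partial = b""
        if size == self.offset:
            return []
        with open(self.path, "rb") as fh:
            fh.seek(self.offset)
            chunk = fh.read(size - self.offset)
        self.offset += len(chunk)
        data = self.partial + chunk
        # Everything before the last newline is complete; keep the rest.
        *complete, self.partial = data.split(b"\n")
        return [line.decode("utf-8") for line in complete if line]
```

Reading in binary mode and decoding only complete lines avoids errors when a poll lands in the middle of a line or a multi-byte character.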

## Constraint

Python stdlib only. HTTP via `urllib.request`. Process management via `os` and `signal`.
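Under this constraint, the `kill` action reduces to a few lines. The sketch below assumes a POSIX system and that the PID has already been read from the session metadata. The helper name `terminate_agent` is hypothetical.

```python
import os
import signal


def terminate_agent(pid):
    """Send SIGTERM to pid; return False if the process no longer exists."""
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        # The agent already exited, so there is nothing left to stop.
        return False
    return True
```

SIGTERM gives the agent a chance to clean up. This sketch deliberately does not escalate to SIGKILL and does not handle `PermissionError`.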
