Problem
All current agent-trace commands are post-hoc. You analyze a session after it finishes. But the most expensive failures are the ones you could have stopped mid-session:
Agent retrying the same failing command 15 times
Cost spiraling past $20 on a simple task
Agent writing to production config files
Infinite loop: read → edit → test → fail → read → edit → test → fail
You need real-time monitoring with the ability to kill a runaway session.
Proposal
agent-strace watch
Real-time session monitoring with configurable circuit breakers. Watches the active session's event stream and triggers alerts or kills the session when thresholds are exceeded.
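A circuit breaker of this kind can be pictured as a small stateful object that is fed one event at a time and reports when its threshold is exceeded. The sketch below illustrates a retry breaker only; the `RetryWatcher` name and the event shape (`command` and `exit_code` keys) are assumptions made for illustration, not the actual agent-strace event schema, and "exceeded" is read here as "more than `max_retries` consecutive failures of the same command".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RetryWatcher:
    """Hypothetical breaker: trips when one command fails too many times in a row."""

    max_retries: int = 3
    _last_command: Optional[str] = field(default=None, init=False)
    _failures: int = field(default=0, init=False)

    def observe(self, event: dict) -> bool:
        """Feed one event; return True when the threshold is exceeded."""
        # Assumed event shape: {"command": str, "exit_code": int}.
        command = event.get("command")
        if command is None:
            return False  # not a command event, nothing to count
        if event.get("exit_code", 0) == 0:
            # A successful run ends the failure streak.
            self._last_command, self._failures = None, 0
            return False
        if command == self._last_command:
            self._failures += 1
        else:
            # A different failing command starts a new streak.
            self._last_command, self._failures = command, 1
        return self._failures > self.max_retries
```

With `max_retries=3`, the first three consecutive failures of the same command pass silently and the fourth trips the breaker; what happens next (warn, log, or kill) would be decided by the configured alert action.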
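Built-in watchers:
retry: limit on retries of the same failing command (max)
cost: spending limit (max_dollars)
scope: actions checked against the .agent-scope.json policy
duration: session length limit (max_minutes)
loop: repeated action sequences (sequence_length, max_repeats)
Alert actions: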
terminal: print warning to stderr
file: append to .agent-traces/alerts.log
webhook: POST JSON to a URL (Slack, PagerDuty, etc.)
kill: send SIGTERM to the agent process
CLI
Config file format (.agent-watch.json)
{
  "watchers": {
    "retry": { "max": 3, "alert": "terminal" },
    "cost": { "max_dollars": 5.0, "alert": "terminal" },
    "scope": { "policy": ".agent-scope.json", "alert": "kill" },
    "duration": { "max_minutes": 15, "alert": "terminal" },
    "loop": { "sequence_length": 3, "max_repeats": 3, "alert": "file" }
  },
  "webhook": {
    "url": "https://hooks.slack.com/services/...",
    "events": ["cost", "scope"]
  }
}
How it works
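As one illustration of how a watcher might evaluate its settings, the sketch below shows a possible reading of the loop watcher's `sequence_length` and `max_repeats` keys: keep a short history of action names and count how many times the most recent `sequence_length` actions repeat back-to-back. The `LoopWatcher` class, the `count_trailing_repeats` helper, and the idea that each event reduces to a single action string are assumptions for illustration, not part of the proposal.

```python
from collections import deque


def count_trailing_repeats(actions, sequence_length):
    """Count back-to-back repeats of the last `sequence_length` actions."""
    if sequence_length <= 0 or len(actions) < sequence_length:
        return 0
    pattern = actions[-sequence_length:]
    repeats = 0
    end = len(actions)
    # Walk backwards one window at a time while the window matches the pattern.
    while end >= sequence_length and actions[end - sequence_length:end] == pattern:
        repeats += 1
        end -= sequence_length
    return repeats


class LoopWatcher:
    """Hypothetical loop breaker driven by sequence_length and max_repeats."""

    def __init__(self, sequence_length=3, max_repeats=3):
        self.sequence_length = sequence_length
        self.max_repeats = max_repeats
        # Keep just enough history to see one repeat beyond the limit.
        self._history = deque(maxlen=sequence_length * (max_repeats + 1))

    def observe(self, action):
        """Record one action name; return True when max_repeats is exceeded."""
        self._history.append(action)
        repeats = count_trailing_repeats(list(self._history), self.sequence_length)
        return repeats > self.max_repeats
```

Bounding the history with `deque(maxlen=...)` keeps memory constant for long sessions. Whether the real watcher should trip at `max_repeats` or only once it is exceeded is a design choice this sketch does not settle.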
Watch the session's events.ndjson file (or hook into the event callback pipeline in hooks.py)
For kill: find the agent PID from session metadata and send SIGTERM
Terminal output (live)
Implementation
src/agent_trace/watch.py
Poll events.ndjson with os.stat for size changes, read new lines
Webhooks via urllib.request (no requests library)
Kill via os.kill(pid, signal.SIGTERM)
tests/test_watch.py (~9 tests)
Add watch subcommand to cli.py
Reuse cost.py (v0.4.0: Session explain + cost estimation #3) for cost estimation, audit.py (v0.7.0: Permission audit trail #7) for scope checking
Constraint
Python stdlib only. HTTP via urllib.request. Process management via os and signal.
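To show that the stdlib is sufficient for the pieces named above, here is a minimal sketch of polling events.ndjson with os.stat, preparing a webhook POST with urllib.request, and sending SIGTERM with os.kill. The function names (`read_new_events`, `build_webhook_request`, `terminate`) and the payload contents are hypothetical; the sketch assumes the file is append-only NDJSON with one JSON object per line.

```python
import json
import os
import signal
import urllib.request


def read_new_events(path, offset):
    """Return (events, new_offset) for complete lines appended since `offset`."""
    size = os.stat(path).st_size
    if size < offset:
        offset = 0  # file was truncated or replaced; start over
    if size == offset:
        return [], offset  # nothing new since the last poll
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read(size - offset)
    # Consume only up to the last newline so a half-written line is retried later.
    end = chunk.rfind(b"\n") + 1
    events = [json.loads(line) for line in chunk[:end].splitlines() if line.strip()]
    return events, offset + end


def build_webhook_request(url, payload):
    """Prepare a JSON POST; send it with urllib.request.urlopen(request, timeout=...)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def terminate(pid):
    """Ask the agent process to exit by sending SIGTERM."""
    os.kill(pid, signal.SIGTERM)
```

A watch loop would call `read_new_events` on a short interval, carry the returned offset into the next call, and pass each event to the configured watchers. Tracking a byte offset rather than re-reading the file keeps each poll proportional to the new data only.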