Source
ChatGPT architecture review feedback
Problem
No global backpressure mechanism for streaming responses or multi-agent spawning. Unbounded task spawning with concurrent streaming can cause:
- Resource exhaustion under heavy swarm workloads
- Flaky behavior when many agents stream simultaneously
- No structured cancellation on session end
Proposal
- Global semaphore (`MAX_INFLIGHT_STREAMS`) to cap concurrent LLM streaming responses
- Per-agent semaphore to limit tool execution concurrency within each agent
- Structured cancellation — when a session ends or ESC is pressed, all spawned tasks are cancelled cleanly via `CancellationToken` (tokio-util) rather than just setting an `AtomicBool`
- Backpressure on tool results — if an agent's result queue is full, slow the producer down rather than drop results
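A minimal sketch of how the global cap and structured cancellation could compose, assuming `tokio` and `tokio-util` as dependencies. The names `MAX_INFLIGHT_STREAMS` and `run_agent`, and the loop shape, are hypothetical illustrations, not the current code in `src/agent/mod.rs`:

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;
use tokio_util::sync::CancellationToken;

const MAX_INFLIGHT_STREAMS: usize = 8; // hypothetical cap

async fn run_agent(streams: Arc<Semaphore>, cancel: CancellationToken) {
    // Wait for a permit before opening an LLM stream; this is the
    // global backpressure point shared by all agents. Cancellation
    // can interrupt the wait, so a teardown never blocks on the cap.
    let _permit = tokio::select! {
        p = streams.acquire() => p.expect("semaphore closed"),
        _ = cancel.cancelled() => return, // session ended / ESC pressed
    };

    // ... stream the LLM response while holding the permit;
    // the permit is released when `_permit` drops. A second,
    // per-agent Semaphore could bound tool execution the same way.
}

#[tokio::main]
async fn main() {
    let streams = Arc::new(Semaphore::new(MAX_INFLIGHT_STREAMS));
    let root = CancellationToken::new();

    let mut tasks = Vec::new();
    for _ in 0..32 {
        // child_token(): cancelling the root cancels every child,
        // giving structured teardown instead of each task polling
        // an Arc<AtomicBool>.
        tasks.push(tokio::spawn(run_agent(
            streams.clone(),
            root.child_token(),
        )));
    }

    // On session end, one call tears down the whole task tree.
    root.cancel();
    for t in tasks {
        let _ = t.await;
    }
}
```

The key difference from the `AtomicBool` approach is that cancellation is awaitable: tasks parked inside `select!` wake immediately on `cancel()` rather than noticing a flag at their next poll site.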
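The tool-result backpressure item can be illustrated with the standard library alone: a bounded `sync_channel` makes a full queue block the producer instead of dropping results (under tokio, `tokio::sync::mpsc::channel` behaves the same way via an async `send`). The queue size and message names here are illustrative:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

fn main() {
    // Bounded queue of 2 results: once full, `send` blocks the
    // producing agent until the consumer drains, instead of
    // dropping tool results on the floor.
    let (tx, rx) = sync_channel::<String>(2);

    let producer = thread::spawn(move || {
        for i in 0..5 {
            // Blocks whenever 2 results are already queued.
            tx.send(format!("tool-result-{i}")).unwrap();
        }
        // Dropping `tx` closes the channel and ends the consumer loop.
    });

    let mut received = Vec::new();
    for msg in rx {
        received.push(msg);
    }
    producer.join().unwrap();

    // All 5 results arrive in order; none were dropped.
    assert_eq!(received.len(), 5);
    assert_eq!(received[0], "tool-result-0");
    println!("received all {} results", received.len());
}
```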
Relevant Code
`src/agent/mod.rs` — `cancelled: Arc<AtomicBool>` (current cancellation)
`src/agent/execution.rs` — tool execution spawning
`src/orchestration/parallel.rs` — parallel orchestration
Priority
P1 — stability under load