
B3: concurrency limit with 429 backpressure#404

Open
MrTravisB wants to merge 1 commit into travis/pilo-sentry-scrubber from travis/pilo-concurrency-limit

Conversation

@MrTravisB (Collaborator)

Summary

Third PR in Stack B (server observability + capacity protection). Stacked
on #403 (B2).

Caps how many tasks can run concurrently across all SSE and WebSocket
connections. When at the limit, new requests get an immediate
AT_CAPACITY rejection (HTTP 429 with Retry-After for SSE; structured
error event for WS) instead of queuing indefinitely on the server and
tying up file descriptors and memory.

Why

Today, the server accepts unbounded concurrent tasks. If the AI provider
slows down or browser pool is saturated, requests pile up holding SSE
streams open, exhausting FDs and burning memory. A bounded queue with
clean backpressure lets the caller (tabs-api) handle overload gracefully
rather than waiting forever.

Behavior

  • New env var: PILO_MAX_CONCURRENT_TASKS (default 10). Read lazily so
    tests can override at runtime; production sets once at startup.
  • Each accepted task increments an in-process counter; release happens in
    the finally of the SSE stream callback / WS task execution block.
  • When at capacity:
    • SSE: HTTP 429 with Retry-After: 30 header and the structured
      error shape (code: "AT_CAPACITY", reason: "AT_CAPACITY",
      phase: "setup", taskId).
    • WS: an error event with the same structured payload. Connection
      stays open so the client can retry without reconnecting.
  • Slot is released on every termination path: success, error, abort.
  • Validation errors and the per-WS-connection taskRunning check fire
    before the capacity check, so misconfigured requests don't burn slots.
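The ordering above (validation first, then the capacity check, with release in `finally`) can be sketched as a framework-neutral handler. This is a hedged illustration, not the PR's actual route code: `handleTask`, `runTask`, and the `VALIDATION_ERROR` code are stand-ins, while the 429 status, `Retry-After: 30`, and the structured error fields come from the description above.

```typescript
// Stand-in for the in-process counter from concurrency.ts.
let inflight = 0;
const limit = () => Number(process.env.PILO_MAX_CONCURRENT_TASKS ?? 10);

type SseResult =
  | { status: 200; body: string }
  | { status: 400; body: { code: string; taskId: string } }
  | {
      status: 429;
      headers: { "Retry-After": "30" };
      body: { code: "AT_CAPACITY"; reason: "AT_CAPACITY"; phase: "setup"; taskId: string };
    };

async function handleTask(taskId: string, runTask: () => Promise<string>): Promise<SseResult> {
  // 1. Validation fires first, so malformed requests never consume a slot.
  if (!taskId) return { status: 400, body: { code: "VALIDATION_ERROR", taskId } };

  // 2. Capacity check: reject immediately instead of queuing indefinitely.
  if (inflight >= limit()) {
    return {
      status: 429,
      headers: { "Retry-After": "30" },
      body: { code: "AT_CAPACITY", reason: "AT_CAPACITY", phase: "setup", taskId },
    };
  }

  // 3. Slot held for the task's lifetime; released on every termination path.
  inflight += 1;
  try {
    return { status: 200, body: await runTask() };
  } finally {
    inflight -= 1;
  }
}
```

The WS path differs only in how the rejection is delivered: the same structured payload goes out as an error event and the connection stays open.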

Changes

  • packages/server/src/concurrency.ts - new module: tryAcquire,
    release, getInflight, getMaxConcurrent, plus _resetInflight for
    tests
  • packages/server/src/taskRunner.ts - add AT_CAPACITY to the
    ErrorReason enum and REASON_HINTS map
  • packages/server/src/routes/pilo.ts - acquire after validation, return
    429 with Retry-After if at capacity, release in stream finally
  • packages/server/src/routes/piloWs.ts - acquire after validation,
    send AT_CAPACITY error event if at capacity, release in finally
  • 16 new tests:
    • test/concurrency.test.ts (11): tryAcquire/release semantics, env
      var parsing, default/fallback handling
    • routes/pilo.test.ts (+2): 429 response shape, slot release after
      successful task
    • routes/piloWs.test.ts (+3): AT_CAPACITY error + no runTask call,
      slot release after success, slot release after thrown error

Out of scope (explicit)

  • OTel pilo.inflight_tasks gauge metric - mentioned in the original
    plan but deferred to Stack C2's metrics bridge, where we'll already be
    setting up the OTel metrics SDK pattern. Adding it here in isolation
    would duplicate that work.
  • /health exposure of inflight count - dropped per offline scoping
    decision (we're not changing the health endpoint in Stack B).
  • Multi-instance concurrency - the counter is in-process. Fine for
    single-instance deployment. Multi-replica would need Redis-backed
    tokens or a load-balancer-level queue depth check; track separately.

Test plan

  • pnpm --filter pilo-server run test (130 passing: 114 prior + 16 new)
  • pnpm --filter pilo-core run test (still green)
  • pnpm run typecheck green
  • pnpm run format:check green

Notes

  • Base: travis/pilo-sentry-scrubber (stacks on B2: scrub Sentry events and add per-request scope tags #403)
  • Default limit 10 matches typical browserless instance sizing; override
    via PILO_MAX_CONCURRENT_TASKS if your deployment has different headroom
  • Completes Stack B. Stack C (core observability) is next


@lmorchard left a comment:


We run multiple replicas in production for Pilo server, so I think in-process won't quite work there?

Comment on lines +13 to +15
* In-process only — fine for the current single-instance deployment. If we
* ever scale to multiple replicas, replace with a Redis-backed token bucket
* or rely on the load balancer's queue depth.

Hmm, we might need this sooner than later for Pilo server production: looks like we currently run at least 3 replicas there and a max of 30. So, this would only cap concurrency per replica?
