Queue Benchmark Results — Variance Pass + Follow-up Items

Run parameters: 3 runs per (library, scenario), per-scenario worker counts and durations, Postgres 16 on docker-compose. Host: Apple Silicon (darwin/arm64), Go 1.26.0. Libraries: River v0.35.0, rstudio/platform-lib v3.0.6. Date: 2026-04-21. Runs analyzed: 54 across 18 scenario × library pairs.

This is the third benchmark pass, combining:

Variance pass: 3 runs per cell across 7 scenarios (original 5 + steady_under / steady_balanced).
Item 1: platform-lib adapter now supports exponential backoff (ExponentialBackoffRiverLike), and rate_limit_pressure was re-run with it enabled for apples-to-apples retry-shape comparison.
Item 2: new noisy_neighbor_saturated scenario (100 Hz / 10 workers / 90% tenant skew) to force real starvation pressure.
Item 3: new high_scale scenario (300 Hz / 100 workers / 10 tenants, under-capacity) to test LISTEN/NOTIFY scaling with absolute rate.
Items 4 (crash_recovery) and 5 (rscache integration) are documented in docs/FUTURE.md as deferred with concrete design notes.

TL;DR — what we've learned

Finding	Where it matters	Evidence	Confidence
LISTEN/NOTIFY responsiveness: platform-lib is 16–63× faster on pickup p95 when not saturated	`notify_latency`, `steady_under`, `high_scale`, `noisy_neighbor`	p95 gap consistently 16–63×, tight run-to-run variance	High
Saturated throughput advantage: platform-lib completes about 24–29% more jobs in the same wall-clock time under backlog	`burst`, `noisy_neighbor_saturated` (completions are equal in `steady_over`)	Completions per second consistent across runs	Medium; may be adapter-implementation sensitive
LISTEN/NOTIFY advantage scales with rate: at 300 Hz the gap widens to 63× rather than narrowing	`high_scale`	4.5 ms vs 285.0 ms p95 at 300 Hz	High
Retry shape, not count: both libraries discard the same number of jobs; their backoff shapes differ by design	`rate_limit_pressure` (both configurations)	Identical completed and discarded counts (626 and 273); p95 differs ~1000× by retry policy	High, but the backoff implementations are not identical by construction
Natural FIFO fairness: both libraries produce fairness ratios near 1.0 (0.97–1.18 under saturated skew)	`noisy_neighbor`, `noisy_neighbor_saturated`	Per-tenant p95 near equal; FIFO by design	High
Goroutine footprint: River runs a roughly constant 19–28 more goroutines than platform-lib (1.3–2.9× as many)	All scenarios except `high_scale` and `rate_limit_pressure` with backoff	Maxes consistent across runs	High

Biggest architectural distinction: platform-lib prioritizes pickup-latency responsiveness; River prioritizes durability, leader-election resilience, and batteries-included operation. Both are correct choices for their designs.

Pickup latency under under-capacity — the headline

When workers are not saturated (service rate > enqueue rate), pickup latency is the LISTEN/NOTIFY round-trip cost. Across four distinct scenarios at different rates:

Scenario	platform-lib p95	River p95	Ratio
`notify_latency` (20 Hz, zero-work)	5.7 ms	102.7 ms	18×
`steady_under` (30 Hz, 20 workers)	6.3 ms	101.5 ms	16×
`noisy_neighbor` (50 Hz, 20 workers)	4.7 ms	92.8 ms	20×
`high_scale` (300 Hz, 100 workers)	4.5 ms	285.0 ms	63×

high_scale is the new data point. At 6–15× the enqueue rate of the other unsaturated scenarios (300 Hz vs 20–50 Hz):

platform-lib's pickup p95 actually improves (4.5 ms vs 5.7 ms in `notify_latency`); a plausible explanation is that the higher rate keeps the LISTEN/NOTIFY path warm.
River's pickup p95 degrades (285.0 ms vs 102.7 ms in `notify_latency`), likely because more jobs arrive between polling ticks, so more jobs queue up behind each tick.

The LISTEN/NOTIFY advantage scales with load. It's not an "at idle" effect; it's structural.
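To make the tick-based intuition concrete, here is a minimal sketch of the wait a job sees when dispatch happens only on a fixed tick. It is a toy model, not River's fetch loop: the 100 ms tick, the evenly spaced arrivals, the unlimited worker capacity, and the function names are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// tickWaits models dispatch that happens only on a fixed tick: a job
// arriving at time t (ms) waits until the next multiple of tickMs.
// Arrivals are evenly spaced; worker capacity is assumed unlimited.
func tickWaits(tickMs, arrivalEveryMs, durationMs int) []int {
	var waits []int
	for t := 0; t < durationMs; t += arrivalEveryMs {
		waits = append(waits, (tickMs-t%tickMs)%tickMs)
	}
	return waits
}

// percentile returns the nearest-rank percentile (pct in 1..100) of xs.
func percentile(xs []int, pct int) int {
	sorted := append([]int(nil), xs...)
	sort.Ints(sorted)
	rank := (pct*len(sorted) + 99) / 100 // integer ceil(pct*n/100)
	return sorted[rank-1]
}

func main() {
	// Hypothetical 100 ms tick, one job every 7 ms for 70 s.
	waits := tickWaits(100, 7, 70000)
	fmt.Println("p50:", percentile(waits, 50), "ms") // p50: 49 ms
	fmt.Println("p95:", percentile(waits, 95), "ms") // p95: 94 ms
}
```

In this model the median wait is about half the tick and the p95 is about 95% of it. River's `notify_latency` row (p50 51.4 ms, p95 102.7 ms) has the shape the model gives for a tick of roughly 100 ms; that is an inference from the numbers, not a statement about River's configuration. The model's waits do not depend on arrival rate, so it does not by itself reproduce the degradation at 300 Hz.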

Per-scenario detail (under-capacity)

Library	Scenario	Runs	Throughput (c/s)	Pickup p50 (ms)	Pickup p95 (ms; median, min–max)	Pickup p99 (ms)
platform-lib	`notify_latency`	3	10.0	4.0	5.7 (5.7–7.1)	7.3
platform-lib	`steady_under`	3	15.0	3.5	6.3 (5.3–7.7)	8.3
platform-lib	`noisy_neighbor`	3	24.9	2.8	4.7 (4.6–6.6)	7.2
platform-lib	`high_scale`	3	149.5	2.0	4.5 (4.3–4.7)	6.1
River	`notify_latency`	3	10.0	51.4	102.7 (102.1–103.2)	105.1
River	`steady_under`	3	15.0	37.9	101.5 (101.4–101.7)	104.8
River	`noisy_neighbor`	3	25.0	44.9	92.8 (90.7–95.3)	103.8
River	`high_scale`	3	149.9	71.1	285.0 (276.5–291.4)	315.4

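River's run-to-run spreads are within about ±3% of the median in every scenario. platform-lib's spreads are wider in relative terms (up to +40% in `noisy_neighbor`, 4.7 ms median vs 6.6 ms max) but stay under 2 ms in absolute terms.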

Saturated scenarios — queue-wait dominates, but platform-lib still pulls ahead

When enqueue rate exceeds service rate, pickup latency is dominated by queue-wait time, not library coordination. Even so, platform-lib delivers more work per unit time in two of the three scenarios below and matches River in the third.

Scenario	Lib	Throughput (c/s)	Pickup p95	Completed
`steady_over` (50 Hz / 10 workers)	platform-lib	24.9	6973 ms	1499
	River	25.0	15927 ms	1499
`burst` (50 Hz + 10× spike × 5s / 10 workers)	platform-lib	40.8	48015 ms	2574
	River	32.8	48953 ms	2081
`noisy_neighbor_saturated` (100 Hz / 10 workers / 90% skew)	platform-lib	39.2	33261 ms	2480
	River	31.0	39416 ms	1916

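In burst and noisy_neighbor_saturated, platform-lib completes about 24–29% more jobs in the same wall-clock time (2574 vs 2081 and 2480 vs 1916). In steady_over both libraries complete the same 1499 jobs, but platform-lib's pickup p95 is less than half of River's (6973 ms vs 15927 ms). The pickup p95 numbers in burst and noisy_neighbor_saturated are in the 30–50 second range for both libraries because the queue has built up tens of seconds of backlog.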

Why more completions? Under saturation, every moment a worker slot spends idle waiting for dispatch is lost completion time. The likely explanation is that platform-lib's LISTEN/NOTIFY-first dispatch keeps workers busier, while River's polling interval leaves idle gaps that compound across thousands of jobs.

Caveat: these numbers are sensitive to adapter-implementation choices. My platform-lib QueueStore uses FOR UPDATE SKIP LOCKED with a fast-path idle count — decisions a production-tuned store might make differently. A batch-pop store could produce different throughput numbers.
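For readers unfamiliar with the pattern, a claim query using FOR UPDATE SKIP LOCKED looks roughly like the sketch below. It is not the adapter's actual QueueStore: the table and column names (jobs, id, state, created_at) and the claim function are hypothetical, and the fast-path idle count is omitted. The limit parameter shows where a batch-pop store would differ from a pop-one store.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// claimSQL marks up to $1 of the oldest available jobs as running and
// returns their ids. SKIP LOCKED lets concurrent workers pass over rows
// already locked by another transaction instead of blocking on them.
// Table and column names are hypothetical.
const claimSQL = `
WITH picked AS (
    SELECT id FROM jobs
    WHERE state = 'available'
    ORDER BY created_at, id
    LIMIT $1
    FOR UPDATE SKIP LOCKED
)
UPDATE jobs SET state = 'running'
FROM picked
WHERE jobs.id = picked.id
RETURNING jobs.id`

// claim returns the ids of the jobs it claimed; an empty slice means
// nothing was available. limit=1 is pop-one, limit>1 is batch-pop.
func claim(ctx context.Context, db *sql.DB, limit int) ([]int64, error) {
	rows, err := db.QueryContext(ctx, claimSQL, limit)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var ids []int64
	for rows.Next() {
		var id int64
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, rows.Err()
}

func main() {
	// No database is opened here; wiring a *sql.DB needs a Postgres driver.
	fmt.Println(claimSQL)
}
```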

`noisy_neighbor_saturated` — fairness under starvation pressure

With one tenant enqueuing 90% of 100 Hz against 10 workers (saturated), does FIFO still produce fair latencies?

Fairness ratios (noisy p95 / quiet p95)

Library	Run 1	Run 2	Run 3
platform-lib	1.18	1.17	1.18
River	0.97	0.97	0.98

A ratio near 1.0 means no starvation. Notably, platform-lib's 1.18 means the noisy tenant has a slightly slower p95 than the quiet tenant, which is the opposite of the quiet tenant being starved. This is because noisy-tenant enqueues are clustered toward the tail of the run (they keep arriving at 90 Hz until t=30s), while quiet-tenant enqueues are sparse, so the quiet tenant's last enqueue lands earlier in the run and therefore experiences less queue wait. FIFO is temporally fair.

River's 0.97 is essentially perfect fairness; the deviation from 1.0 is about 3% and is the same in all three runs.

Neither library has per-tenant priority or fairness; both rely on FIFO. Under saturation, FIFO produces proportionate delays that roughly match enqueue-time arrival. No tenant is systematically disadvantaged.

Completion share under saturation

Library	Total completed	Noisiest tenant completed	Share
platform-lib	2480	~2236	90%
River	1916	~1727	90%

Both libraries complete work proportionate to enqueue share (the noisy tenant contributed 90% of enqueues and gets 90% of completions). This is a direct consequence of FIFO + no per-tenant worker allocation.

For true fairness enforcement you'd need to add a per-tenant worker cap or weighted fair-queuing on top of either library. Neither provides it natively.
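As a rough illustration of the per-tenant worker cap mentioned above, the sketch below gates job execution with a per-tenant concurrency counter. It is independent of both libraries: the tenantLimiter type and its methods are hypothetical, and a real integration would also need to decide what happens to a job whose tenant is at its cap (requeue, delay, or block).

```go
package main

import (
	"fmt"
	"sync"
)

// tenantLimiter caps how many jobs each tenant may run concurrently.
type tenantLimiter struct {
	mu    sync.Mutex
	limit int
	inUse map[string]int
}

func newTenantLimiter(limit int) *tenantLimiter {
	return &tenantLimiter{limit: limit, inUse: make(map[string]int)}
}

// tryAcquire reserves a slot for tenant and reports whether one was free.
func (l *tenantLimiter) tryAcquire(tenant string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inUse[tenant] >= l.limit {
		return false
	}
	l.inUse[tenant]++
	return true
}

// release frees a slot previously reserved with tryAcquire.
func (l *tenantLimiter) release(tenant string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inUse[tenant] > 0 {
		l.inUse[tenant]--
	}
}

func main() {
	lim := newTenantLimiter(2) // hypothetical cap: 2 concurrent jobs per tenant

	fmt.Println(lim.tryAcquire("noisy")) // true
	fmt.Println(lim.tryAcquire("noisy")) // true
	fmt.Println(lim.tryAcquire("noisy")) // false: the noisy tenant is at its cap
	fmt.Println(lim.tryAcquire("quiet")) // true: other tenants are unaffected

	lim.release("noisy")
	fmt.Println(lim.tryAcquire("noisy")) // true again after a release
}
```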

`rate_limit_pressure` — retry shape comparison

Both libraries configured with MaxAttempts=3. platform-lib adapter tested both without backoff (immediate re-enqueue) and with ExponentialBackoffRiverLike (matches River's 1s/16s/81s spacing).

Config	Completed	Failed events	Discarded	Pickup p50	Pickup p95
platform-lib (immediate retry)	626	2454	273	4.5 ms	15.9 ms
platform-lib (River-like backoff)	626	2457	273	4.0 ms	17017 ms
River (native backoff)	626	2457	273	621.8 ms	18693 ms

With retry shapes matched, the outcome counts are identical. What differs:

platform-lib p50 stays at ~4ms even with backoff; River's p50 is ~622ms.
platform-lib p95 is slightly lower than River's (17.0s vs 18.7s) at matched backoff, with much tighter variance (17016–17017 vs 18633–18837).

Why the p50 difference?

My platform-lib adapter implements backoff with a goroutine-based time.AfterFunc pattern — when a job fails and is scheduled for retry, the worker slot is freed immediately and the retry fires on a background goroutine. Fresh enqueues don't contend with retry-waiting jobs for worker attention.

River implements backoff via Postgres scheduled_at — the retry job sits in the DB with a future timestamp and is eligible to be claimed by any worker once that timestamp has passed. This approach is durable across process restarts (a crash loses nothing) but means fresh enqueues at t+1s are competing with retry jobs for worker attention, pushing up pickup p50.

Both are defensible designs:

Goroutine-based backoff (platform-lib adapter): lower pickup latency, but retry is lost if the process dies during backoff. Appropriate for non-durable retry semantics.
DB-based backoff (River native): slightly higher pickup latency under retry pressure, but durable across restarts. Appropriate for production systems where retry-loss is unacceptable.

Caveat: this goroutine-based backoff is an adapter choice, not platform-lib's native behavior. A production platform-lib user building retry semantics would likely choose DB-based backoff (same pattern as River) for durability. This benchmark measures my implementation, not platform-lib's inherent approach.

Resource usage

Scenario	platform-lib goroutines	River goroutines	platform-lib RSS MB	River RSS MB
`notify_latency`	14	40	18.2	18.3
`steady_under`	36	62	18.6	18.5
`noisy_neighbor`	44	72	18.8	19.2
`steady_over`	35	60	19.1	18.8
`steady_balanced`	55	74	19.3	19.1
`burst`	35	60	23.1	19.1
`noisy_neighbor_saturated`	35	60	19.0	19.1
`rate_limit_pressure` (no backoff)	32	56	19.0	19.3
`rate_limit_pressure` (backoff)	193	58	19.0	19.3
`high_scale`	182	198	35.9	24.4

Two observations:

River runs about 19–28 more goroutines than platform-lib in most scenarios (1.3–2.9× as many), likely due to River's periodic, leader-election, and metrics goroutines. The offset is roughly flat across load.
platform-lib goroutines spike to 193 under retry-with-backoff. This reflects my adapter's goroutine-per-backoff-job implementation: each scheduled retry creates a goroutine that sleeps and then re-enqueues. In rate_limit_pressure with backoff, 273 discards × 3 attempts = 819 goroutines spawned across the run. A pool-based scheduler would cap this. The high_scale count (182) presumably tracks that scenario's 100 workers; River shows a similar 198.

When would we switch to platform-lib from River?

Based on this data, the case for platform-lib is strongest when:

Low-latency pickup is a product requirement. Near-zero queue-idle latency matters: user-facing "optimistic action" paths, live-UI notifications, any scenario where a few hundred ms of queue lag is visible in UX.
High rate of small jobs. The LISTEN/NOTIFY advantage widens with rate (63× at 300 Hz vs 18× at 20 Hz in this benchmark). If your workload is many-short-jobs, platform-lib's dispatch overhead stays flat.
The architectural adjacents fit. platform-lib's unique value is the integrated cache + queue + broadcaster set. If you're building a system that wants all three unified, the ecosystem fit is better than bolting cache onto River.

The case for staying on River is strongest when:

Adapter complexity matters. The River adapter in this repo is ~300 lines and uses riverpgxv5.New(pool) for a production-tested Postgres driver plus rivermigrate.Migrate for automatic schema setup. The platform-lib adapter is ~1,080 lines because platform-lib ships the QueueStore interface but not a Postgres implementation — I had to hand-write ~440 lines of Postgres CRUD and a hand-maintained SQL schema. For a team that wants fewer moving parts, River gives you more out of the box.
Leader election is needed. During these benchmark runs, River's logs showed a live leadership.Elector subsystem (e.g., Current leader stepping down because the reelection deadline elapsed). River ships this for multi-instance coordination; platform-lib treats it as a composition concern you'd wire on top of rsnotify.
Durable retry / crash resilience matters. River's DB-based retry persists through restarts. platform-lib's retry story is "build your own" — this benchmark's adapter uses time.AfterFunc-style goroutines which would lose retries on crash. A durable platform-lib retry would need a scheduler layer on top of rsqueue.Push.
You're doing 100–1000 Hz steady-state and can tolerate 100–300 ms pickup latency. River's pickup p99 is about 105 ms at 20–50 Hz and ~315 ms at 300 Hz (the highest rate measured here), which is acceptable for most SaaS workloads.

What I am NOT claiming (and did not measure in this benchmark):

Relative quality of middleware ecosystems, CLI tooling, metrics integrations, observability hooks, or documentation. River has river-cli and rivermetrics; platform-lib has rsqueue/metrics with typed hook interfaces you implement yourself. A head-to-head feature matrix would be a separate research task.
Transactional enqueue asymmetry. I initially implied platform-lib was weaker here, but plqueue.Queue.WithDbTx(ctx, tx) supports the same pattern as River's InsertTx. Both can enqueue atomically with an application transaction.

For Keavi specifically: the LISTEN/NOTIFY advantage matters (the SSE and conversational paths care about sub-100ms responsiveness), but River's out-of-the-box adapter simplicity and leader-election subsystem also matter (Keavi has 55 workers across many job types; building a custom QueueStore + scheduler + leader coordination is work you don't have to do today). A hybrid approach — stay on River for queuing, adopt platform-lib's cache module independently — remains the recommendation from the earlier audit. See project_future_platformlib_cache.md in Keavi's memory.

What's next (in order of value)

Deferred work is documented in docs/FUTURE.md with concrete design notes. Ordered by informational value vs. effort:

Jitter in ExponentialBackoffRiverLike — matches River's ±10% jitter so retry-shape is fully apples-to-apples. ~30 min.
Multi-process scenario — two adapter processes consuming the same queue. Exercises cross-process LISTEN/NOTIFY. ~1 hour.
crash_recovery — SIGKILL the worker process mid-burst; measure duplicates and resume time. Needs parent/child process split. ~2–4 hours.
rscache + AddressedPush integration — a capability benchmark of platform-lib's unique cache-integrated queue. Not head-to-head (River has no equivalent). ~2–4 hours.
DB-backed backoff in the platform-lib adapter — mirror River's durability model so backoff-enabled comparisons are truly apples-to-apples. Requires building a scheduler layer on top of rsqueue's Push. ~1–2 hours.

Methodology + caveats

See docs/METHODOLOGY.md for the workload model and metric definitions. Raw JSONL data lives in results/ (gitignored; reproducible via scripts/run-all.sh and scripts/run-items-2-3.sh).

Things this benchmark doesn't measure:

Cross-process coordination (deferred).
Crash resilience / recovery behavior (deferred).
Real-world workload patterns (synthetic time.Sleep approximates work).
Long-running stability (30s runs — memory leaks / GC pressure over hours are invisible).
Failure modes beyond the three simulated error classes.

Things this benchmark measures well:

LISTEN/NOTIFY pickup latency, under four different pressure regimes, with tight run-to-run variance.
Throughput differences at saturation.
Retry-shape semantics (with the backoff asymmetry honestly documented).
Natural fairness under tenant skew.

License

MIT. See LICENSE.
