This file is a self-contained context dump for forking a new Claude Code session on this benchmark project. If you're starting fresh, read this first.
A vendor-neutral benchmark harness for Postgres-backed Go job queues. Currently compares River and rstudio/platform-lib's rsqueue. Full positioning is in README.md.
Intentional neutrality: the repo is public and framed as a generic multi-tenant-SaaS benchmark. Do not reference Keavi (the product that motivated this benchmark) or any financial-app terminology in commits, code, or docs — the patterns are generic enough to be broadly interesting.
All three phases shipped, plus a variance pass and a follow-up round. 54 benchmark runs across 18 scenario × library pairs. Final numbers live in REPORT.md at the repo root.
- Phase 1 (`9b84909`): harness bootstrap, `Queue` interface, JSONL recorder, docker-compose Postgres, skeleton CLI.
- Phase 2 (`8250b2f`, `fd189dd`): River adapter (~300 lines) + platform-lib adapter (~1,080 lines). Platform-lib required a hand-written `QueueStore` because no production Postgres impl ships in the library.
- Phase 3A (`1189fc0`): runner + workload generator + 5 scenarios + CLI.
- Phase 3B + 4 (`e1273f6`): JSONL analyzer (`bench report`), `scripts/run-all.sh`, first `REPORT.md`.
- Variance pass (`33894cf`): N=3 per cell, warmup filter, median+spread, 7 scenarios.
- Follow-up round (`c893e49`, `99fc190`): exponential backoff in platform-lib adapter, `noisy_neighbor_saturated` scenario, `high_scale` scenario.
docs/FUTURE.md has concrete plans for:
- `crash_recovery` scenario (parent/child process split for SIGKILL injection): ~2–4h.
- `rscache` + `AddressedPush` integration: ~2–4h.
- Jitter in `ExponentialBackoffRiverLike`: ~30 min.
- Multi-process scenario: ~1h.
- DB-backed backoff in platform-lib adapter (durable retry, matches River's model): ~1–2h.
- Adapter parity: both libraries expose the same `Harness` interface (`internal/queue/queue.go`). Dispatching is by string `Kind`; a single bench-wide `JobArgs`/work type carries the kind for library APIs that want compile-time typing.
- Generic job taxonomy: `document_process`, `entity_update`, `entity_enrich`, `tenant_rollup`, `tenant_snapshot`, `daily_coordinator`, `monthly_coordinator`, `notification_deliver`. No financial/product-specific names.
- Metrics format: newline-delimited JSON, one file per run, correlated by `job_id`. Schema in `internal/metrics/recorder.go`.
- Warmup: 2-second warmup window excluded from latency statistics.
- Retry: platform-lib adapter implements retry via goroutine-based `time.AfterFunc`, a deliberate choice that trades durability for responsiveness. Optional `BackoffFn` matches River's `attempt^4` backoff for apples-to-apples.
- LISTEN/NOTIFY responsiveness: platform-lib pickup p95 is 16–63× faster than River under-capacity. Gap widens with rate (63× at 300 Hz, 18× at sparse rate). Reproducible, tight run-to-run variance.
- Saturated throughput: platform-lib completes ~25–30% more jobs/sec than River under backlog. Sensitive to adapter-implementation choices; directional rather than absolute.
- Retry shape: with matched backoff, both libraries produce identical retry counts. Pickup p95 differs ~1000× due to goroutine-vs-DB scheduling — a durability-vs-responsiveness tradeoff, not a correctness gap.
- Fairness: natural FIFO fairness in both libraries; neither has per-tenant priority.
- Resource usage: River runs ~2× more goroutines (leader-election, periodic workers). RSS roughly similar.
- One hardware datapoint (Apple Silicon, Postgres 16 in docker-compose). Real deployments may differ.
- Synthetic workload: `time.Sleep` to model work, not real CPU/IO.
- Adapter implementation sensitivity: in particular, the platform-lib `QueueStore` uses `FOR UPDATE SKIP LOCKED` with a fast-path idle count; a different implementation could shift throughput numbers.
- Only two of five follow-up items implemented; three deferred.
cd /Users/jonyoder/Dev/queue-benchmark
make up # start Postgres
export QB_POSTGRES_URL='postgres://benchmark:benchmark@localhost:5433/benchmark?sslmode=disable'
go test -p 1 ./... # verify everything still passes

# Regenerate the report from existing results
./bin/bench report --results-dir=./results --out=./results/REPORT.md

# Run one scenario; --lib is river or platlib, see --help for the scenario list
./bin/bench run \
  --lib=river \
  --scenario=steady_under \
  --postgres-url="$QB_POSTGRES_URL" \
  --results-dir=./results \
  --duration=30s \
  --workers=20

# Full sweep: ~30 min wall-clock, 42 runs
DURATION=30s RUNS=3 ./scripts/run-all.sh

# Separate follow-up scenarios (items 2+3)
./scripts/run-items-2-3.sh

- Highest leverage: DB-backed backoff in the platform-lib adapter. Removes the "retry durability asymmetry" caveat and makes `rate_limit_pressure` comparable across implementation strategies, not just outcomes. `docs/FUTURE.md` has the design.
- Most novel finding potential: `crash_recovery`. Neither library has been tested under process-kill mid-run in this benchmark. Might surface real durability differences. Parent/child process split needed.
- Biggest "completeness" win: `rscache` + `AddressedPush` integration. Benchmarks platform-lib's genuinely unique architectural feature. Not head-to-head with River (River has no equivalent), but a capability benchmark.
- Easy and clarifying: add jitter to `ExponentialBackoffRiverLike`. River uses ±10%; without jitter, my backoff has artificially tight pickup-p95 variance. ~30 min.
/Users/jonyoder/Dev/queue-benchmark/
├── README.md                     # public-facing positioning
├── REPORT.md                     # current benchmark results + interpretation
├── HANDOFF.md                    # this file
├── LICENSE                       # MIT
├── Makefile                      # up/down/build/test/bench/report
├── docker-compose.yml            # ephemeral Postgres 16 on port 5433
├── go.mod                        # go 1.26, deps: pgx, river, platform-lib, uuid
├── cmd/bench/main.go             # CLI: `bench run` + `bench report`
├── internal/
│   ├── queue/
│   │   ├── queue.go              # vendor-neutral Harness interface
│   │   ├── harness.go            # Config, Simulate helper, error classes
│   │   ├── testhelp/pg.go        # pgxpool test helpers
│   │   ├── river/adapter.go      # River implementation (~300 lines)
│   │   └── platlib/              # platform-lib implementation (~1,080 lines)
│   │       ├── adapter.go        # Harness wiring + agent loop
│   │       ├── store.go          # QueueStore over pgx (~440 lines)
│   │       ├── schema.sql        # Postgres DDL
│   │       ├── constants.go      # notify type / channel constants
│   │       └── adapter_test.go   # integration tests
│   ├── workload/
│   │   ├── workload.go           # Spec, Generator, RateCurve
│   │   └── scenarios.go          # 9 scenarios: steady_*, burst, noisy_*, rate_*, notify_*, high_scale
│   ├── runner/runner.go          # Orchestration: generator → harness → recorder
│   ├── metrics/recorder.go       # JSONL recorder
│   └── analyze/
│       ├── analyze.go            # Reads JSONL, computes stats, renders Markdown
│       └── analyze_test.go       # Percentile / warmup / aggregation tests
├── docs/
│   ├── METHODOLOGY.md            # Workload model, metrics, reproducibility
│   └── FUTURE.md                 # Deferred work with design notes
├── scripts/
│   ├── run-all.sh                # Full sweep: 7 scenarios × 2 libs × 3 runs
│   └── run-items-2-3.sh          # Follow-up scenarios only
└── results/                      # .gitignored; reproducible via scripts
- Go 1.26 required (platform-lib v3 requires it). `go mod tidy` auto-downloads the toolchain.
- macOS bash 3.2 compat: scripts avoid associative arrays. Use `case` blocks inside functions for per-scenario config.
- Cross-package test parallelism: `go test ./...` runs package-level tests in parallel, but both adapters share the same Postgres DB and truncate its schema. Always run tests with `-p 1` (the Makefile does this by default in `make test`).
- Docker pitfall: `lsof -ti:PORT` returns the docker backend proxy PID too. Don't kill blindly.
This benchmark was spawned from Keavi's scalability audit to answer: "should we switch from River to platform-lib?" The Keavi-side context is in Keavi's memory at memory/project_queue_benchmark_sprint.md. The answer — per the final REPORT — is stay on River for queuing; adopt platform-lib's cache module independently when LLM-response caching becomes a cost lever.
The public-facing material is intentionally silent on that motivation: the benchmark results are broadly useful to anyone evaluating these libraries, and tying them to one product narrows the audience.