BOBS write routing, streaming POST, and per-pod worker concurrency #192

Merged
jameshawkes merged 2 commits into upstream from majh/test-worker-fast-stress-stream
May 15, 2026

@jameshawkes jameshawkes commented May 13, 2026

Summary

Three commits, all driven by the in-cluster BOBS user-download benchmark and the polytope-config tier-2 stress harness.

feat: add BOBS write routing and stress streaming (07e515d)

  • Workers route to the right BOBS replica using the structured request-id.
  • Streaming write path through BITS so test-worker output reaches BOBS without buffering the whole response in memory.
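
The idea of forwarding output without holding the whole response in memory can be sketched with the standard library alone. This is a minimal, synchronous illustration, not the code in this PR: the real path is an async HTTP request, and `stream_to_sink` and its parameters are hypothetical names chosen for the example. Any `std::io::Write` stands in for the request body.

```rust
use std::io::{self, Write};

/// Forward chunks to `sink` as they are produced and return the number of
/// bytes written. Only one chunk is held at a time; the full payload is never
/// assembled in memory. (Hypothetical helper, for illustration only.)
pub fn stream_to_sink<W, I, C>(sink: &mut W, chunks: I) -> io::Result<u64>
where
    W: Write,
    I: IntoIterator<Item = C>,
    C: AsRef<[u8]>,
{
    let mut written = 0u64;
    for chunk in chunks {
        let bytes = chunk.as_ref();
        // Each chunk is written and then dropped before the next is pulled.
        sink.write_all(bytes)?;
        written += bytes.len() as u64;
    }
    sink.flush()?;
    Ok(written)
}
```

Because `chunks` is any iterator, a lazy producer can feed the sink directly, so peak memory is bounded by the chunk size rather than the response size.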

test: make stress-worker output cheap to generate (ce07286)

  • Replace per-byte synthetic stress payload generation with a lazy Stream.
  • Reuse one fixed Bytes buffer for full chunks so test-worker CPU does not dominate BOBS throughput tests.
  • Keep chunk boundaries and total byte counts deterministic.
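
A rough sketch of the lazy-generation idea, under stated assumptions: a plain `Iterator` stands in for the async `Stream`, `Arc<[u8]>` stands in for a reference-counted `Bytes` buffer, and `StressChunks` and the fill byte are invented for the example. It is not the test-worker implementation.

```rust
use std::sync::Arc;

/// Lazily yields `total_bytes` bytes in chunks of `chunk_size`, reusing one
/// shared buffer for every full chunk. (Hypothetical type, illustration only.)
pub struct StressChunks {
    full_chunk: Arc<[u8]>,
    remaining: u64,
}

impl StressChunks {
    pub fn new(total_bytes: u64, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be at least 1");
        // Filled once up front; the fill byte is arbitrary in this sketch.
        let full_chunk: Arc<[u8]> = vec![0xA5u8; chunk_size].into();
        Self { full_chunk, remaining: total_bytes }
    }
}

impl Iterator for StressChunks {
    type Item = Arc<[u8]>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            return None;
        }
        let chunk_len = self.full_chunk.len() as u64;
        if self.remaining >= chunk_len {
            self.remaining -= chunk_len;
            // Full chunk: bump a reference count instead of regenerating bytes.
            Some(Arc::clone(&self.full_chunk))
        } else {
            // Final short chunk: the only place a new buffer is allocated.
            let tail_len = self.remaining as usize;
            self.remaining = 0;
            Some(Arc::from(&self.full_chunk[..tail_len]))
        }
    }
}
```

For example, `StressChunks::new(10, 4)` yields chunks of 4, 4, and 2 bytes; the first two share the same allocation, and the boundaries depend only on the two arguments.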

Stream worker output to BOBS as one request and add per-pod worker concurrency (24fff3a)

Two real bottlenecks in the producer pipeline, both fixed:

  1. Per-chunk POST loop replaced with one streaming POST. workers/common/src/delivery/bobs.rs was issuing one POST per chunk to /write/{key}/{offset}; replaced with a single streaming POST through BOBS's accept-stream path. In-cluster benchmark: 230.8 MiB/s → 375.1 MiB/s (+63%), poll p50 15.8s → 10.1s.

  2. Per-pod worker concurrency. WorkerConfig.worker_concurrency: usize is now a required, validated (≥1) field; run_worker_loop spawns N tasks sharing Arc<P> processor and Arc<dyn ResultDelivery> and uses tokio::sync::watch for graceful shutdown. Heartbeat-per-job preserved. CLI flag --worker-concurrency + env override POLYTOPE_WORKER_CONCURRENCY (default 1) on test-worker, fdb-worker, mars-worker, polytope-fe-worker.

    Tests:

    • worker_loop_processes_jobs_concurrently_when_worker_concurrency_is_two
    • worker_loop_validates_nonzero_worker_concurrency

    Concurrency=8 was tried in-cluster and pushed BOBS past the contention knee on the dev vSphere PVC: aggregate throughput dropped to 285 MiB/s, and read latency went from 1.79s (p50) to 6.84s (p95). The default stays at 1; the knob is there for deployments with faster storage.
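
    The shape of the concurrency change can be sketched with the standard library only. Everything here is an assumption made for illustration: OS threads replace tokio tasks, an `AtomicBool` replaces `tokio::sync::watch`, an `mpsc` channel replaces the job queue, and the `Processor` trait and all function names are hypothetical. Heartbeats and result delivery are left out.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Reject zero, mirroring the ">= 1" validation described above.
pub fn validate_worker_concurrency(n: usize) -> Result<usize, String> {
    if n == 0 {
        Err("worker_concurrency must be at least 1".to_string())
    } else {
        Ok(n)
    }
}

/// Parse an optional override (e.g. an env var value); fall back to 1.
pub fn parse_worker_concurrency(raw: Option<&str>) -> Result<usize, String> {
    match raw {
        None => Ok(1),
        Some(s) => {
            let n = s.trim().parse::<usize>().map_err(|e| e.to_string())?;
            validate_worker_concurrency(n)
        }
    }
}

/// Hypothetical stand-in for the shared processor.
pub trait Processor: Send + Sync {
    fn process(&self, job: u64);
}

/// Spawn `worker_concurrency` workers that share one processor and one queue.
pub fn run_worker_loop_sketch<P: Processor + 'static>(
    processor: Arc<P>,
    jobs: Receiver<u64>,
    shutdown: Arc<AtomicBool>,
    worker_concurrency: usize,
) -> Result<(), String> {
    let n = validate_worker_concurrency(worker_concurrency)?;
    let jobs = Arc::new(Mutex::new(jobs));
    let mut handles = Vec::with_capacity(n);
    for _ in 0..n {
        let processor = Arc::clone(&processor);
        let jobs = Arc::clone(&jobs);
        let shutdown = Arc::clone(&shutdown);
        handles.push(thread::spawn(move || {
            // Poll the shutdown flag between receives so workers exit promptly.
            while !shutdown.load(Ordering::SeqCst) {
                // The queue lock is released at the end of this statement,
                // before the job is processed.
                let next = jobs.lock().unwrap().recv_timeout(Duration::from_millis(20));
                match next {
                    Ok(job) => processor.process(job),
                    Err(RecvTimeoutError::Timeout) => continue,
                    Err(RecvTimeoutError::Disconnected) => break,
                }
            }
        }));
    }
    for handle in handles {
        handle.join().map_err(|_| "worker panicked".to_string())?;
    }
    Ok(())
}
```

    In this sketch the processor is shared by reference count rather than cloned per worker, and each worker finishes its current job before checking the shutdown flag again, which is one simple way to get a graceful stop.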

Verification

  • cargo fmt --all -- --check clean.
  • cargo test -p polytope-worker-common 5/5.
  • cargo test -p polytope-server-integration-tests 27/27.
  • cargo test --workspace -- --nocapture green.
  • Image build for the in-cluster benchmark: FIXED_TAG=majh-dev skaffold build -b eccr.ecmwf.int/polytope/test-worker then podman push eccr.ecmwf.int/polytope/test-worker:majh-dev.
  • In-cluster polytope-config test_bobs_in_cluster_full_range_throughput ran end-to-end at 375 MiB/s aggregate with these changes; standalone bobs-benchmark in-cluster reaches 378 MiB/s under the same fan-out, confirming the polytope pipeline is at parity with the BOBS-only ceiling on the dev cluster.

@jameshawkes jameshawkes changed the title from "test: make stress-worker output cheap to generate" to "BOBS write routing, streaming POST, and per-pod worker concurrency" May 15, 2026
@jameshawkes jameshawkes merged commit 804b3a9 into upstream May 15, 2026
4 checks passed
@jameshawkes jameshawkes deleted the majh/test-worker-fast-stress-stream branch May 15, 2026 20:34