feat(cli): Add standalone --server and --client modes#110
Conversation
dustinblack left a comment:
Good first step toward #11. The scope is right — standalone server/client as independent processes without touching transport internals. A few things to think about as this develops:
- **No integration with existing metrics/reporting.** The standalone client does its own ad-hoc stats via `info!` logging (mean/P50/P95/P99) rather than using `ResultsManager`/`MetricsCollector`. This means no streaming output, no JSON/CSV, no integration with the reporting pipeline or dashboard. Fine for an initial draft, but worth planning how to close that gap.
- **Blocking/async code duplication.** The blocking and async server/client paths are nearly identical except for `await`. Consider whether a shared structure or helper could reduce the duplication; the current pattern will be a maintenance burden as features are added.
- **Two server modes.** `--server` (new, user-facing) and `--internal-run-as-server` (existing, hidden) now coexist. The distinction is clear in code but worth documenting so users don't stumble into the internal flag, and so future contributors understand when to use which.
- **Duration mode not supported.** The standalone client only supports message-count mode (`-i`). If someone passes `-d`, it will silently fall back to the default message count. Should either support it or reject it with a clear error.
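The "reject it with a clear error" option could look like the following minimal sketch. `StandaloneArgs`, its fields, and `DEFAULT_MSG_COUNT` are illustrative stand-ins, not the PR's actual types:

```rust
// Hypothetical sketch: fail fast on -d instead of silently falling back to a
// default message count. Field names are illustrative, not the PR's types.
struct StandaloneArgs {
    duration_secs: Option<u64>,
    msg_count: Option<u64>,
}

const DEFAULT_MSG_COUNT: u64 = 10_000; // assumed default for illustration

fn resolve_msg_count(args: &StandaloneArgs) -> Result<u64, String> {
    if args.duration_secs.is_some() {
        // Explicit rejection is the cheap alternative to full duration support.
        return Err("duration mode (-d) is not supported in standalone client; use -i".into());
    }
    Ok(args.msg_count.unwrap_or(DEFAULT_MSG_COUNT))
}

fn main() {
    let ok = StandaloneArgs { duration_secs: None, msg_count: Some(500) };
    assert_eq!(resolve_msg_count(&ok).unwrap(), 500);

    let bad = StandaloneArgs { duration_secs: Some(5), msg_count: None };
    assert!(resolve_msg_count(&bad).is_err());
}
```

Either choice is fine; the point is that the user gets deterministic behavior instead of a silently wrong test length.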
Overall the approach is clean and well-tested. Nice to see 5 integration tests covering round-trip, one-way, ping/pong, retry, and shutdown.
- **Empty message count panic.** If `-i 0` or similar produces zero messages, the round-trip percentile computation will panic on an empty vector (`latencies[msg_count / 2]`); the divide/index operations in the round-trip summary are unsafe for empty latency vectors.
- **Ignored config options** (main.rs): `concurrency`, `send_delay`, `include_first_message`, `percentiles`, and `streaming`/`output_file` are all silently ignored. Should either be supported or rejected.
- **No shutdown message.** The client relies on transport close for the server to detect disconnect. The internal server path uses an explicit `MessageType::Shutdown`, which is more deterministic.
- **Duration mode is not honored in standalone client paths** (main.rs): when `-d 5s` is passed, `BenchmarkConfig::from_args` sets `msg_count = None` (duration takes precedence). The standalone client then falls back to the default count via `unwrap_or`, and the `config.duration` field is never read anywhere in the standalone paths. So `--client -d 5s -m tcp --blocking` silently runs the default message count instead of a 5-second timed test.
- **Heap allocation per message in the measurement loop.** `payload.clone()` allocates a new `Vec` on every iteration. Per the project's own guidelines ("No allocations in measurement loops"), this adds allocation overhead to every measurement sample. The same pattern appears in the round-trip loop and both async variants, and will skew latency numbers, particularly for small messages where allocation cost is proportionally large.
- **Standalone client bypasses the existing reporting pipeline** (`ResultsManager`/streaming JSON/CSV), unlike established benchmark flows.
- **Test coverage claim mismatch.** This branch diff does not add standalone-specific CLI/integration tests; new flags/paths need explicit coverage.

When I asked to compare against the C2C branch:
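The empty-vector panic above is easy to guard against. This is a hedged sketch, not the PR's code; `percentile` and the index formula are illustrative:

```rust
// Sketch of a panic-safe percentile: indexing latencies[len / 2] on an empty
// vec panics, so return Option and let the caller skip the summary for -i 0.
// Assumes `latencies` is already sorted ascending.
fn percentile(latencies: &[u64], p: f64) -> Option<u64> {
    if latencies.is_empty() {
        return None; // -i 0: no stats rather than a panic
    }
    let idx = ((latencies.len() as f64 - 1.0) * p).round() as usize;
    Some(latencies[idx.min(latencies.len() - 1)])
}

fn main() {
    assert_eq!(percentile(&[], 0.5), None);
    let sorted = vec![10, 20, 30, 40, 50];
    assert_eq!(percentile(&sorted, 0.5), Some(30));
    assert_eq!(percentile(&sorted, 0.99), Some(50));
}
```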
…ting

Add the ability to run the benchmark server and client as independent processes, enabling cross-environment IPC testing (e.g., host and container). Relates to #11.

Standalone mode features:
- `--server` flag starts a server that listens for client connections
- `--client` flag connects to a running server with retry logic (100ms backoff, 30s timeout)
- Both async (Tokio) and blocking (std) execution modes supported
- Duration (`-d`) and message-count (`-i`) modes both supported
- Default transport endpoints work without extra flags
- Endpoint flags (`--socket-path`, `--shared-memory-name`, `--message-queue-name`) promoted to user-facing

Reporting integration:
- Full ResultsManager/MetricsCollector integration for structured output (JSON, streaming CSV, console summary with HDR percentiles)
- Server-side one-way latency measurement using a monotonic clock (accurate for same-host and container scenarios)
- Round-trip latency with per-message streaming support

Code quality:
- Shared helpers: `dispatch_server_message()`, retry constants
- 25 tests covering CLI parsing, transport config, server dispatch, connection retry, shutdown, duration mode, one-way, round-trip
- Explicit `MessageType::Shutdown` on client disconnect

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from f382afc to 93bfac7.
- Add `--send-delay` support: inserts a configurable pause after each message send (blocking uses `thread::sleep`, async uses `tokio::sleep`)
- Add `--include-first-message` support: when false (default), sends a canary message before measurement to warm up the connection, matching the existing BenchmarkRunner behavior
- Applied to both one-way and round-trip tests in both blocking and async client paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
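The warmup behavior described in this commit can be sketched as follows. The canary id `u64::MAX` matches the value mentioned in the later test `test_handle_client_connection_filters_canary`; `message_ids` and its shape are otherwise hypothetical:

```rust
// Illustrative sketch of include_first_message=false: a canary message with a
// sentinel id is sent before the measured sequence so connection-setup cost is
// excluded. The sentinel value u64::MAX is from this PR's tests; the helper
// itself is hypothetical.
const CANARY_ID: u64 = u64::MAX;

fn is_canary(id: u64) -> bool {
    id == CANARY_ID
}

fn message_ids(include_first: bool, count: u64) -> Vec<u64> {
    let mut ids = Vec::new();
    if !include_first {
        ids.push(CANARY_ID); // warmup send, filtered from metrics server-side
    }
    ids.extend(0..count);
    ids
}

fn main() {
    let ids = message_ids(false, 3);
    assert_eq!(ids.len(), 4);
    assert!(is_canary(ids[0]));
    // Measured ids exclude the canary.
    let measured: Vec<u64> = ids.into_iter().filter(|id| !is_canary(*id)).collect();
    assert_eq!(measured, vec![0, 1, 2]);
}
```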
Reuse a single `Message` struct across loop iterations instead of calling `Message::new()` with `payload.clone()` on every send. The message id and timestamp are updated in place before each send.

This removes one `Vec<u8>` heap allocation per message in the measurement loop, reducing allocation overhead that can skew latency results, especially for small messages. Applied to both one-way and round-trip tests in both blocking and async client paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
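A minimal sketch of this allocate-once, mutate-in-place pattern. The `Message` struct here is illustrative, not the project's actual type:

```rust
// Illustrative: one Message built before the measurement loop; only id and
// timestamp mutate per iteration, so no Vec<u8> is cloned per send.
struct Message {
    id: u64,
    timestamp_ns: u128,
    payload: Vec<u8>,
}

fn main() {
    let payload = vec![0u8; 1024];
    // Single allocation, outside the hot loop.
    let mut msg = Message { id: 0, timestamp_ns: 0, payload };
    let start = std::time::Instant::now();
    for i in 0..10_000u64 {
        msg.id = i;
        msg.timestamp_ns = start.elapsed().as_nanos();
        // transport.send(&msg) would go here in the real client
    }
    assert_eq!(msg.id, 9_999);
    assert_eq!(msg.payload.len(), 1024); // payload buffer untouched and reused
}
```

The trade-off is that the serialization path must take the message by reference rather than by value, which is why the old `Message::new(payload.clone())` pattern was the path of least resistance.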
Server-side multi-accept:
- TCP and UDS servers now accept multiple concurrent connections, spawning a handler thread per client with its own MetricsCollector
- Grace period after first client prevents premature server exit
- SHM and PMQ fall back to single-client mode with a warning
- Server aggregates one-way latency metrics across all handlers

Client-side multi-threaded execution:
- Blocking client spawns N worker threads, each with its own transport connection, MetricsCollector, and message loop
- Async client uses tokio::task::JoinSet for concurrent workers
- Results aggregated via MetricsCollector::aggregate_worker_metrics()
- Per-message streaming disabled for concurrent mode (aggregated only)

Transport additions:
- BlockingTcpSocket::from_stream() wraps a pre-accepted TcpStream
- BlockingUnixDomainSocket::from_stream() wraps a pre-accepted UnixStream

Shared helpers:
- handle_client_connection(): per-client message dispatch and metrics
- aggregate_and_print_server_metrics(): shared aggregation logic

Tests:
- test_standalone_concurrent_tcp_round_trip (3 concurrent clients)
- test_handle_client_connection_round_trip (dispatch correctness)
- test_handle_client_connection_one_way_metrics (metrics recording)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
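The worker-per-thread plus aggregate-after-join shape described for the blocking client can be sketched with the standard library alone. The per-worker counter stands in for a per-worker `MetricsCollector`; everything else is illustrative:

```rust
use std::thread;

// Sketch of the multi-threaded blocking client: N workers, each owning its
// own state (standing in for a transport connection + MetricsCollector),
// with results aggregated after join, as aggregate_worker_metrics() does.
fn main() {
    let concurrency = 3;
    let msgs_per_worker = 100u64;
    let handles: Vec<_> = (0..concurrency)
        .map(|_| {
            thread::spawn(move || {
                // Each worker would open its own transport connection here.
                let mut sent = 0u64;
                for _ in 0..msgs_per_worker {
                    sent += 1; // transport.send(...) in the real client
                }
                sent
            })
        })
        .collect();
    // Aggregation step: fold per-worker results into one summary.
    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(total, 300);
}
```

Giving each worker its own connection and collector avoids cross-thread contention in the hot loop, at the cost of only producing aggregated (not per-message) output in concurrent mode, as the commit notes.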
- test_standalone_concurrent_tcp_one_way: multi-accept server with 2 concurrent one-way clients, verifying server-side metrics recording
- test_tcp_from_stream_send_receive: BlockingTcpSocket::from_stream() full send/receive round-trip
- test_uds_from_stream_send_receive: BlockingUnixDomainSocket::from_stream() full send/receive round-trip (unix-only)
- test_concurrency_forced_to_one_for_shm: CLI parsing for SHM with concurrency > 1
- test_aggregate_and_print_empty_collectors: empty input edge case
- test_aggregate_and_print_single_collector: single collector with data

Total binary tests: 34.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add multi-accept support for async TCP and UDS servers, matching the blocking server's concurrency support. Uses tokio::net listeners with spawn_blocking for per-client handler threads.
- Remove unused _args parameter from run_standalone_server_async
- Replace inline latency printing in the async server with the shared print_server_one_way_latency helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace all 12 hardcoded test ports (18301-18314) with OS-assigned ports via a get_free_port() helper (binds to port 0, extracts the assigned port). Prevents port conflicts in parallel test runs and with other processes.
- Extract the 2-second multi-accept grace period into a SERVER_ACCEPT_GRACE_PERIOD constant with documentation explaining the behavior and limitation.
- Document the grace period in the --server CLI help text so users know concurrent clients should connect within 2 seconds of each other.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
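The bind-to-port-0 trick behind the `get_free_port()` helper is standard; this sketch shows the mechanism (the actual helper in the PR may differ). Note the listener is dropped before the test server re-binds, so a small reuse race window remains, which is normally acceptable for tests:

```rust
use std::net::TcpListener;

// Sketch of get_free_port(): bind to port 0 so the OS assigns an ephemeral
// free port, then read the assigned port back from local_addr().
fn get_free_port() -> std::io::Result<u16> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    Ok(listener.local_addr()?.port())
    // listener dropped here; the port is free for the test server to bind
}

fn main() {
    let port = get_free_port().expect("bind failed");
    assert!(port > 0); // OS-assigned ephemeral ports are nonzero
}
```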
`tokio::net::TcpStream::into_std()` leaves the stream in non-blocking mode (set by tokio for epoll/kqueue). The blocking transport's `read_exact`/`write_all` calls then fail with `WouldBlock` errors, causing immediate disconnection.

Fix: call `set_nonblocking(false)` on streams after `into_std()` in both TCP and UDS async multi-accept servers.

Add test_standalone_async_concurrent_tcp_round_trip to exercise the async multi-accept path (tokio accept + spawn_blocking + from_stream + handle_client_connection).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
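The failure mode and fix can be reproduced with std alone, without tokio, by putting an accepted stream into non-blocking mode by hand. This is a demonstration of the mechanism, not the PR's server code:

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

// Demonstrates the bug class: a stream in non-blocking mode makes read_exact
// fail with WouldBlock; set_nonblocking(false) restores blocking semantics
// before handing the stream to a blocking transport.
fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let writer = std::thread::spawn(move || {
        let mut client = TcpStream::connect(addr).unwrap();
        std::thread::sleep(std::time::Duration::from_millis(50));
        client.write_all(b"ping").unwrap();
    });
    let (stream, _) = listener.accept()?;
    // Simulate a stream handed off by an async accept loop (into_std()).
    stream.set_nonblocking(true)?;
    let mut buf = [0u8; 4];
    // Reading before data arrives fails with WouldBlock instead of waiting...
    let err = (&stream).read_exact(&mut buf).unwrap_err();
    assert_eq!(err.kind(), std::io::ErrorKind::WouldBlock);
    // ...so restore blocking mode before blocking reads, as the fix does.
    stream.set_nonblocking(false)?;
    (&stream).read_exact(&mut buf)?;
    assert_eq!(&buf, b"ping");
    writer.join().unwrap();
    Ok(())
}
```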
- test_standalone_blocking_tcp_one_way: verify server received exact message count with correct sequential IDs, add shutdown message
- test_standalone_blocking_tcp_duration_round_trip: verify response IDs match requests, assert count > 10 for 200ms test, add shutdown
- test_standalone_blocking_tcp_duration_one_way: verify server received exact count with sequential IDs, assert count > 10 for 200ms test
- test_concurrency_forced_to_one_for_shm: test the actual concurrency-forcing logic instead of just CLI parsing
- test_standalone_concurrent_tcp_one_way: assert exact message count per handler instead of just "greater than zero"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Clean up garbled doc comment on the async concurrent test (editing artifacts from multiple rewrites)
- Replace silent panic swallowing in async multi-accept servers: try_join_next().transpose() silently dropped JoinErrors from panicked handler tasks. Now logs warnings via warn!().
- Extract an effective_concurrency() helper to deduplicate the concurrency-forcing logic (was copied in the blocking client, async client, and a test). The test now calls the actual helper instead of reimplementing the logic inline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
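Based on the behavior described in this thread (SHM and PMQ fall back to single-client mode with a warning), the extracted helper plausibly looks something like this. The `Mechanism` enum and the helper body are assumptions, not the PR's code:

```rust
// Hypothetical sketch of effective_concurrency(): single-client transports
// (SHM, PMQ per this PR's commits) force concurrency down to 1 with a
// warning; multi-client transports keep the requested value, floored at 1.
enum Mechanism {
    Tcp,
    Uds,
    Shm,
    Pmq,
}

fn effective_concurrency(mechanism: &Mechanism, requested: usize) -> usize {
    match mechanism {
        Mechanism::Shm | Mechanism::Pmq if requested > 1 => {
            eprintln!("warning: concurrency forced to 1 for single-client transport");
            1
        }
        _ => requested.max(1),
    }
}

fn main() {
    assert_eq!(effective_concurrency(&Mechanism::Shm, 4), 1);
    assert_eq!(effective_concurrency(&Mechanism::Pmq, 2), 1);
    assert_eq!(effective_concurrency(&Mechanism::Tcp, 4), 4);
    assert_eq!(effective_concurrency(&Mechanism::Uds, 0), 1);
}
```

Having one helper means the tests exercise the same branch the clients run, which is exactly the deduplication this commit claims.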
- test_standalone_large_payload_integrity: 4KB payloads with a recognizable byte pattern; server echoes back, client verifies content byte-for-byte to catch corruption
- test_handle_client_connection_filters_canary: verifies warmup canary messages (id = u64::MAX) are excluded from one-way metrics
- test_handle_client_connection_mixed_message_types: interleaved OneWay and Request messages on a single connection, verifies correct metrics recording and response dispatch
- test_aggregate_and_print_multiple_collectors: aggregation across 2 collectors with different latency distributions
- test_effective_concurrency_all_mechanisms: covers UDS, PMQ, SHM, TCP, and the concurrency=1 edge case

Total binary tests: 40.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Accepted TCP/UDS streams inherit non-blocking mode from the listener (set for the accept poll loop). The handler threads need blocking mode for the transport's read_exact/write_all operations.

This is the blocking-server equivalent of the async into_std fix in commit 8723429. Without this fix, standalone server handlers immediately disconnect from clients.

Applies to both run_standalone_server_blocking_multi_accept_tcp and _uds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Normal vs Standalone Benchmark Comparison

Manual comparison of benchmark results between normal mode (main branch) and standalone client/server mode (this PR). All tests run locally on the same machine with 1000 warmup iterations.

TCP Round-Trip (10000 msgs, 1024 bytes)
UDS Round-Trip (10000 msgs, 1024 bytes)
TCP Large Payload Round-Trip (5000 msgs, 8192 bytes)
TCP One-Way (10000 msgs, 1024 bytes)
Note: Standalone one-way reports throughput only on the client side; the server reports latency separately. Normal mode measures server-side latency via a latency file.

Summary
PR Summary

This PR adds standalone `--server` and `--client` modes.

Features
Reporting
Transport
Code Quality
Benchmark Comparison

Normal mode vs standalone mode results are within ~10-15% mean latency tolerance, with standalone often showing better tail latency (see comparison tables in PR comments).

Deferred Items

Two maintainability concerns were raised during review: the standalone logic adding ~2800 lines to
📈 Changed lines coverage: 7.30% (71/972)

🚨 Uncovered lines in this PR

📊 Code Coverage Summary
Add the ability to run the benchmark server and client as independent processes, enabling cross-environment IPC testing (e.g., host and container).
Relates to #11