This repository was archived by the owner on Apr 9, 2026. It is now read-only.
Background
Currently, Bulker supports async inserts but processes them in a single-threaded manner. In stream mode, it listens to new messages on the incoming topic and sends a separate HTTP request to ClickHouse for each message, waiting for a response before continuing. This approach is significantly slower than batching due to the per-message roundtrip latency.
Opportunity
Since ClickHouse async inserts don't require waiting for a response, we can safely improve performance by sending inserts concurrently. Introducing a pool of N concurrent connections will allow Bulker to process multiple inserts in parallel, significantly increasing throughput.
Proposed Implementation
Create a fixed-size pool of HTTP connections (or workers).
Route insert requests to a random or round-robin worker in the pool.
Do not await the HTTP response (fire-and-forget model), but optionally handle errors asynchronously (e.g., logging failed responses in the background).
Ensure the pool size (N) is configurable for tuning based on deployment needs.
Benefits
Improved message processing throughput in stream mode.
Better resource utilization on high-throughput pipelines.
Minimal changes required since async insert semantics already support non-blocking behavior.
Benchmark (TODO)
We’ll benchmark performance before and after introducing concurrency to measure improvement. Metrics to compare: