Polish bundle: file locking, sort-on-flush, RLE for kind #19
Three operational polish items in one PR, none individually big enough
for its own.
1. File locking (DbLock). New module src/runtime/lock.rs holds an
exclusive flock on db_root/LOCK. Server (run_server) and every
admin command acquire it for their full lifetime; OS releases it
when the process exits. Prevents server + admin (or two servers)
from racing on the same manifest.
open_state_for_admin now returns (AppState, DbLock); existing
tests were updated to destructure the tuple.
2. Sort-on-flush. FlusherWorker.write_bucket sorts the per-bucket
batch by (account, product, meter, model, ts) — the spec's
canonical billing order — before writing. Saves a sort pass on
every later compaction and improves dictionary-encoding locality
on the ID columns. The algorithm is exposed as
`pub fn sort_events_canonical` so tests can verify it directly
without driving the async flusher.
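
A minimal sketch of what a canonical sort like this could look like. The `Event` struct and its field types are assumptions made for illustration; only the function name and the key order (account, product, meter, model, ts) come from the description above.

```rust
/// Hypothetical event row; the real struct and its field types are not shown in the PR text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub account: u32,
    pub product: u32,
    pub meter: u32,
    pub model: u32,
    pub ts: i64,
}

/// Sort a per-bucket batch in place by (account, product, meter, model, ts).
pub fn sort_events_canonical(events: &mut [Event]) {
    // Tuples compare lexicographically, so the order of fields in the key
    // is the sort priority. `sort_by_key` is stable, so rows with identical
    // keys keep their arrival order.
    events.sort_by_key(|e| (e.account, e.product, e.meter, e.model, e.ts));
}
```

Keeping the sort as a plain function over a slice is what makes it testable without the async flusher: a test can build a small `Vec<Event>`, call the function, and compare against the expected order.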
3. RLE for `kind`. Encoding::Rle = 4 added. The kind column has 3
values with usually long runs of `Usage`; RLE collapses those to
single (value, run_length) pairs (then zstd over that). For a
1000-row all-Usage segment, the kind column goes from 1000 bytes
pre-zstd to ~10 bytes. Reader is permissive (accepts Plain or Rle).
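
To illustrate the idea, here is a rough sketch of run-length encoding a one-byte `kind` column, together with a reader that accepts either encoding. The byte layout (1 value byte plus a 4-byte little-endian run length per run), the numeric kind codes, and the `Plain = 0` discriminant are assumptions; only `Rle = 4` comes from the PR. The real on-disk layout, which produces the ~10 byte figure above, is not shown in the PR text.

```rust
/// Column encodings. `Rle = 4` is from the PR; `Plain = 0` is an assumed value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Encoding {
    Plain = 0, // assumption, not stated in the PR
    Rle = 4,
}

/// Encode one-byte kind codes as (value, run_length) pairs.
/// Assumed layout per run: 1 byte value + 4 bytes little-endian run length.
pub fn rle_encode(kinds: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < kinds.len() {
        let value = kinds[i];
        let mut run = 1usize;
        // Extend the run while the value repeats and the length still fits in u32.
        while i + run < kinds.len() && kinds[i + run] == value && run < u32::MAX as usize {
            run += 1;
        }
        out.push(value);
        out.extend_from_slice(&(run as u32).to_le_bytes());
        i += run;
    }
    out
}

/// Expand (value, run_length) pairs back into one byte per row.
/// A production reader would also bound the total by the segment's row count.
pub fn rle_decode(bytes: &[u8]) -> Result<Vec<u8>, String> {
    if bytes.len() % 5 != 0 {
        return Err(format!("RLE payload length {} is not a multiple of 5", bytes.len()));
    }
    let mut out = Vec::new();
    for pair in bytes.chunks_exact(5) {
        let run = u32::from_le_bytes([pair[1], pair[2], pair[3], pair[4]]) as usize;
        out.extend(std::iter::repeat(pair[0]).take(run));
    }
    Ok(out)
}

/// Permissive reader: accept either Plain or Rle for the kind column.
pub fn decode_kind(encoding: Encoding, bytes: &[u8]) -> Result<Vec<u8>, String> {
    match encoding {
        Encoding::Plain => Ok(bytes.to_vec()),
        Encoding::Rle => rle_decode(bytes),
    }
}
```

Under this assumed layout, a 1000-row column holding a single repeated kind encodes to one 5-byte pair, while a column that alternates kinds on every row would grow to 5 bytes per row. That trade-off is why RLE suits a column dominated by long runs of `Usage`.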
Tests (tests/polish.rs, 8 tests):
- db_lock_prevents_concurrent_acquisition + re-acquire after drop
- open_state_for_admin_blocks_second_caller
- sort_events_canonical_orders_by_billing_dimensions
- sort_events_canonical_breaks_ties_by_meter_then_model
- writer_round_trip_preserves_sorted_order
- rle_kind_round_trips_all_usage
- rle_kind_round_trips_all_corrections
- rle_encoding_round_trips_mixed_kinds (Usage + Correction +
Retraction interleaved)
Dep: fs4 = "0.13" for cross-platform flock.
Total tests: 114 (was 106; +8). Clean under RUSTFLAGS=-D warnings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Three operational polish items bundled in one PR, none individually big enough for its own. Each tightens an existing path without changing public semantics.
1. File locking (`DbLock`)
New module `src/runtime/lock.rs` holds an exclusive `flock(2)` on `db_root/LOCK`. The server (`run_server`) and every admin command acquire it at startup; the OS releases it when the process exits. Prevents server + admin (or two servers) from racing on the same manifest.
API change: `open_state_for_admin` now returns `(AppState, DbLock)`; callers destructure the tuple and hold the lock guard for the operation's lifetime.
2. Sort-on-flush
`FlusherWorker::write_bucket` sorts each per-bucket batch by `(account, product, meter, model, ts)`, the spec's canonical billing order, before writing. Saves a sort pass on every later compaction and improves dictionary-encoding locality. The sort algorithm is exposed as `pub fn sort_events_canonical` so tests verify it directly without driving the async flusher.
3. RLE for `kind`
`Encoding::Rle = 4` added (the format reserved discriminants for this). The `kind` column has only 3 values with usually long runs of `Usage`. RLE collapses those to single `(value, run_length)` pairs, then zstd handles the rest. For a 1000-row all-`Usage` segment, the kind column drops from 1000 bytes pre-zstd to ~10 bytes.
Reader is permissive: accepts `Plain` or `Rle` for the `kind` column, so segments written before this PR still load.
Dep
`fs4 = "0.13"` for cross-platform `flock`.
Tests
8 new tests in `tests/polish.rs`:
- `db_lock_prevents_concurrent_acquisition` + re-acquire after drop
- `open_state_for_admin_blocks_second_caller`
- `sort_events_canonical_orders_by_billing_dimensions`
- `sort_events_canonical_breaks_ties_by_meter_then_model`
- `writer_round_trip_preserves_sorted_order`
- `rle_kind_round_trips_all_usage`
- `rle_kind_round_trips_all_corrections`
- `rle_encoding_round_trips_mixed_kinds` (Usage + Correction + Retraction interleaved)
Test plan
- `cargo build --all-targets` clean with `-D warnings`
- `cargo test --all-targets`: 114 tests pass (was 106; +8)
🤖 Generated with Claude Code