Parent: #281
Goal
Fix the correctness and file-layout issues that can invalidate later performance work.
This includes writer-property consistency, result-cache key shape, vacuum safety, table maintenance coverage, and the first layout for eval_scenarios.
Scope
- Create a shared writer-properties helper for DataFusion/Delta table writes.
- Make writer profiles explicit per table type: trace spans, trace summaries, GenAI trace tables, Bifrost datasets, dispatch records, control tables where relevant, and eval scenarios.
- Make sure tables that need bloom filters, compression, row-group sizing, and column encodings actually get those writer properties when written and optimized.
- Audit trace_dispatch; either add writer properties and maintenance hooks or document why it is intentionally exempt.
- Fix unsafe vacuum behavior so one pod cannot remove files still needed by another pod’s active snapshot.
- Fix result-cache key shape and weights before TTL changes. Cache keys need the full query shape, not just a partial set of inputs.
- Ship eval_scenarios as a partitioned Delta table using a created-date partition derived from created_at. This table has not shipped, so no migration is needed.
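
As an illustration of the result-cache item, here is a minimal sketch of a key built from the full query shape. The `QueryShape` fields and the `cache_key` function are hypothetical names chosen for this example; the real set of fields is whatever can change a query's result in the actual engines.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QueryShape:
    # Hypothetical fields: every input that can change the result belongs here.
    table: str
    projection: Tuple[str, ...]
    filters: Tuple[Tuple[str, str, str], ...]  # (column, operator, value)
    order_by: Tuple[str, ...]
    limit: Optional[int]
    time_range: Optional[Tuple[str, str]]


def cache_key(shape: QueryShape) -> str:
    """Return a stable hex digest covering every field of the query shape."""
    payload = asdict(shape)
    # AND-ed filters are order-insensitive, so sort them to avoid duplicate entries.
    payload["filters"] = sorted(payload["filters"])
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()
```

Two queries that differ only in `limit` or `order_by` get different keys here, which is the property a partial key would miss.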
High-level design
The shared writer helper should remove copy/paste drift between engines. It should not force every table into the same physical layout. Different tables have different query patterns, so the helper should expose table profiles rather than a single global set of Parquet knobs.
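
A rough sketch of what "table profiles rather than a single global set of Parquet knobs" could look like. It does not call any DataFusion or Delta API; `WriterProfile`, `PROFILES`, `profile_for`, the table keys, the column names, and every numeric value are placeholders assumed for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class WriterProfile:
    # Hypothetical knobs; the real helper would map these onto writer properties.
    compression: str
    max_row_group_rows: int
    bloom_filter_columns: Tuple[str, ...] = ()
    dictionary_columns: Tuple[str, ...] = ()


# Placeholder values and column names, not tuned recommendations.
PROFILES: Dict[str, WriterProfile] = {
    "trace_spans": WriterProfile("zstd", 128_000, bloom_filter_columns=("trace_id",)),
    "trace_summaries": WriterProfile("zstd", 64_000, bloom_filter_columns=("trace_id",)),
    "eval_scenarios": WriterProfile("zstd", 32_000, dictionary_columns=("status",)),
}


def profile_for(table: str) -> WriterProfile:
    """Look up the profile for a table; unknown tables fail loudly."""
    try:
        return PROFILES[table]
    except KeyError:
        raise KeyError(f"no writer profile registered for table {table!r}") from None
```

Failing on an unregistered table, instead of falling back to a default, is one way to force an explicit layout decision per table.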
eval_scenarios should be corrected before release. Use a dedicated date partition column derived from created_at, with a stable type suitable for Hive-style partition pruning.
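
A small sketch of deriving the partition value from created_at. The column name `created_date`, the choice of UTC as the day boundary, and both function names are assumptions made for this example, not decisions recorded in this issue.

```python
from datetime import date, datetime, timezone


def created_date(created_at: datetime) -> date:
    """Derive the partition value from created_at, normalized to UTC (assumed)."""
    if created_at.tzinfo is None:
        # Naive timestamps would make the day boundary depend on the writer's locale.
        raise ValueError("created_at must be timezone-aware")
    return created_at.astimezone(timezone.utc).date()


def partition_path(created_at: datetime) -> str:
    """Hive-style partition directory, using the hypothetical created_date column."""
    return f"created_date={created_date(created_at).isoformat()}"
```

Normalizing to a single time zone before truncating keeps the same row in the same partition regardless of which pod writes it.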
Acceptance criteria
- Writer properties are built through one shared path or explicitly documented as intentionally different.
- Trace summaries, trace spans, GenAI trace tables, Bifrost tables, dispatch records, and eval scenarios have an explicit maintenance/layout decision.
- eval_scenarios writes include a created-date partition derived from created_at.
- Result-cache keys include the full query shape required for correctness.
- Vacuum behavior is safe under multi-pod readers.
- Tests cover writer-property construction and the eval_scenarios partition column.
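
One possible reading of the multi-pod vacuum criterion, sketched under stated assumptions: each pod reports the timestamp of the oldest snapshot it is still reading, and vacuum never deletes a file that was removed from the table at or after the earliest of those timestamps. How pods would report active snapshots is not specified here; `safe_vacuum_cutoff` and `can_delete` are hypothetical helpers.

```python
from datetime import datetime, timedelta
from typing import Iterable


def safe_vacuum_cutoff(
    now: datetime,
    retention: timedelta,
    active_snapshot_times: Iterable[datetime],
) -> datetime:
    """Return the latest tombstone time that is safe to vacuum before."""
    retention_cutoff = now - retention
    oldest_active = min(active_snapshot_times, default=None)
    if oldest_active is None:
        return retention_cutoff
    # A reader pinned at time S still needs files tombstoned at or after S.
    return min(retention_cutoff, oldest_active)


def can_delete(tombstoned_at: datetime, cutoff: datetime) -> bool:
    """A file is deletable only if it was tombstoned strictly before the cutoff."""
    return tombstoned_at < cutoff
```

With this shape, a long-running reader on another pod only ever moves the cutoff earlier than the retention window, never later.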