Trace latency: track deferred advanced optimizations #293

Description

@thorrester

Parent: #281

Goal

Track advanced optimization ideas that should stay deferred until the baseline benchmarks prove they are needed.

These items have real operational cost. They should not be implemented just because they are common in larger object-store analytics systems.

Scope

  • Defer custom DataFusion optimizer rules until planner correctness, file layout, warmup, and result-cache fixes are measured.
  • Defer a custom TableProvider until there is evidence DataFusion’s default Delta/Parquet path cannot meet the target with simpler changes.
  • Defer a file-level trace index until bounded trace lookup and Postgres trace bounds are proven insufficient.
  • Defer cross-pod metadata cache designs. If revisited, prefer a dedicated cache service with a bounded TTL and well-defined failure behavior over a Postgres BYTEA byte cache.
  • Defer ephemeral foyer or emptyDir data-page caching until benchmarks show repeated large range reads dominate after RAM cache sizing and warmup are tuned.
  • Explicitly reject PVC-backed cache persistence for this deployment.
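
One way to make "defer until measured" concrete is to record, next to each deferred item, the benchmark condition that would reopen it. The sketch below is illustrative only: the metric names (`range_read_fraction`, `trace_lookup_p95_ms`, `target_p95_ms`) and the 0.5 threshold are hypothetical placeholders, not values taken from the baseline benchmarks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass(frozen=True)
class DeferredItem:
    name: str
    trigger_description: str
    trigger: Callable[[Metrics], bool]  # True means "reopen this item"


# Hypothetical triggers; real thresholds should come from the baseline benchmarks.
DEFERRED: List[DeferredItem] = [
    DeferredItem(
        name="ephemeral data-page cache",
        trigger_description="repeated large range reads dominate after RAM cache tuning",
        trigger=lambda m: m.get("range_read_fraction", 0.0) > 0.5,
    ),
    DeferredItem(
        name="file-level trace index",
        trigger_description="bounded trace lookup still misses the latency target",
        trigger=lambda m: m.get("trace_lookup_p95_ms", 0.0)
        > m.get("target_p95_ms", float("inf")),
    ),
]


def items_to_reopen(metrics: Metrics, items: List[DeferredItem] = DEFERRED) -> List[str]:
    """Return the names of deferred items whose trigger fires for these metrics."""
    return [item.name for item in items if item.trigger(metrics)]
```

Missing metrics default to "do not reopen", so an item stays deferred until a benchmark actually reports the number its trigger depends on.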

High-level design

The system should earn each layer of complexity. A custom provider or disk cache may eventually be useful, but only after the cheaper fixes show where the remaining latency lives.

If an ephemeral cache is revisited, describe its limits accurately: it can improve warm reads within a pod's lifetime, but it does not survive a pod restart and does not provide cross-pod sharing.

Acceptance criteria

  • Each deferred item has a benchmark trigger that would justify reopening it.
  • PVC-backed cache designs are not proposed for this deployment.
  • Postgres is not used as a broad byte cache for Parquet metadata or data pages.
  • Any future cache prototype has a clear failure mode that degrades to direct object-store reads.
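
The last criterion can be sketched as a read path where the cache is strictly optional. This is a minimal illustration, not a proposed implementation: `FallbackRangeReader`, `read_from_store`, and the `cache_errors` counter are hypothetical names, and the cache is modeled as any object with `get` and item assignment. Any cache error is counted and then ignored, so the read degrades to a direct object-store read.

```python
from typing import Callable, Optional, Tuple

RangeKey = Tuple[str, int, int]  # (object path, byte offset, length)


class FallbackRangeReader:
    """Serve byte ranges from a cache when possible, else from the object store."""

    def __init__(
        self,
        read_from_store: Callable[[str, int, int], bytes],
        cache: Optional[object] = None,
    ) -> None:
        self._read_from_store = read_from_store
        self._cache = cache  # hypothetical: anything with .get(key) and [key] = value
        self.cache_errors = 0  # a real prototype would export this as a metric

    def read(self, path: str, offset: int, length: int) -> bytes:
        key: RangeKey = (path, offset, length)
        cached = None
        if self._cache is not None:
            try:
                cached = self._cache.get(key)
            except Exception:
                # A cache failure must never fail the query.
                self.cache_errors += 1
        if cached is not None:
            return cached

        # Cache miss, cache disabled, or cache error: read directly.
        data = self._read_from_store(path, offset, length)

        if self._cache is not None:
            try:
                self._cache[key] = data
            except Exception:
                self.cache_errors += 1
        return data
```

With a plain in-memory `dict` as the cache, this also matches the ephemeral framing: entries help warm reads within one process lifetime and are lost on restart, at which point every read simply goes to the object store again.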
