I'm opening this epic to track the improvements and changes we want to make to our benchmarking setup.
I'll start by collecting some relevant issues:
- Run all benchmarks on merge to main branch #15511
- Run DataFusion benchmarks regularly and track performance history over time #5504
- Support multiple (>2) results comparison in benchmark scripts #13446
- [DISCUSS] Release retrospective #21034
I think we should discuss in this issue what we want from our benchmarking setup and use that to guide how we improve it.
Trackable over time
This is important for gating releases and catching regressions early. Currently we only run benchmarks in a PR to compare against main, or when we update ClickBench.
I think we should target Codspeed compatibility for this.
Can run slow/complex SQL benchmarks
Think ClickBench. This might rule out criterion, at least for this style of benchmark; see bheisler/criterion.rs#320. We can still use criterion for smaller, faster benchmarks.
We also need a harness that supports loading data and can give us both cold and hot numbers.
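To make the "cold and hot numbers" requirement concrete, here is a rough sketch of what such a harness could look like using only the standard library. The names `ColdHot` and `measure_cold_hot` are made up for illustration and are not existing DataFusion APIs. Note that "cold" here only means the first run after loading; a truly cold run would also need to account for the OS page cache, which this sketch does not attempt.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Timings for one query: the first (cold) run and the later (hot) runs.
/// Hypothetical type, for illustration only.
#[derive(Debug)]
struct ColdHot {
    cold: Duration,
    hot: Vec<Duration>,
}

impl ColdHot {
    /// Fastest hot run, or `None` if no hot runs were requested.
    fn best_hot(&self) -> Option<Duration> {
        self.hot.iter().min().copied()
    }
}

/// Runs `setup` once (e.g. to load data), then times `query` once as the
/// cold run and `hot_iterations` more times as hot runs.
fn measure_cold_hot<S, T, Q, R>(setup: S, mut query: Q, hot_iterations: usize) -> ColdHot
where
    S: FnOnce() -> T,
    Q: FnMut(&T) -> R,
{
    // Data loading is deliberately kept outside the timed region.
    let data = setup();

    let start = Instant::now();
    black_box(query(&data));
    let cold = start.elapsed();

    let hot = (0..hot_iterations)
        .map(|_| {
            let start = Instant::now();
            black_box(query(&data));
            start.elapsed()
        })
        .collect();

    ColdHot { cold, hot }
}

fn main() {
    // Stand-in for "load a table, then run a query against it".
    let result = measure_cold_hot(
        || (0..1_000_000u64).collect::<Vec<_>>(),
        |rows| rows.iter().filter(|v| **v % 3 == 0).count(),
        5,
    );
    println!("cold: {:?}, best hot: {:?}", result.cold, result.best_hot());
}
```

Keeping setup separate from the timed closure is the main design point: it lets the same harness report load time, cold time, and hot time independently if we decide we want all three.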
Can do a "quick run"
We want to be able to run benchmarks just to verify the results are correct, or as a test during development.
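As a strawman for what a "quick run" could mean, the sketch below runs each case once and fails on the first wrong result, while a full run repeats each case for more stable timings. Everything here (`RunMode`, `Case`, `run_cases`, the iteration counts, and checking only a row count) is an assumption for discussion, not how the current benchmark scripts work.

```rust
use std::time::{Duration, Instant};

/// How thoroughly to run the suite. The variants and their iteration
/// counts are illustrative assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RunMode {
    /// One iteration per case: checks correctness, timing is incidental.
    Quick,
    /// Several iterations per case for more stable timings.
    Full,
}

impl RunMode {
    fn iterations(self) -> usize {
        match self {
            RunMode::Quick => 1,
            RunMode::Full => 10,
        }
    }
}

/// A hypothetical benchmark case: a stand-in for a query plus the row
/// count it must return.
struct Case {
    name: &'static str,
    expected_rows: usize,
    run: fn() -> usize,
}

/// Runs every case, returning an error on the first wrong result;
/// otherwise returns the fastest observed time per case.
fn run_cases(cases: &[Case], mode: RunMode) -> Result<Vec<(&'static str, Duration)>, String> {
    let mut timings = Vec::with_capacity(cases.len());
    for case in cases {
        let mut best = Duration::MAX;
        for _ in 0..mode.iterations() {
            let start = Instant::now();
            let rows = (case.run)();
            best = best.min(start.elapsed());
            if rows != case.expected_rows {
                return Err(format!(
                    "{}: expected {} rows, got {}",
                    case.name, case.expected_rows, rows
                ));
            }
        }
        timings.push((case.name, best));
    }
    Ok(timings)
}

/// Stand-in for a real query.
fn count_even() -> usize {
    (0..1_000u32).filter(|v| *v % 2 == 0).count()
}

fn main() {
    let cases = [Case { name: "count_even", expected_rows: 500, run: count_even }];
    match run_cases(&cases, RunMode::Quick) {
        Ok(timings) => println!("quick run passed: {timings:?}"),
        Err(msg) => eprintln!("quick run failed: {msg}"),
    }
}
```

Sharing one code path between quick and full runs means the correctness check is exercised in both, so a quick run during development validates the same cases that the full benchmark times.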