[DRAFT, EPIC] Benchmark improvements

I'm opening this epic to track the improvements and changes we want to make to our benchmarking setup.

I'll start by collecting some relevant issues:
- https://github.com/apache/datafusion/issues/15511
- https://github.com/apache/datafusion/issues/5504
- https://github.com/apache/datafusion/issues/13446
- #21034

I think we should discuss in this issue what we want from our benchmarking setup and use that to guide how we improve it.

## Trackable over time

This is important for gating releases and catching regressions early. Currently we only really run benchmarks in two situations: in a PR, to compare against main, or when we update ClickBench.

I think we should target CodSpeed compatibility for this.
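To make "catching regressions" concrete, here is a minimal sketch of the kind of check that tracking over time enables. It is not CodSpeed's API or anything that exists in the repo today; `Verdict`, `compare_to_baseline`, and the relative `tolerance` parameter are hypothetical names used only for illustration.

```rust
use std::time::Duration;

/// Outcome of comparing a new measurement against a stored baseline.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Regression,
    Improvement,
    Unchanged,
}

/// Compare `current` against `baseline`, treating changes within
/// `tolerance` (a relative fraction, e.g. 0.05 for 5%) as noise.
pub fn compare_to_baseline(baseline: Duration, current: Duration, tolerance: f64) -> Verdict {
    let base = baseline.as_secs_f64();
    let cur = current.as_secs_f64();
    if cur > base * (1.0 + tolerance) {
        Verdict::Regression
    } else if cur < base * (1.0 - tolerance) {
        Verdict::Improvement
    } else {
        Verdict::Unchanged
    }
}
```

A release gate could then fail when any tracked query returns `Verdict::Regression` against the last release's numbers. A fixed relative threshold is the simplest possible choice; a hosted service would likely do something more robust against noise.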

## Can run slow/complex SQL benchmarks

Think ClickBench. I think this might rule out criterion, at least for this style of benchmark; see https://github.com/bheisler/criterion.rs/issues/320. We can still use criterion for smaller, faster benchmarks.

We also need a harness that supports loading data and can give us both cold and hot numbers.
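As a rough sketch of what such a harness could look like, assuming "cold" means the first execution after the data is loaded and "hot" means the repeated executions that follow: `ColdHotTimings` and `measure_cold_hot` are hypothetical names, not existing DataFusion APIs, and the sketch uses only the standard library.

```rust
use std::time::{Duration, Instant};

/// Timings for one benchmark query: the first (cold) run and the later (hot) runs.
/// (Hypothetical type, for illustration only.)
#[derive(Debug)]
pub struct ColdHotTimings {
    pub cold: Duration,
    pub hot: Vec<Duration>,
}

impl ColdHotTimings {
    /// Fastest hot run, or `None` if no hot iterations were requested.
    pub fn best_hot(&self) -> Option<Duration> {
        self.hot.iter().min().copied()
    }
}

/// `load` prepares the data once and is not timed. `run` is then timed
/// once as the cold run and `hot_iters` more times as hot runs.
pub fn measure_cold_hot<S, L, R>(load: L, mut run: R, hot_iters: usize) -> ColdHotTimings
where
    L: FnOnce() -> S,
    R: FnMut(&mut S),
{
    let mut state = load();

    // Cold: the first execution against freshly loaded state.
    let start = Instant::now();
    run(&mut state);
    let cold = start.elapsed();

    // Hot: repeated executions against the same, already-touched state.
    let mut hot = Vec::with_capacity(hot_iters);
    for _ in 0..hot_iters {
        let start = Instant::now();
        run(&mut state);
        hot.push(start.elapsed());
    }

    ColdHotTimings { cold, hot }
}
```

Note that this only approximates a cold run: it does not drop the OS page cache or restart the process, which a real harness may need to do for the cold number to be meaningful.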

## Can do a "quick run"

We want to be able to run benchmarks just to verify the results are correct, or as a test during development.
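One way this could be shaped, as a sketch only: a run mode that either executes a query once to check its result, or executes it many times for timing, with the same correctness check in both cases. `RunMode` and `run_checked` are hypothetical names, and comparing against a single `expected` value stands in for whatever result verification we end up choosing.

```rust
use std::fmt::Debug;
use std::time::{Duration, Instant};

/// How much work a benchmark invocation should do.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RunMode {
    /// Single iteration: verify correctness, timings are not meaningful.
    Quick,
    /// Full measurement with the given number of iterations.
    Full { iterations: usize },
}

/// Run `query` according to `mode`, checking every result against `expected`.
/// Returns one timing per iteration, or an error on the first mismatch.
pub fn run_checked<T, F>(mode: RunMode, mut query: F, expected: &T) -> Result<Vec<Duration>, String>
where
    T: PartialEq + Debug,
    F: FnMut() -> T,
{
    let iterations = match mode {
        RunMode::Quick => 1,
        RunMode::Full { iterations } => iterations,
    };

    let mut timings = Vec::with_capacity(iterations);
    for i in 0..iterations {
        let start = Instant::now();
        let actual = query();
        timings.push(start.elapsed());
        if &actual != expected {
            return Err(format!(
                "iteration {}: expected {:?}, got {:?}",
                i, expected, actual
            ));
        }
    }
    Ok(timings)
}
```

With something like this, `RunMode::Quick` could be what runs in CI or during development, while `RunMode::Full { .. }` is reserved for actual measurements.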


[DRAFT, EPIC] Benchmark improvements #21165
