Upgrade DataFusion to 54 #8044
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | baseline_eq[16, 65536] | 287.6 µs | 259.8 µs | +10.7% |
| ⚡ | Simulation | baseline_lt[16, 65536] | 302.7 µs | 274.9 µs | +10.13% |
| ❌ | Simulation | fast_lt_out_of_range[4, 65536] | 204.3 µs | 262.5 µs | -22.17% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] | 304.4 ns | 246.1 ns | +23.7% |
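For readers checking the Efficiency column by hand, here is a minimal sketch of how those percentages appear to be derived. The formula `BASE / HEAD - 1` is an assumption inferred from the reported numbers, not taken from CodSpeed documentation, and `efficiency_pct` is a hypothetical helper name. It reproduces the reported values to within rounding of the displayed timings (the `baseline_lt` row comes out slightly lower than the reported +10.13%, presumably because the displayed times are themselves rounded).

```rust
/// Relative change of HEAD versus BASE, in percent.
/// Assumed formula (inferred from the table): BASE / HEAD - 1.
/// Positive means HEAD took less time than BASE.
fn efficiency_pct(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}

fn main() {
    // (benchmark, BASE, HEAD) as reported; units cancel out in the ratio.
    let rows = [
        ("baseline_eq[16, 65536]", 287.6, 259.8),
        ("baseline_lt[16, 65536]", 302.7, 274.9),
        ("fast_lt_out_of_range[4, 65536]", 204.3, 262.5),
        ("bitwise_not_vortex_buffer_mut[128]", 304.4, 246.1),
    ];
    for (name, base, head) in rows {
        println!("{name}: {:+.2}%", efficiency_pct(base, head));
    }
}
```

Under this reading, the `fast_lt_out_of_range` row is the only one where HEAD is slower than BASE, which matches the ❌ marker.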
Comparing adamg/df-54 (9bec4c8) with develop (c54ce7e)
Polar Signals Profiling Results: Latest Run, Previous Runs (1)
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.050x ➖
datafusion / vortex-file-compressed (1.050x ➖, 0↑ 3↓)
File Sizes: PolarSignals Profiling
File Size Changes (1 file changed, -0.0% overall, 0↑ 1↓)
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.964x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.995x ➖, 0↑ 0↓)
datafusion / parquet (1.008x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.970x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.005x ➖, 0↑ 0↓)
duckdb / parquet (0.972x ➖, 0↑ 0↓)
File Sizes: FineWeb NVMe
File Size Changes (2 files changed, -0.0% overall, 0↑ 2↓)
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.047x ➖, 0↑ 2↓)
datafusion / vortex-compact (1.063x ➖, 1↑ 4↓)
datafusion / parquet (1.063x ➖, 2↑ 5↓)
datafusion / arrow (1.080x ➖, 4↑ 12↓)
duckdb / vortex-file-compressed (1.048x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.047x ➖, 0↑ 0↓)
duckdb / parquet (1.019x ➖, 1↑ 0↓)
duckdb / duckdb (1.037x ➖, 0↑ 0↓)
File Sizes: TPC-H SF=1 on NVME
File Size Changes (18 files changed, -0.0% overall, 0↑ 18↓)
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.002x ➖, 6↑ 8↓)
datafusion / vortex-compact (1.004x ➖, 6↑ 5↓)
datafusion / parquet (1.006x ➖, 4↑ 10↓)
duckdb / vortex-file-compressed (1.025x ➖, 0↑ 4↓)
duckdb / vortex-compact (1.021x ➖, 1↑ 0↓)
duckdb / parquet (1.016x ➖, 0↑ 2↓)
duckdb / duckdb (1.010x ➖, 1↑ 2↓)
File Sizes: TPC-DS SF=1 on NVME
File Size Changes (48 files changed, -0.0% overall, 0↑ 48↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.066x ➖, 0↑ 3↓)
duckdb / vortex-compact (1.031x ➖, 0↑ 1↓)
duckdb / parquet (1.030x ➖, 0↑ 0↓)
File Sizes: Statistical and Population Genetics
File Size Changes (2 files changed, -0.0% overall, 0↑ 2↓)
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.111x ➖, 0↑ 2↓)
datafusion / vortex-compact (0.917x ➖, 1↑ 1↓)
datafusion / parquet (1.016x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.878x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.975x ➖, 0↑ 0↓)
duckdb / parquet (1.004x ➖, 0↑ 0↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (medium confidence)
datafusion / vortex-file-compressed (0.829x ✅, 19↑ 0↓)
datafusion / vortex-compact (0.814x ✅, 21↑ 1↓)
datafusion / parquet (0.854x ✅, 15↑ 0↓)
datafusion / arrow (0.912x ➖, 10↑ 4↓)
duckdb / vortex-file-compressed (0.988x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.999x ➖, 0↑ 0↓)
duckdb / parquet (0.987x ➖, 0↑ 0↓)
duckdb / duckdb (0.997x ➖, 0↑ 0↓)
File Sizes: TPC-H SF=10 on NVME
File Size Changes (48 files changed, -0.0% overall, 0↑ 48↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.827x ✅, 20↑ 0↓)
datafusion / parquet (0.808x ✅, 24↑ 0↓)
duckdb / vortex-file-compressed (0.988x ➖, 4↑ 0↓)
duckdb / parquet (1.005x ➖, 1↑ 1↓)
duckdb / duckdb (1.040x ➖, 0↑ 6↓)
File Sizes: Clickbench on NVME
File Size Changes (201 files changed, -0.0% overall, 0↑ 201↓)
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.021x ➖, 1↑ 3↓)
datafusion / vortex-compact (0.949x ➖, 1↑ 0↓)
datafusion / parquet (0.949x ➖, 3↑ 2↓)
duckdb / vortex-file-compressed (0.977x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.993x ➖, 0↑ 0↓)
duckdb / parquet (0.970x ➖, 0↑ 0↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.957x ➖, 1↑ 0↓)
datafusion / vortex-compact (1.115x ➖, 0↑ 2↓)
datafusion / parquet (0.893x ➖, 3↑ 6↓)
duckdb / vortex-file-compressed (1.019x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.010x ➖, 0↑ 0↓)
duckdb / parquet (1.018x ➖, 0↑ 1↓)
Summary
This PR upgrades our DataFusion dependency/integration to the upcoming 54 release. It aims to make the minimal set of changes; implementing the new Morselizer API will be part of a future PR (I have an old PR that was based on an earlier PoC, and I'll try to pull from it when the time comes).
54.0.0 (Apr 2026 / May 2026) apache/datafusion#21080