perf: reduce allocations in hot query execution paths by kuseman · Pull Request #154 · kuseman/payloadbuilder

kuseman · 2026-06-02T12:21:34Z

TupleVector.validate: use List path stack instead of eager string concatenation on each recursive schema descent; path is only joined when an exception is thrown
TemporaryTable.IndexTupleVector: cache selected columns to avoid recreating SelectedValueVector (+ int[] copy) on every getColumn call
ExecutionContext.copy: share stateless ExpressionFactory instance instead of allocating a new one per NestedLoop outer-row iteration
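
The path-stack idea from the first item can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `Node`, `PathStackSketch`, `validateEager` and `validateLazy` are hypothetical names, and the real `TupleVector.validate` works on payloadbuilder's own schema types.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a nested schema node; not payloadbuilder's Schema type. */
final class Node {
    final String name;
    final boolean valid;
    final List<Node> children;

    Node(String name, boolean valid, List<Node> children) {
        this.name = name;
        this.valid = valid;
        this.children = children;
    }
}

final class PathStackSketch {
    /** Eager variant: builds a new String on every descent, even when nothing fails. */
    static void validateEager(Node node, String path) {
        String current = path.isEmpty() ? node.name : path + "/" + node.name;
        if (!node.valid) {
            throw new IllegalArgumentException("Invalid column at " + current);
        }
        for (Node child : node.children) {
            validateEager(child, current);
        }
    }

    /** Lazy variant: pushes and pops names, joining them only when throwing. */
    static void validateLazy(Node node, List<String> path) {
        path.add(node.name);
        if (!node.valid) {
            // The only place a path String is built.
            throw new IllegalArgumentException("Invalid column at " + String.join("/", path));
        }
        for (Node child : node.children) {
            validateLazy(child, path);
        }
        path.remove(path.size() - 1); // pop on the way back up
    }

    public static void main(String[] args) {
        Node bad = new Node("root", true, List.of(
                new Node("a", true, List.of()),
                new Node("b", true, List.of(new Node("c", false, List.of())))));
        try {
            validateLazy(bad, new ArrayList<>());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "Invalid column at root/b/c"
        }
    }
}
```

On the success path the lazy variant allocates no strings at all; the cost is that the caller has to supply a mutable list, which is left empty again after a successful run.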

Performance fixes (allocation hot spots):
- TupleVector.validate: replace eager string concatenation on each recursive schema descent with a List<String> path stack; the path string is only joined when an exception is thrown (eliminated 248 GB/s of byte[] allocation in production JFR)
- HashMatch, NestedLoop: return the cached schema field from getSchema() instead of calling joinSchema() on every probe/iteration; both operators stored the schema in the constructor, but the public override recomputed it each time
- TemporaryTable.IndexTupleVector: cache selected columns to avoid recreating SelectedValueVector (+ int[] copy) on every getColumn call
- ExecutionContext.copy: share the stateless ExpressionFactory instance instead of allocating a new one per NestedLoop outer-row iteration

BREAKING CHANGE: Memory leak fix:
- IDatasink.execute: signature changed from TupleIterator to Supplier<TupleIterator> so sinks can re-execute the upstream plan on demand (a cache hit skips execution; a cache refresh calls input.get() for a fresh iterator each time, matching the old version's behaviour)
- InsertInto: removed LazyTupleIterator; passes () -> input.execute(context) as the supplier, forwards estimatedBatchCount/estimatedRowCount, and guards against sinks that forget to close the iterator
- SelectIntoTempTableSink: materialise the result through TupleVectorBuilder instead of relying on PlanUtils.concat's single-batch fast path, which returned the raw TableScan$1$1 anonymous TupleVector; that vector held a strong reference via this$1 -> TableScan$1 -> val$context -> ExecutionContext -> QuerySession -> temporaryTables, retaining the entire execution context chain for the lifetime of the cached entry
- AInMemoryCache: document that expired entries reload asynchronously for alwaysLoadAsync=false; the Supplier<TupleIterator> API change ensures that an async reload always has a re-executable supplier
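
To illustrate why a supplier-shaped parameter helps a caching sink, here is a rough sketch under stated assumptions. `SinkSketch`, `CachingSinkSketch` and `refresh` are hypothetical, and `Iterator<String>` stands in for `TupleIterator`; the actual `IDatasink.execute` signature has other parameters that are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sink interface; only the Supplier-shaped input mirrors the description above. */
interface SinkSketch {
    List<String> execute(Supplier<Iterator<String>> input);
}

/** Hypothetical caching sink: runs the upstream plan only when it needs rows. */
final class CachingSinkSketch implements SinkSketch {
    private List<String> cached;

    @Override
    public List<String> execute(Supplier<Iterator<String>> input) {
        if (cached != null) {
            return cached; // cache hit: input.get() is never called
        }
        return refresh(input);
    }

    /** Re-executes upstream and replaces the cached rows. */
    List<String> refresh(Supplier<Iterator<String>> input) {
        // Copy the rows so the cache entry holds no reference to the
        // iterator or to whatever the iterator captured.
        List<String> rows = new ArrayList<>();
        Iterator<String> it = input.get(); // fresh iterator on every call
        while (it.hasNext()) {
            rows.add(it.next());
        }
        cached = rows;
        return rows;
    }

    public static void main(String[] args) {
        int[] runs = {0};
        Supplier<Iterator<String>> upstream = () -> {
            runs[0]++;
            return List.of("a", "b").iterator();
        };
        CachingSinkSketch sink = new CachingSinkSketch();
        sink.execute(upstream);
        sink.execute(upstream); // served from cache
        sink.refresh(upstream); // forces a second upstream run
        System.out.println(runs[0]); // prints 2
    }
}
```

With a plain iterator parameter the sink could consume its input only once, so a later refresh would have nothing to re-run. Copying rows into a fresh list is also a simplified analogue of materialising through a builder rather than caching a vector that references the execution context.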

kuseman force-pushed the performance_fixes branch 4 times, most recently from d92cd8b to c7f8a44 on June 3, 2026 10:55

kuseman force-pushed the performance_fixes branch from c7f8a44 to a92db8c on June 3, 2026 11:23

kuseman merged commit e30362b into master Jun 3, 2026
1 check passed

kuseman deleted the performance_fixes branch June 3, 2026 11:38

kuseman merged 1 commit into master from performance_fixes
