optimize(parquet): Nested list batching child.write calls #10085
mapleFU wants to merge 5 commits into
| /// counting child-element starts to find and stamp slot boundaries. | ||
| /// | ||
| /// Scan backward because we don't know start offset before writing. | ||
| fn write_list_scan<O: OffsetSizeTrait>( |
I originally wanted to remove the separate branch for writing non-nested children. However, benchmarks show that adding one more branch to the shared path hurts performance, so I split the logic into two functions.
| for rep in rep_levels.iter_mut().rev() { | ||
| // This can use `==`, since list write is recursive and the child is written | ||
| // before the parent. | ||
| if *rep <= ctx.rep_level { |
I kept <= because benchmarks show no performance difference between <= and ==.
| // before the parent. | ||
| if *rep <= ctx.rep_level { | ||
| seen += 1; | ||
| if seen == next_stamp_at { |
Maybe we can use SIMD here in the future; it is not critical for this branch. Alternatively, we could batch the check when the list offsets are large, e.g. checking a batch of levels at a time instead of one by one.
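For readers following the thread, the backward scan discussed above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function name `stamp_slot_starts`, the `slot_lens` parameter, and the plain `i16` buffer are assumptions made for the example, and it assumes every slot in the batch is non-empty (null and empty slots are taken to be handled elsewhere).

```rust
/// Hypothetical helper (not the PR's actual function).
///
/// After one batched child write covering several list slots, walk the
/// repetition levels backward, count child-element starts, and stamp the
/// first element of each slot with `rep_level - 1`.
///
/// Assumption: every entry in `slot_lens` is > 0 (non-empty slots only).
fn stamp_slot_starts(rep_levels: &mut [i16], slot_lens: &[usize], rep_level: i16) {
    // Slots are consumed last-to-first because the scan runs backward.
    let mut slots = slot_lens.iter().rev();
    let mut next_stamp_at = match slots.next() {
        Some(&len) => len,
        None => return,
    };
    let mut seen = 0usize;
    for rep in rep_levels.iter_mut().rev() {
        // Levels above `rep_level` belong to values nested deeper inside a
        // single child element, so they do not start a new child element.
        if *rep <= rep_level {
            seen += 1;
            if seen == next_stamp_at {
                // This is the first child element of the current slot.
                *rep = rep_level - 1;
                match slots.next() {
                    Some(&len) => next_stamp_at += len,
                    // All slots stamped; earlier levels belong to prior writes.
                    None => break,
                }
            }
        }
    }
}
```

For the value [[[1, 2], [3]], [[4]]], the inner list would leave levels [1, 2, 1, 1]; calling the helper with slot lengths [2, 1] and rep_level 1 turns them into [0, 2, 1, 0]. The early `break` is what keeps levels from earlier writes untouched.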
Pull request overview
This PR optimizes Parquet level generation for nested Arrow List types by batching child.write(...) calls and applying repetition-level backfilling in larger chunks, reducing recursive write-call overhead for deeply nested list structures.
Changes:
- Split list writing into two specialized hot paths: an offsets-based “direct” backfill for last-level list children and a backward-scan backfill for nested repetition cases
- Extracted a shared run-classification loop (write_list_impl) to batch null/empty/non-empty runs while keeping monomorphized backfill strategies
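To make the run-classification idea concrete, here is a small sketch of grouping consecutive list slots into null, empty, and non-empty runs. It is an illustration under assumptions, not the code in write_list_impl: the `SlotKind` enum, the `classify_runs` name, the `i32` offsets, and the `bool` validity slice are all invented for the example.

```rust
use std::ops::Range;

/// Hypothetical classification of a list slot (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SlotKind {
    Null,
    Empty,
    NonEmpty,
}

/// Group consecutive slots of the same kind into runs of slot indices.
/// `offsets` has one more entry than there are slots; `validity`, when
/// present, has one flag per slot (true = valid).
fn classify_runs(offsets: &[i32], validity: Option<&[bool]>) -> Vec<(SlotKind, Range<usize>)> {
    let mut runs: Vec<(SlotKind, Range<usize>)> = Vec::new();
    for (i, w) in offsets.windows(2).enumerate() {
        let is_valid = validity.map_or(true, |v| v[i]);
        let kind = if !is_valid {
            SlotKind::Null
        } else if w[0] == w[1] {
            SlotKind::Empty
        } else {
            SlotKind::NonEmpty
        };
        // Extend the current run when the kind repeats, else start a new one.
        if let Some((last_kind, range)) = runs.last_mut() {
            if *last_kind == kind {
                range.end = i + 1;
                continue;
            }
        }
        runs.push((kind, i..i + 1));
    }
    runs
}
```

With runs in hand, a writer could presumably issue a single child write for each non-empty run instead of one per slot, which is the call-count reduction the overview describes.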
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Which issue does this PR close?
Rationale for this change
Reduce the number of recursive child.write calls made when writing nested lists.
What changes are included in this PR?
Split the list write function into two paths: direct backfill by offsets, and backfill by backward scan.
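As a rough illustration of the direct-by-offsets path, the sketch below assumes the child is a leaf, so each child element contributes exactly one repetition level and every slot start can be computed from the offsets without scanning. The function name `stamp_by_offsets`, the `base` parameter (the index in the buffer where this batch's levels begin), and the `i32` offsets are assumptions for the example, and it only covers a run of non-empty slots.

```rust
/// Hypothetical helper (not the PR's actual function).
///
/// Assumptions: the child is a leaf (one level per child element), the
/// batch's levels start at `rep_levels[base]`, and every slot described
/// by `offsets` is non-empty.
fn stamp_by_offsets(rep_levels: &mut [i16], base: usize, offsets: &[i32], rep_level: i16) {
    let first = match offsets.first() {
        Some(&o) => o,
        None => return,
    };
    for w in offsets.windows(2) {
        debug_assert!(w[1] > w[0], "sketch only handles non-empty slots");
        // The slot's first child element sits at its offset relative to
        // the start of the batch.
        let start = base + (w[0] - first) as usize;
        rep_levels[start] = rep_level - 1;
    }
}
```

For [[1, 2], [3]] with offsets [0, 2, 3], the child would write levels [1, 1, 1], and stamping with base 0 and rep_level 1 gives [0, 1, 0]. The backward scan is still needed for nested children, where one child element can span several levels.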
Are these changes tested?
Covered by existing tests.
Are there any user-facing changes?
No