Make the loop iterating over multiple parquet files parallel #4952

Open
e-kayrakli wants to merge 2 commits into Bears-R-Us:main from e-kayrakli:parquet-parallel-multifile

Conversation

@e-kayrakli
Contributor

While working on #4906, I converted a parallel loop into a serial one to make debugging easier, but then forgot to make it parallel again. This showed up as a performance regression in a use case. This PR restores the forall loop in place of the for loop. Note that a similar piece of code, used for column-by-column reads, already uses a parallel loop.

I spot-checked the correctness.
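
For readers unfamiliar with the pattern, here is a rough Python analogy of the serial versus parallel per-file loop. It is not the Chapel code changed in this PR: `read_all`, `read_file_into`, and the in-memory lists standing in for Parquet files are all hypothetical. It also assumes each file's destination offset is computed before the loop, so that every iteration writes to a disjoint range and the iterations can run in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def read_file_into(dest, offset, values):
    """Copy one file's values into its own slice of the shared output."""
    # Same-length slice assignment: the output list never changes size.
    dest[offset:offset + len(values)] = values


def read_all(files, parallel=True):
    """Concatenate per-file data (hypothetical stand-in for a multi-file read)."""
    # Compute each file's starting offset up front (assumption of this sketch).
    offsets = []
    total = 0
    for values in files:
        offsets.append(total)
        total += len(values)

    dest = [None] * total
    if parallel:
        # Parallel variant: one task per file, each writing a disjoint range.
        with ThreadPoolExecutor() as pool:
            # list() waits for all tasks and re-raises any worker exception.
            list(pool.map(lambda pair: read_file_into(dest, *pair),
                          zip(offsets, files)))
    else:
        # Serial variant: same work, one file at a time.
        for offset, values in zip(offsets, files):
            read_file_into(dest, offset, values)
    return dest
```

Both variants should produce the same output; only the scheduling of the per-file work differs, which is why switching between them is expected to affect performance rather than results.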

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>
@drculhane
Contributor

This looks like it should be a no-brainer, but when I pulled it and ran tests, the test_poisson_seed_reproducibility test (which does use parquet) failed.

@ajpotts ajpotts enabled auto-merge February 9, 2026 17:04

3 participants