@aditanase
Collaborator

No description provided.

let mut range_start = 0;
while range_start < source_file.object_meta.size {
    // Skip splitting files smaller than repartition_file_min_size
    // This may result in a number of partitions slightly smaller than requested
Collaborator

Unless I'm misreading the code, I think this may result in some partitions being bigger than requested, not in fewer partitions. state.1 is increased unconditionally, and only then do we check whether it has exceeded the requested size.
So some partitions can end up at almost double the requested size. E.g. with a 100MB requested size, if state.1 is 99MB and the next file is 99MB, the partition will be 198MB.
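
To make the 99MB + 99MB case concrete, here is a minimal sketch of an add-then-check loop. It is not the code from this PR: `pack_add_then_check`, its parameters, and the use of a plain list of sizes are all hypothetical, and `current` only stands in for state.1.

```rust
/// Hypothetical stand-in for the loop under review (not the PR's code).
/// Each file is added to the current partition unconditionally; the size
/// check happens only after the addition.
fn pack_add_then_check(file_sizes: &[u64], target: u64) -> Vec<u64> {
    let mut partitions = Vec::new();
    let mut current = 0u64; // plays the role of state.1
    for &size in file_sizes {
        current += size; // increased unconditionally
        if current > target {
            // The partition is closed only once it is already over target.
            partitions.push(current);
            current = 0;
        }
    }
    // Flush whatever is left as a final, possibly undersized, partition.
    if current > 0 {
        partitions.push(current);
    }
    partitions
}
```

With sizes in MB, `pack_add_then_check(&[99, 99], 100)` yields a single 198MB partition, which is the overshoot described above.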

Unless this optimisation is a game changer, I would rather keep the current behaviour, which I think is quite sane especially for large files and large limits.

@aditanase aditanase merged commit 20f43e4 into main Mar 13, 2025
34 of 49 checks passed
@adragomir adragomir deleted the bug-repartition branch March 17, 2025 10:04

3 participants