Fix Arrow validity buffer lifetime in PyQuery #2294

Merged
kounelisagis merged 4 commits into main from agis/fix-arrow-val-buffer-lifetime
Feb 4, 2026

@kounelisagis
Member

This PR fixes a bug where BufferHolder was created before the validity array was converted to Arrow’s bitmap format. As a result, BufferHolder kept a reference to the original validity array, while the Arrow array's buffer pointer referenced the converted bitmap instead. When PyQuery was deleted, the bitmap memory was freed, leaving a dangling pointer and causing corrupted null values.

The bug was already present but mostly hidden: with pandas 2.x, to_pandas() typically accessed the buffer before the memory was reclaimed. In pandas 3.0, the conversion path changed and the validity buffer is accessed later, after the freed memory may already have been overwritten, which exposes the corruption.

The fix ensures the bitmap is converted first and only then creates the BufferHolder, so it retains a reference to the correct buffer.
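
For readers who want to see the ordering problem in isolation, here is a minimal Python sketch of it. It is not the code changed in this PR: `Buffer`, `BufferHolder`, `to_arrow_bitmap`, `export_buggy`, and `export_fixed` are hypothetical stand-ins, and the sketch assumes the original validity array stores one byte per value. A weak reference stands in for the raw buffer pointer held by the Arrow array, so a dead weak reference corresponds to the dangling pointer described above.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for a heap allocation (weakly referenceable)."""

    def __init__(self, data):
        self.data = bytearray(data)


class BufferHolder:
    """Hypothetical stand-in: keeps its buffers alive while the holder lives."""

    def __init__(self, *buffers):
        self.buffers = buffers


def to_arrow_bitmap(validity):
    """Pack one-byte-per-value validity into an Arrow-style bitmap.

    Arrow validity bitmaps use least-significant-bit numbering; 1 means valid.
    """
    bitmap = bytearray((len(validity.data) + 7) // 8)
    for i, value in enumerate(validity.data):
        if value:
            bitmap[i // 8] |= 1 << (i % 8)
    return Buffer(bitmap)


def export_buggy(validity):
    # Holder is created first, so it captures the byte-per-value buffer.
    holder = BufferHolder(validity)
    # The exported array would point at this converted bitmap instead.
    bitmap = to_arrow_bitmap(validity)
    # Nothing owns the bitmap once this function returns.
    return holder, weakref.ref(bitmap)


def export_fixed(validity):
    # Convert first, then let the holder own the buffer the array points at.
    bitmap = to_arrow_bitmap(validity)
    holder = BufferHolder(bitmap)
    return holder, weakref.ref(bitmap)
```

In the buggy ordering the weak reference is dead as soon as the function returns, even though the holder is still alive. In the fixed ordering the bitmap stays reachable for as long as the holder does.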

Ref CORE-487

@kounelisagis kounelisagis requested a review from ihnorton February 3, 2026 13:48
Member

@ypatia ypatia left a comment

LGTM

@kounelisagis kounelisagis merged commit ed8ba44 into main Feb 4, 2026
27 checks passed
@kounelisagis kounelisagis deleted the agis/fix-arrow-val-buffer-lifetime branch February 4, 2026 10:27
Member

@ihnorton ihnorton left a comment

LGTM, thanks

3 participants