Skip to content

Core: close entries iterable in ManifestFilterManager#15713

Open
kinolaev wants to merge 2 commits intoapache:mainfrom
kinolaev:fix-unclosed-entries-iterable
Open

Core: close entries iterable in ManifestFilterManager#15713
kinolaev wants to merge 2 commits intoapache:mainfrom
kinolaev:fix-unclosed-entries-iterable

Conversation

@kinolaev
Copy link
Copy Markdown

This PR prevents deadlocks while filtering manifest entries. The problem is ManifestFilterManager.filterManifest in some cases reads each manifest twice: first in manifestHasDeletedFiles and then in filterManifestWithDeletedFiles. If manifestHasDeletedFiles returns in the middle of entries iterable, the underlying connection is open until the ManifestReader is closed. filterManifest method is called for all manifests in parallel. When number of simultaneous connections is limited (for example by http-client.apache.max-connections) it can lead to a deadlock because all connections are held by manifestHasDeletedFiles.

The problem can be reproduced using spark-sql with S3FileIO and http-client.apache.max-connections=1:

create table manifestfiltermanager(id bigint)
  partitioned by (truncate(1, id))
  tblproperties('write.delete.mode'='merge-on-read');
-- create a data manifest with 100 entries
insert into manifestfiltermanager select id from range(100);
-- create a delete manifest with 50 entries
delete from manifestfiltermanager where id in (select id from range(0, 100, 2));
-- make delete files dangling (fails without https://github.com/apache/iceberg/pull/15712)
call system.rewrite_data_files('manifestfiltermanager', options => map('rewrite-all', 'true'));
-- ManifestFilterManager.manifestHasDeletedFiles reads first block of the delete manifest
-- and returns true before the end of liveEntries() iterable without closing it.
-- ManifestFilterManager.filterManifestWithDeletedFiles fails with ConnectionPoolTimeoutException
call system.rewrite_data_files('manifestfiltermanager', options => map('rewrite-all', 'true'));

Comment on lines 514 to 516
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the same problem. Perhaps fix this as well?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filterManifestWithDeletedFiles is called only in filterManifest, where reader is already wrapped with try-with-resources. So it doesn't cause any issue. But if you'd like I can wrap this liveEntries() call too, just in case.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, just in case, looks a little safer now)

Signed-off-by: Sergei Nikolaev <kinolaev@gmail.com>
@kinolaev kinolaev force-pushed the fix-unclosed-entries-iterable branch from fa8978c to 845e15d Compare March 25, 2026 16:05
@kinolaev kinolaev requested a review from anoopj March 25, 2026 16:09
Copy link
Copy Markdown
Contributor

@anoopj anoopj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me.

Optional: it would be nice to have a test that can repro this.

Signed-off-by: Sergei Nikolaev <kinolaev@gmail.com>
@kinolaev
Copy link
Copy Markdown
Author

Thanks for the review, @anoopj ! I've added a test in 828f98e.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants