feat(table): support delete file removal in overwrite commits #851

Merged
zeroshade merged 1 commit into apache:main from laskoviymishka:feat/delete-file-removal on Apr 7, 2026

@laskoviymishka
Contributor

Extend the overwrite snapshot producer to atomically remove delete files (position + equality) alongside data file replacement. This is the commit primitive needed for compaction (#832) to clean up fully-applied delete files.

  • Add deletedDeleteFiles tracking to snapshotProducer with removeDeleteFile()
  • existingManifests() cross-checks content type when filtering entries
  • deletedEntries() splits by content type into separate data/delete manifests with correct ManifestContent in both Avro metadata and manifest list
  • newManifestWriter accepts ManifestWriterOption for content type propagation
  • New Transaction.ReplaceFiles() validates both data and delete files exist in the table with content-type gating before commit
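
To illustrate the last two ideas in the list above (content-type gating during validation, and splitting removed entries into separate data and delete manifests), here is a minimal, self-contained sketch. It does not use the iceberg-go API: `fileContent`, `tableFile`, `validateRemoval`, and `splitByContent` are hypothetical names made up for this example, and the real `ReplaceFiles()` and `deletedEntries()` implementations may differ.

```go
package main

import "fmt"

// fileContent is a hypothetical stand-in for a manifest entry's content type.
type fileContent int

const (
	contentData fileContent = iota
	contentPosDeletes
	contentEqDeletes
)

// isDelete reports whether the content type is a position or equality delete.
func (c fileContent) isDelete() bool { return c != contentData }

// tableFile is a hypothetical, simplified view of a file tracked by a table.
type tableFile struct {
	path    string
	content fileContent
}

// validateRemoval checks that every requested path exists in the table and
// that its content type matches the list it was passed in (the "gating").
func validateRemoval(table map[string]fileContent, dataPaths, deletePaths []string) error {
	check := func(paths []string, wantDelete bool) error {
		for _, p := range paths {
			c, ok := table[p]
			if !ok {
				return fmt.Errorf("file not found in table: %s", p)
			}
			if c.isDelete() != wantDelete {
				return fmt.Errorf("content type mismatch for %s", p)
			}
		}
		return nil
	}
	if err := check(dataPaths, false); err != nil {
		return err
	}
	return check(deletePaths, true)
}

// splitByContent groups removed entries so that data files and delete files
// can be written to separate manifests, each with its own content type.
func splitByContent(removed []tableFile) (data, deletes []tableFile) {
	for _, f := range removed {
		if f.content.isDelete() {
			deletes = append(deletes, f)
		} else {
			data = append(data, f)
		}
	}
	return data, deletes
}

func main() {
	table := map[string]fileContent{
		"data-1.parquet": contentData,
		"pos-1.parquet":  contentPosDeletes,
		"eq-1.parquet":   contentEqDeletes,
	}

	// Passing a delete file in the data list is rejected before any commit.
	err := validateRemoval(table, []string{"pos-1.parquet"}, nil)
	fmt.Println("mismatch rejected:", err != nil)

	removed := []tableFile{
		{"data-1.parquet", contentData},
		{"pos-1.parquet", contentPosDeletes},
		{"eq-1.parquet", contentEqDeletes},
	}
	data, deletes := splitByContent(removed)
	fmt.Println("data entries:", len(data), "delete entries:", len(deletes))
}
```

Validating up front keeps the commit all-or-nothing: a wrong path or a data/delete mix-up fails before any manifest is written, which is the property a compaction job would rely on.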

@laskoviymishka laskoviymishka force-pushed the feat/delete-file-removal branch from efd83f1 to a790ad5 Compare April 4, 2026 17:53
@laskoviymishka laskoviymishka marked this pull request as ready for review April 4, 2026 18:03
Member

@zeroshade zeroshade left a comment

Generally looks good to me, but needs tests

@laskoviymishka laskoviymishka force-pushed the feat/delete-file-removal branch from a790ad5 to 8af15ae Compare April 7, 2026 20:53
@laskoviymishka laskoviymishka force-pushed the feat/delete-file-removal branch from 8af15ae to 72f1881 Compare April 7, 2026 21:01
@laskoviymishka laskoviymishka requested a review from zeroshade April 7, 2026 21:01
@zeroshade zeroshade merged commit 1ad5ff6 into apache:main Apr 7, 2026
12 of 13 checks passed