fix: use OpReplace instead of OpOverwrite in ReplaceDataFiles and ReplaceFiles #867

Open
Bahtya wants to merge 3 commits into apache:main from Bahtya:main

@Bahtya Bahtya commented Apr 9, 2026

Summary

Fixes #841 (parent #832)

Changes ReplaceDataFiles, ReplaceDataFilesWithDataFiles, and ReplaceFiles to use OpReplace instead of OpOverwrite when creating snapshot updates.

Problem

Per the Iceberg spec, REPLACE is the correct operation when data content is equivalent but reorganized into different files (e.g., compaction). The three replace methods were unconditionally using OpOverwrite despite a TODO comment acknowledging this was incorrect.
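
To make the distinction concrete, here is a minimal Go sketch of how a downstream consumer might treat the two operations. The `Operation` type and its constants are local stand-ins whose string values follow the `operation` field of the spec's snapshot summary; they are not imported from iceberg-go. `changesTableData` is a hypothetical helper written only for illustration.

```go
package main

// Operation is a local stand-in for the snapshot summary "operation" value.
// It is defined here for illustration and is not iceberg-go's own type.
type Operation string

const (
	OpAppend    Operation = "append"    // only data files were added
	OpReplace   Operation = "replace"   // files were rewritten, table data unchanged
	OpOverwrite Operation = "overwrite" // files were added and removed in a logical overwrite
	OpDelete    Operation = "delete"    // data was removed
)

// changesTableData is a hypothetical helper: it reports whether a snapshot
// with the given operation may change the logical contents of the table.
// Under the spec's wording, a replace snapshot does not.
func changesTableData(op Operation) bool {
	return op != OpReplace
}
```

Under these assumptions, a consumer that only cares about row-level changes could skip a compaction snapshot when it is labeled `replace`, but would have to inspect the same snapshot if it were labeled `overwrite`.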

Changes

  • table/transaction.go: Changed OpOverwrite → OpReplace in three locations:
    • ReplaceDataFiles (line ~418)
    • ReplaceDataFilesWithDataFiles (line ~713)
    • ReplaceFiles (line ~826)
  • Removed the TODO comment at ReplaceDataFiles that acknowledged the incorrect operation type
  • table/replace_files_test.go: Updated TestReplaceFiles_DataAndDeleteFiles to assert OpReplace

Testing

All existing tests pass:

=== RUN   TestReplaceFiles_DataAndDeleteFiles
--- PASS
=== RUN   TestReplaceFiles_DelegatesToReplaceDataFilesWhenNoDeleteFiles
--- PASS
=== RUN   TestReplaceFiles_ValidationErrors
--- PASS

…laceFiles

Per the Iceberg spec, REPLACE is the correct operation when data is
reorganized (e.g., compaction) without changing content. ReplaceDataFiles,
ReplaceDataFilesWithDataFiles, and ReplaceFiles all reorganize data files,
so they should use OpReplace rather than OpOverwrite.

Removes the TODO comment that acknowledged this was incorrect.

Fixes apache#841
@Bahtya Bahtya requested a review from zeroshade as a code owner April 9, 2026 18:02
Bahtya and others added 2 commits April 10, 2026 02:15
Remove extra blank line between doc comment and function declaration.
…ests

Update TestReplaceDataFiles and TestReplaceDataFilesWithDataFiles
to expect OpReplace instead of OpOverwrite in snapshot summaries,
matching the production code change.
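
The shape of such an assertion might look like the sketch below. It is self-contained and does not reproduce the PR's test code: `Summary`, `Operation`, and `checkReplaceSummary` are hypothetical stand-ins, and the real tests in the repository may use a different assertion style.

```go
package main

import "fmt"

// Operation and Summary are illustrative stand-ins, not iceberg-go types.
type Operation string

const (
	OpReplace   Operation = "replace"
	OpOverwrite Operation = "overwrite"
)

// Summary models only the field this check needs.
type Summary struct {
	Operation Operation
}

// checkReplaceSummary returns an error unless the snapshot summary
// records the replace operation.
func checkReplaceSummary(s Summary) error {
	if s.Operation != OpReplace {
		return fmt.Errorf("expected operation %q, got %q", OpReplace, s.Operation)
	}
	return nil
}
```

A summary carrying `OpOverwrite` fails this check, which is the behavior the earlier assertions would have encoded before being updated.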

@laskoviymishka laskoviymishka left a comment

Thanks for the contribution!

Looks good to me.

Bahtya commented Apr 11, 2026

Hi team, just wanted to follow up on this PR. It has been reviewed and approved by @laskoviymishka, and all CI checks are passing. I would appreciate it if a maintainer could take a look and merge when ready. Thank you!

Development

Successfully merging this pull request may close these issues.

fix: ReplaceDataFiles should use OpReplace instead of OpOverwrite

2 participants