Skip to content

Spark: test cleanup - eliminate unnecessary table refreshes#15765

Merged
huaxingao merged 1 commit intoapache:mainfrom
wypoon:table_refresh_test_cleanup
Mar 26, 2026
Merged

Spark: test cleanup - eliminate unnecessary table refreshes#15765
huaxingao merged 1 commit intoapache:mainfrom
wypoon:table_refresh_test_cleanup

Conversation

@wypoon
Copy link
Copy Markdown
Contributor

@wypoon wypoon commented Mar 25, 2026

TestRewriteDataFilesAction has helper createTable and createTablePartitioned methods that create a Table instance, write to the table, and return the Table instance. However, as the write is done using the DataFrame API and not through the Table, the Table instance that is returned is stale. This is bad design and a trap for unwary new users of the methods who want to add tests. The trap is made worse by the verification helper methods shouldHaveFiles and shouldHaveSnapshots (which take the Table as an argument) doing a Table::refresh before performing its work, so if one looks at some existing test method that calls createTable followed by shouldHaveFiles as a model, one does not see the hidden call to Table::refresh.
The Table should be refreshed before it is returned, so it has all the changes. The side-effecting Table::refresh calls should be removed from verification helper methods. Once this is done, some new Table::refresh calls need to be added. However, I found many unnecessary Table::refresh calls (probably added defensively due to being once bitten, twice shy) and have removed them.
I also found and removed unnecessary Table::refresh calls in TestRewritePositionDeleteFilesAction.

@github-actions github-actions bot added the spark label Mar 25, 2026
Copy link
Copy Markdown
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @wypoon !

seems like we have been doing this for a while now : #4502

@huaxingao huaxingao merged commit f575d97 into apache:main Mar 26, 2026
24 checks passed
@huaxingao
Copy link
Copy Markdown
Contributor

Thanks @wypoon for the PR! Thanks @ebyhr @singhpk234 for the review!

@wypoon
Copy link
Copy Markdown
Contributor Author

wypoon commented Mar 27, 2026

Thanks @huaxingao @ebyhr and @singhpk234 for reviewing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants