Spark: fix delete from branch for canDeleteWhere where it does not resolve to the correct branch #15512

Open
yingjianwu98 wants to merge 4 commits into apache:main from yingjianwu98:yingjianw/fix_branch_delete_from

@yingjianwu98 yingjianwu98 commented Mar 4, 2026


Problem

When WAP (Write-Audit-Publish) is enabled via spark.wap.branch, canDeleteWhere() and deleteWhere() operate on different branches:

  • canDeleteWhere() scans using this.branch (null → main) because the WAP branch is only a session config, not part of the table identifier
  • deleteWhere() resolves the WAP branch before committing

This causes canDeleteWhere() to incorrectly return true (metadata-only delete is possible) based on main's data, while deleteWhere() commits to the WAP branch, where the file has partial matches, resulting in:

ValidationException: Cannot delete file where some, but not all, rows match filter

Example

-- WAP enabled, spark.wap.branch = dev1
INSERT INTO t VALUES (1, 'a'), (2, 'b'), (3, 'c'); -- goes to dev1, main is empty
DELETE FROM t WHERE id = 1;
-- canDeleteWhere scans main (empty) → true → metadata delete
-- deleteWhere commits to dev1 → partial match → ValidationException
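
The mismatch in the example above can be reproduced with a toy model. This is a sketch only: the class WapDeleteMismatch, its canDeleteWhere helper, and the representation of a data file as a list of id values are all hypothetical and are not Iceberg's implementation. It assumes a metadata-only delete is possible only when the filter matches each file either completely or not at all, so a branch with no files passes the check trivially.

```java
import java.util.List;
import java.util.function.IntPredicate;

/** Toy model of the branch mismatch; hypothetical names, not Iceberg code. */
public class WapDeleteMismatch {

  // A "data file" is modeled as the list of id values it holds.
  // The check passes only if every file is matched fully or not at all.
  public static boolean canDeleteWhere(List<List<Integer>> files, IntPredicate filter) {
    for (List<Integer> file : files) {
      long matches = file.stream().mapToInt(Integer::intValue).filter(filter).count();
      if (matches != 0 && matches != file.size()) {
        return false; // partial match: the file would have to be rewritten
      }
    }
    return true; // also true when the branch has no files at all
  }

  public static void main(String[] args) {
    List<List<Integer>> main = List.of();                  // main is empty
    List<List<Integer>> dev1 = List.of(List.of(1, 2, 3));  // one file on the WAP branch
    IntPredicate idEquals1 = id -> id == 1;

    System.out.println("checked against main: " + canDeleteWhere(main, idEquals1)); // true
    System.out.println("checked against dev1: " + canDeleteWhere(dev1, idEquals1)); // false
  }
}
```

Checking against main says a metadata-only delete is fine, while the branch that actually receives the commit holds a file with a partial match, which is the situation that surfaces as the ValidationException.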

Fix

  • canDeleteWhere() now resolves the WAP branch via determineReadBranch before scanning. This uses determineReadBranch (not determineWriteBranch)
    because the scan is a read operation, and determineReadBranch correctly handles the case where the WAP branch doesn't exist yet by falling back to
    main.
  • Both canDeleteWhere() and deleteWhere() now use local variables for the resolved branch instead of mutating this.branch, avoiding side effects on
    reads and other operations that share the field.
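
The two points above can be sketched roughly as follows. This is an illustrative model, not the code in this PR: BranchResolutionSketch, resolveReadBranch, and scanBranchForDelete are hypothetical names, and the precedence shown (identifier branch first, then the WAP branch if it exists, then main) is an assumption about how read-side resolution behaves.

```java
import java.util.Set;

/** Toy model only; class, field, and method names are hypothetical, not Iceberg's API. */
public class BranchResolutionSketch {
  private final String branch;                 // branch from the table identifier, or null
  private final String wapBranch;              // session-level WAP branch, or null
  private final Set<String> existingBranches;  // branches that currently exist on the table

  public BranchResolutionSketch(String branch, String wapBranch, Set<String> existingBranches) {
    this.branch = branch;
    this.wapBranch = wapBranch;
    this.existingBranches = existingBranches;
  }

  /** Assumed read-side precedence: identifier branch, then an existing WAP branch, then main. */
  public String resolveReadBranch() {
    if (branch != null) {
      return branch;
    }
    if (wapBranch != null && existingBranches.contains(wapBranch)) {
      return wapBranch;
    }
    return "main"; // WAP branch unset or not created yet
  }

  /** The resolved branch lives in a local variable; the field is never reassigned. */
  public String scanBranchForDelete() {
    String scanBranch = resolveReadBranch();
    return scanBranch;
  }

  public String branchField() {
    return branch;
  }

  public static void main(String[] args) {
    BranchResolutionSketch table =
        new BranchResolutionSketch(null, "dev1", Set.of("main", "dev1"));
    System.out.println(table.scanBranchForDelete()); // dev1
    System.out.println(table.branchField());         // null

    BranchResolutionSketch fresh = new BranchResolutionSketch(null, "dev1", Set.of("main"));
    System.out.println(fresh.scanBranchForDelete()); // main
  }
}
```

Keeping the resolved value local means a later read through the same object still sees the original field, which is the side effect the second bullet is trying to avoid.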

Will work on the backport to other Spark versions once there is consensus from the community.

@github-actions github-actions bot added the spark label Mar 4, 2026
@yingjianwu98 yingjianwu98 changed the title fix branch delete for canDeleteWhere where it does not resolve to the correct branch Spark: fix delete from branch for canDeleteWhere where it does not resolve to the correct branch Mar 4, 2026

rdblue commented Apr 3, 2026

From Netflix discussion, this sounds like a correctness bug and release blocker for 1.11.


spark.conf().set(SparkSQLProperties.WAP_BRANCH, "dev1");
try {
  // all rows go into one file on the WAP branch; main stays empty

Can we also insert some rows/files into the main branch first? Ideally including a row that matches the predicate id = 1.

// resolve the WAP branch so they scan and commit to the same branch
sql("DELETE FROM %s WHERE id = 1", tableName);

assertEquals(

Use AssertJ instead:

assertThat(sql("SELECT * FROM %s VERSION AS OF 'dev1' ORDER BY id", tableName))
    .containsExactlyInAnyOrder(...)

@stevenzwu

@yingjianwu98 can you resolve the merge conflict?
