[GLUTEN-10511][VL][Delta] Fix wrong result with partition filters under column mapping#12240
Conversation
…er column mapping When `delta.columnMapping.mode` is `name` or `id`, a query with a partition column filter could return all rows instead of the pruned set. The same mechanism also disabled file-level data skipping. Root cause is in `DeltaPostTransformRules.transformColumnMappingPlan`. The rule rewrote `partitionFilters`, `dataFilters`, `partitionSchema`, and `requiredSchema` from logical to physical column names so the parquet reader sees physical names. But Delta's `PreparedDeltaFileIndex.matchingFiles` / `Snapshot.filesForScan` and `DeltaLog.rewritePartitionFilters` resolve filter attributes against the *logical* `metadata.partitionSchema` and the logical column-stats schema. Once attributes were physical-named: - `rewritePartitionFilters` could not match the physical attribute against the logical partition schema, fell into the `case None` branch, and emitted a bare `UnresolvedAttribute` without a `Cast` — partition pruning silently no-op'd, all files were returned. - File-level stats skipping silently missed all files (perf regression that masked the same root cause). Fix: keep filter attrs and partition schema LOGICAL on the scan node so Delta's file index resolves them correctly. Reader-facing pieces (`output`, `dataSchema`, the data fields of `requiredSchema`) stay PHYSICAL so the parquet reader and Velox find the right columns. Filter binding to the native side is by exprId, not by name, so logical-named filter attrs still resolve correctly against the physical-named `output`. `DeltaScanTransformer.scanFilters` is overridden to translate the logical `dataFilters` to their physical-named counterparts (by exprId match against `output`) for the native side, since `BasicScanExecTransformer.filterExprs()` matches `scanFilters` against `pushDownFilters` (built over the physical-named scan output) by `AttributeReference.equals`, which checks names. Tests added in `DeltaSuite` cover both `name` and `id` modes for: equality / range / IN partition filters, multi-partition-column filters, combined partition+data filters, IS [NOT] NULL on partition columns, and column-rename scenarios for both partition and data columns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Run Gluten Clickhouse CI on x86 |
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Run Gluten Clickhouse CI on x86 |
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Run Gluten Clickhouse CI on x86 |
…counts Assert both `df.inputFiles.length` (Delta-level pruning) and `DeltaScanTransformer.getPartitions.size` (post-Gluten-rewrite split count) on every test, so a regression that disables either layer is caught. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Run Gluten Clickhouse CI on x86 |
…ssertion `getPartitions` reflects post-coalesce splits (Velox merges small per-partition files into one input split), which made the assertion flaky for tables with tiny files. Switch to `getPartitionArray` -- the pre-coalesce list of partition directories selected by the executed scan -- which is the actual call site exercised by the bug fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Run Gluten Clickhouse CI on x86 |
…eanup Expand the docstring of `transformColumnMappingPlan` to explain why some scan node fields stay logical and others become physical, and note the longer-term cleanup direction (defer all physical translation to substrait emission, which would remove both the alias-back project and the scanFilters override). Mirror the same context in the `DeltaScanTransformer.scanFilters` override so each site reads independently. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Run Gluten Clickhouse CI on x86 |
|
@zhztheplayer @FelixYBW could you take a look when you get a chance? Quick context: this fixes the wrong-result bug in #10511 by narrowing the column-mapping rewrite — only the reader-facing fields ( The result is asymmetric on the scan node, which I know is a bit ugly. The reason is that vanilla Spark + Delta does the logical→physical translation just-in-time inside Verified locally end-to-end with the prebuilt CI artifacts in |
|
@zhztheplayer @FelixYBW can you prioritize the PR as it's a silent wrong result issue? Thanks |
|
Run Gluten Clickhouse CI on x86 |
What changes are proposed in this pull request?
Fixes #10511.
When
delta.columnMapping.modeisnameorid, a query with a partition column filter could return all rows instead of the pruned set. The same mechanism also disabled file-level data skipping under column mapping (a silent perf regression that masked the same root cause).Reproducer (from the issue)
Root cause
DeltaPostTransformRules.transformColumnMappingPlanrewrotepartitionFilters,dataFilters,partitionSchema, andrequiredSchemafrom logical → physical column names so the parquet reader sees physical names. But Delta'sPreparedDeltaFileIndex.matchingFiles/Snapshot.filesForScanandDeltaLog.rewritePartitionFiltersresolve filter attributes against the logicalmetadata.partitionSchemaand the logical column-stats schema. Once filter attributes were physical-named:rewritePartitionFilterscould not match the physical attribute against the logical partition schema, fell into thecase Nonebranch, and emitted a bareUnresolvedAttributewithout aCast— partition pruning silently no-op'd, all files were returned.Vanilla Spark + Delta resolves this asymmetry inside
DeltaParquetFileFormat.buildReaderWithPartitionValues(which translates filters/schema to physical only when handing to the parquet reader). Gluten bypasses that hook, so the rule rewrote everything uniformly — too aggressively.Fix
The scan node now has a deliberate logical/physical split:
output,dataSchema, data fields ofrequiredSchemaNamedStructVelox uses to find columns in parquet.partitionSchema,partitionFilters,dataFilters, partition fields ofrequiredSchemaPreparedDeltaFileIndex/Snapshot.filesForScanfor partition pruning + stats-based file skipping. Delta resolves these against logical schemas.DeltaScanTransformer.scanFilters(override)dataFiltersby exprId match againstoutputBasicScanExecTransformer.filterExprs()matchesscanFiltersagainst the physical-namedpushDownFiltersfrom the upstreamFilterbyAttributeReference.equals(which compares names).The fix is Gluten-only — no Delta-side changes.
Why the design looks asymmetric
Vanilla Spark + Delta keeps everything on the scan node logical and translates physical-only inside
DeltaParquetFileFormat.buildReaderWithPartitionValues. Gluten bypasses that hook (it goes to native via Substrait), so the translation has to live somewhere on our side. The rule today does this translation eagerly on the scan node for the reader-facing fields, which is the smallest delta from the pre-fix shape and explains why some fields are physical and others logical.A cleaner shape — possible follow-up — is to keep ALL scan-node fields logical and translate only at substrait emission time (inside the
NamedStruct/ReadRelbuild inBasicScanExecTransformer.doTransform). That removes both the alias-backProjectExecTransformerand thescanFiltersoverride, but it requires plumbing Delta-specific physical-name lookup into the substrait emitter. Out of scope for this bug fix; noted in the docstring oftransformColumnMappingPlan.How was this patch tested?
DeltaSuiteis extended with new tests, all parameterized over bothnameandidcolumn-mapping modes:column mapping mode = $mode with partition filter (single partition col)— equality, range, andINpredicates on a partition column. Assertsdf.inputFiles.length(Delta-level pruning) andDeltaScanTransformer.getPartitionArray.length(post-Gluten-rewrite selected partition count) both reflect the pruned set. Includes the exact case from the bug report.column mapping mode = $mode with partition filter (multi partition col)— predicate spanning two partition columns; predicate on only one of two.column mapping mode = $mode with partition + data filter— partition pruning + data column predicate together; data-only predicate exercises file-level stats skipping (asserts the file count actually shrunk).column mapping mode = $mode with IS [NOT] NULL on partition col— null-partition handling.column mapping mode = $mode partition filter survives column rename—ALTER TABLE ... RENAME COLUMNon the partition column.column mapping mode = $mode data column rename + filter (file skipping)—ALTER TABLE ... RENAME COLUMNon a data column; ensures filter pushdown still resolves to the physical column in parquet AND Delta's stats-based skipping resolves the new logical name.The existing tests (
column mapping mode = id,column mapping mode = name,column mapping with complex type) continue to pass.Verified locally end-to-end: pulled
apache/gluten:centos-8-jdk8, downloaded the prebuilt CI artifacts (velox-native-lib,arrow-jars), ranmvn test -pl backends-velox -DwildcardSuites=org.apache.gluten.execution.VeloxDeltaSuite. 30/30 tests pass, 0 failures, 0 errors, including all 12 new tests.Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.7)