
[iceberg] Query plan doesn't match when Comet scan is enabled #2120

@hsiang-c


Describe the bug

TestFilterPushDown > testFilterPushdownWithBucketTransform() > catalogName = testhadoop, implementation = org.apache.iceberg.spark.SparkCatalog, config = {type=hadoop, cache-enabled=false}, planningMode = testhadoop FAILED
    java.lang.AssertionError: [Post scan filter should match] 
    Expecting actual:
      "*(1) ColumnarToRow
    +- AQEShuffleRead local
       +- ShuffleQueryStage 0
          +- CometColumnarExchange rangepartitioning(id ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, CometColumnarShuffle, [plan_id=469743]
             +- CometFilter [id, salary, dep], (id = 1)
                +- CometBatchScan testhadoop.default.table[id, salary, dep] testhadoop.default.table (branch=null) [filters=dep IS NOT NULL, id IS NOT NULL, dep = 'd1', id = 1, groupedBy=] RuntimeFilters: []
    "
    to contain:
      "Filter (id = 1)" 
        at org.apache.iceberg.spark.sql.TestFilterPushDown.checkFilters(TestFilterPushDown.java:603)
        at org.apache.iceberg.spark.sql.TestFilterPushDown.testFilterPushdownWithBucketTransform(TestFilterPushDown.java:391)
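The mismatch in the output above can be reproduced with plain string matching. The sketch below assumes that `checkFilters` performs a substring check on the plan text, which is what the "to contain" wording of the assertion message suggests; the body of `checkFilters` is not shown in this report. The class and method names (`PostScanFilterCheck`, `containsExpected`) are hypothetical and exist only for illustration.

```java
public class PostScanFilterCheck {
    // Plan line copied from the failure output in this report.
    static final String COMET_FILTER_LINE =
        "+- CometFilter [id, salary, dep], (id = 1)";

    // Substring the assertion expects to find in the plan.
    static final String EXPECTED = "Filter (id = 1)";

    // Assumption: the test does a plain substring match on the plan text.
    static boolean containsExpected(String planText) {
        return planText.contains(EXPECTED);
    }

    public static void main(String[] args) {
        // Comet prints the output columns between the operator name and the
        // condition, so "Filter (id = 1)" never appears as a contiguous string.
        System.out.println(containsExpected(COMET_FILTER_LINE));   // false

        // A line of the shape the assertion expects (illustrative only).
        System.out.println(containsExpected("+- Filter (id = 1)")); // true
    }
}
```

Under that assumption, the filter on `id = 1` is still applied after the scan; only its textual form in the plan differs from what the test looks for.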

Steps to reproduce

SparkSession configs used:

            .config("spark.plugins", "org.apache.spark.CometPlugin")
            .config("spark.shuffle.manager", "org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager")
            .config("spark.comet.explainFallback.enabled", "true")
            .config("spark.sql.iceberg.parquet.reader-type", "COMET")
            .config("spark.memory.offHeap.enabled", "true")
            .config("spark.memory.offHeap.size", "10g")
            .config("spark.comet.use.lazyMaterialization", "false")
            .config("spark.comet.schemaEvolution.enabled", "true")

Expected behavior

No response

Additional context

No response

Labels: bug
