[iceberg] Comet execution only takes Arrow Arrays, but got class org.apache.iceberg.spark.data.vectorized.ColumnVectorWithFilter #2117

@hsiang-c

Describe the bug

TestMergeOnReadUpdate > testUpdateRefreshesRelationCache() > catalogName = testhadoop, implementation = org.apache.iceberg.spark.SparkCatalog, config = {type=hadoop}, format = PARQUET, vectorized = true, distributionMode = hash, fanout = true, branch = null, planningMode = LOCAL, formatVersion = 2 FAILED
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 632.0 failed 1 times, most recent failure: Lost task 0.0 in stage 632.0 (TID 732) (localhost executor driver): org.apache.spark.SparkException: Comet execution only takes Arrow Arrays, but got class org.apache.iceberg.spark.data.vectorized.ColumnVectorWithFilter

Steps to reproduce

SparkSession configs used:

            .config("spark.plugins", "org.apache.spark.CometPlugin")
            .config("spark.shuffle.manager", "org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager")
            .config("spark.comet.explainFallback.enabled", "true")
            .config("spark.sql.iceberg.parquet.reader-type", "COMET")
            .config("spark.memory.offHeap.enabled", "true")
            .config("spark.memory.offHeap.size", "10g")
            .config("spark.comet.use.lazyMaterialization", "false")
            .config("spark.comet.schemaEvolution.enabled", "true")
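
For anyone trying to reproduce this outside the Iceberg test harness, the settings above can be kept in one place and reused. The following is only a minimal sketch: the class name `CometIcebergReproConf` and the helpers `settings()` and `toSubmitArgs()` are hypothetical names made up for illustration, not part of Comet, Iceberg, or Spark. It collects the eight settings in a map and renders them as `--conf key=value` arguments for `spark-submit`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from Comet or Iceberg): holds the settings from this report.
public class CometIcebergReproConf {

    // The eight settings listed in the report, in the same order.
    static Map<String, String> settings() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.plugins", "org.apache.spark.CometPlugin");
        conf.put("spark.shuffle.manager",
            "org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager");
        conf.put("spark.comet.explainFallback.enabled", "true");
        conf.put("spark.sql.iceberg.parquet.reader-type", "COMET");
        conf.put("spark.memory.offHeap.enabled", "true");
        conf.put("spark.memory.offHeap.size", "10g");
        conf.put("spark.comet.use.lazyMaterialization", "false");
        conf.put("spark.comet.schemaEvolution.enabled", "true");
        return conf;
    }

    // Renders each entry as two arguments: "--conf" followed by "key=value".
    static List<String> toSubmitArgs(Map<String, String> conf) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] argv) {
        // Prints the arguments on one line, ready to paste after "spark-submit".
        System.out.println(String.join(" ", toSubmitArgs(settings())));
        // With Spark on the classpath, the same map could instead be applied to a
        // SparkSession builder (assumption, not compiled here):
        //   settings().forEach((k, v) -> builder.config(k, v));
    }
}
```

The map keeps insertion order, so the generated arguments follow the order given in the report.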



Labels: bug
