Possible read-after-write consistency issue with multiple schema migration steps in Iceberg tables on AWS Glue + S3 #2599

@din14970

Apache Iceberg version

PyIceberg 0.10.0
pyiceberg-core 0.6.0

Please describe the bug 🐞

This may be a hard one to pin down, but I noticed that multiple schema migration steps executed sequentially in the same update_schema context sometimes raise exceptions, such as a column name not being found, when using Iceberg tables on AWS Glue. An example:

with table.update_schema() as update:
    update.rename_column("some_column", "renamed_column")
    update.move_first("renamed_column")  # this sometimes fails with an error
                                         # that renamed column doesn't exist

I have not noticed this with other back-ends such as SQLite, which leads me to believe it is a Glue-specific issue where a write may not yet be visible by the time the next operation runs.
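One way to narrow this down might be to commit each step in its own update_schema context, so that the move only runs after the rename has been committed. The sketch below is untested against Glue and is not a confirmed fix. The helper name `rename_then_move_first` is hypothetical, and it assumes only the `update_schema`, `rename_column`, and `move_first` calls shown above, with the context manager committing on exit.

```python
from typing import Any


def rename_then_move_first(table: Any, old_name: str, new_name: str) -> None:
    """Rename a column, then move it first, using one commit per step.

    `table` is assumed to expose `update_schema()` as in the example above.
    This helper is illustrative only and is not part of PyIceberg.
    """
    # Step 1: commit the rename on its own.
    with table.update_schema() as update:
        update.rename_column(old_name, new_name)

    # Step 2: open a fresh update, so the move is resolved against the
    # schema as it stands after the rename was committed (assumption).
    with table.update_schema() as update:
        update.move_first(new_name)
```

If the error disappears with two separate commits, the problem is more likely in how pending changes are resolved within a single update than in Glue's read-after-write behaviour. If it still appears intermittently, that would point back at the catalog.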

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
