Skip to content

feat: add support for pyiceberg 0.10.0#5277

Merged
desmondcheongzx merged 3 commits intoEventual-Inc:mainfrom
gweaverbiodev:gw/iceberg-0-10-0-support
Sep 25, 2025
Merged

feat: add support for pyiceberg 0.10.0#5277
desmondcheongzx merged 3 commits intoEventual-Inc:mainfrom
gweaverbiodev:gw/iceberg-0-10-0-support

Conversation

@gweaverbiodev
Copy link
Copy Markdown
Contributor

@gweaverbiodev gweaverbiodev commented Sep 24, 2025

Changes Made

Added changes to conditionally handle breaking changes in pyiceberg 0.10.0 for how to initialize DataFile and Record and constraints on name for Schema fields and PartitionField.

I did not update dev dependencies at this point, because there are actually still issues with Decimal. I have a PR open that hopefully addresses this. To test, I had to temporarily change dev dependencies to 0.10.0, and run tests without decimal type. I need to update to 0.10.0 because support for anonymous was added (needed on my side). Note sure what the best path is here considering that pyiceberg releases on a much slower cadence.

Related Issues

Closes #5223

Checklist

  • Documented in API Docs (if applicable)
  • Documented in User Guide (if applicable)
  • If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
  • Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

Comment thread tests/io/iceberg/test_iceberg_writes.py
Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Summary

This PR adds compatibility support for PyIceberg 0.10.0 by implementing version-specific handling for breaking API changes in DataFile and Record constructors. The changes use runtime version checking to conditionally call the appropriate constructor methods.

Key Changes

  • Added version checks using packaging.version.parse() to handle PyIceberg 0.10.0+ compatibility
  • DataFile now uses .from_args() method for v0.10.0+ vs direct constructor for older versions
  • IcebergRecord constructor changed from keyword arguments to positional arguments for v0.10.0+
  • Updated test partition field names for better clarity and uniqueness
  • Maintained backward compatibility with existing PyIceberg versions

Issues Found

  • Minor typo in test partition field name: "decimadecimal_truncate_partitionedl"
  • Import statements should be moved to module level per project conventions

Confidence Score: 4/5

  • This PR is safe to merge with minor fixes needed
  • The code correctly implements version-based compatibility handling with proper fallbacks. The logic is sound and maintains backward compatibility. Only minor issues are a typo and style preference for import placement.
  • Pay attention to the typo in tests/io/iceberg/test_iceberg_writes.py line 412

Important Files Changed

File Analysis

Filename        Score        Overview
daft/io/iceberg/iceberg_write.py 4/5 adds version-based compatibility handling for pyiceberg 0.10.0 DataFile and Record constructors with minor inline import style issue
tests/io/iceberg/test_iceberg_writes.py 3/5 updates partition field names for clarity and test compatibility with minor typo in partition field name

Sequence Diagram

sequenceDiagram
    participant Client as Daft Client
    participant Writer as Iceberg Writer
    participant PyIceberg as PyIceberg Library
    participant Version as Version Check

    Client->>Writer: write_iceberg_table()
    Writer->>Version: parse(pyiceberg.__version__) >= parse("0.10.0")
    
    alt pyiceberg >= 0.10.0
        Writer->>PyIceberg: DataFile.from_args(**kwargs)
        Note over Writer,PyIceberg: Uses new constructor method
        Writer->>PyIceberg: IcebergRecord(*values_list)
        Note over Writer,PyIceberg: Positional arguments only
    else pyiceberg < 0.10.0
        Writer->>PyIceberg: DataFile(**kwargs)
        Note over Writer,PyIceberg: Uses legacy constructor
        Writer->>PyIceberg: IcebergRecord(**values_dict)
        Note over Writer,PyIceberg: Keyword arguments
    end
    
    PyIceberg-->>Writer: DataFile/Record instances
    Writer-->>Client: Write successful
Loading

2 files reviewed, 2 comments

Edit Code Review Bot Settings | Greptile

Comment thread tests/io/iceberg/test_iceberg_writes.py
Comment thread daft/io/iceberg/iceberg_write.py
@desmondcheongzx desmondcheongzx changed the title add support for pyiceberg 0.10.0 feat: add support for pyiceberg 0.10.0 Sep 24, 2025
@github-actions github-actions Bot added the feat label Sep 24, 2025
Copy link
Copy Markdown
Collaborator

@desmondcheongzx desmondcheongzx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM besides the greptile nits

Thank you so much @gmweaver this is awesome

@desmondcheongzx desmondcheongzx enabled auto-merge (squash) September 24, 2025 23:51
@codecov
Copy link
Copy Markdown

codecov Bot commented Sep 25, 2025

Codecov Report

❌ Patch coverage is 25.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.14%. Comparing base (153a726) to head (8df5210).
⚠️ Report is 346 commits behind head on main.

Files with missing lines Patch % Lines
daft/io/iceberg/iceberg_write.py 25.00% 6 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5277      +/-   ##
==========================================
- Coverage   74.69%   74.14%   -0.56%     
==========================================
  Files         972      972              
  Lines      123685   125951    +2266     
==========================================
+ Hits        92392    93387     +995     
- Misses      31293    32564    +1271     
Files with missing lines Coverage Δ
daft/io/iceberg/iceberg_write.py 64.80% <25.00%> (-1.04%) ⬇️

... and 37 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@desmondcheongzx desmondcheongzx merged commit cd5f2e1 into Eventual-Inc:main Sep 25, 2025
71 of 75 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Compatibility with PyIceberg v0.10

2 participants