@SaymV commented on Dec 23, 2025

Closes #1820

Rationale for this change

Apache Iceberg v3 introduces native geometry and geography primitive types.
This PR adds spec-compliant support for those types in PyIceberg, including:

  • New primitive types with parameter handling and format-version enforcement
  • Schema parsing and round-trip serialization
  • Avro mapping using WKB bytes
  • PyArrow / Parquet integration with version-aware fallbacks
  • Unit test coverage for type behavior and compatibility constraints

A full design and scope discussion is available in the accompanying RFC:
📄 RFC: Iceberg v3 Geospatial Primitive Types
The RFC documents scope, non-goals, compatibility constraints, and known limitations.

Are these changes tested?

Yes.

  • Unit tests cover:
    • Type creation (default and custom parameters)
    • __str__ / __repr__ round-tripping
    • Equality, hashing, and pickling
    • Format version enforcement (v3-only)
  • PyArrow-dependent behavior is version-gated and conditionally tested
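The format-version enforcement covered by these tests can be illustrated with a small standalone sketch. It does not use PyIceberg's classes: `SketchType` and `check_format_version` are hypothetical names invented for this example, and only the `minimum_format_version()` method name and the v3 requirement come from this PR's description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchType:
    """Hypothetical stand-in for a primitive type; not PyIceberg's class."""

    name: str
    min_version: int = 1

    def minimum_format_version(self) -> int:
        # Geospatial types report 3 here; older primitives report a lower number.
        return self.min_version


def check_format_version(field_types, format_version: int) -> None:
    """Raise ValueError if any type needs a newer table format version."""
    for field_type in field_types:
        required = field_type.minimum_format_version()
        if required > format_version:
            raise ValueError(
                f"{field_type.name} requires format version {required}, "
                f"but the table uses version {format_version}"
            )


GEOMETRY = SketchType("geometry", min_version=3)
LONG = SketchType("long")

# A v3 table accepts both types; the same call with format_version=2 would raise.
check_format_version([LONG, GEOMETRY], format_version=3)
```

A test for the v3-only rule would then assert that the v2 call raises `ValueError` while the v3 call returns normally.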

Are there any user-facing changes?

Yes.

  • Users may now declare geometry and geography columns in Iceberg v3 schemas
  • Parquet files written with PyArrow ≥ 21 preserve GEO logical types when available
  • Limitations (e.g. no WKB↔WKT conversion, no spatial predicates) are documented
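The version-gated Parquet behavior might be sketched as follows. This is an assumption-laden illustration, not the code in this PR: `pyarrow_major` and `parquet_geo_strategy` are hypothetical helpers, the returned labels are made up for the example, and only the PyArrow 21 threshold comes from the description above.

```python
import re

GEO_MIN_PYARROW_MAJOR = 21  # threshold taken from the PR description


def pyarrow_major(version: str) -> int:
    """Extract the leading major version number from a version string."""
    match = re.match(r"(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1))


def parquet_geo_strategy(pyarrow_version: str) -> str:
    """Pick a (hypothetical) write strategy label for geospatial columns."""
    if pyarrow_major(pyarrow_version) >= GEO_MIN_PYARROW_MAJOR:
        # New enough: the GEO logical type can be preserved in Parquet.
        return "geo-logical-type"
    # Older PyArrow: fall back to storing the WKB payload as plain binary.
    return "binary-fallback"
```

In practice the argument would be `pyarrow.__version__`; taking the version as a string keeps the sketch runnable without PyArrow installed.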

Implement the geospatial types defined by the Iceberg v3 specification:

- Add GeometryType(crs) and GeographyType(crs, algorithm) to types.py
- Default CRS is "OGC:CRS84", default algorithm is "spherical"
- Types require format version 3 (minimum_format_version() returns 3)
- Values are stored as WKB (Well-Known Binary) bytes at runtime
- Avro schema conversion maps to "bytes"
- PyArrow conversion maps to large_binary()
- Add type string parsing for geometry('CRS') and geography('CRS', 'algo')
- Add visitor pattern support in schema.py and resolver.py
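The parameter defaults and type-string parsing listed above can be sketched in isolation. The classes below are standalone stand-ins, not the `GeometryType` / `GeographyType` added to `types.py`; the `*Sketch` names, the regular expressions, and the choice to print the bare type name when all parameters are defaults are assumptions made for illustration. The CRS and algorithm values in the usage lines are arbitrary example strings and are not validated.

```python
import re
from dataclasses import dataclass

DEFAULT_CRS = "OGC:CRS84"
DEFAULT_ALGORITHM = "spherical"


@dataclass(frozen=True)
class GeometrySketch:
    """Hypothetical stand-in for a geometry type with a CRS parameter."""

    crs: str = DEFAULT_CRS

    def __str__(self) -> str:
        # Assumption: defaults print as the bare type name.
        return "geometry" if self.crs == DEFAULT_CRS else f"geometry('{self.crs}')"


@dataclass(frozen=True)
class GeographySketch:
    """Hypothetical stand-in for a geography type with CRS and algorithm."""

    crs: str = DEFAULT_CRS
    algorithm: str = DEFAULT_ALGORITHM

    def __str__(self) -> str:
        if (self.crs, self.algorithm) == (DEFAULT_CRS, DEFAULT_ALGORITHM):
            return "geography"
        return f"geography('{self.crs}', '{self.algorithm}')"


_GEOMETRY_RE = re.compile(r"geometry(?:\(\s*'([^']+)'\s*\))?")
_GEOGRAPHY_RE = re.compile(
    r"geography(?:\(\s*'([^']+)'\s*(?:,\s*'([^']+)'\s*)?\))?"
)


def parse_geo_type(text: str):
    """Parse geometry('CRS') / geography('CRS', 'algo'), filling in defaults."""
    text = text.strip()
    if match := _GEOMETRY_RE.fullmatch(text):
        return GeometrySketch(match.group(1) or DEFAULT_CRS)
    if match := _GEOGRAPHY_RE.fullmatch(text):
        return GeographySketch(
            match.group(1) or DEFAULT_CRS,
            match.group(2) or DEFAULT_ALGORITHM,
        )
    raise ValueError(f"not a geospatial type string: {text!r}")


# Round trip: the string form parses back to an equal (and equally hashed) value.
custom = GeographySketch("EPSG:4326", "vincenty")
assert parse_geo_type(str(custom)) == custom
```

Frozen dataclasses give equality and hashing for free, which is the behavior the unit tests in this PR check on the real types.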

Note: JSON single-value encoding (WKB<->WKT) raises NotImplementedError
because the conversion requires an external library (e.g., Shapely),
which is not included to avoid a heavy dependency.
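
For readers unfamiliar with WKB, the sketch below shows what a stored value looks like for the simplest case, a 2D point. It uses only the standard library and is not part of this PR; `point_to_wkb` and `point_from_wkb` are hypothetical helper names. Mapping the types to Avro "bytes" and Arrow large_binary() means columns carry opaque payloads like this one.

```python
import struct
from typing import Tuple

WKB_POINT = 1  # WKB geometry type code for Point


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # Layout: 1-byte order flag (1 = little-endian), uint32 type code,
    # then the x and y coordinates as float64.
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def point_from_wkb(data: bytes) -> Tuple[float, float]:
    """Decode a 2D WKB point, honoring the byte-order flag."""
    if len(data) != 21:
        raise ValueError("a 2D WKB point is exactly 21 bytes")
    order = "<" if data[0] == 1 else ">"
    type_code, x, y = struct.unpack(order + "Idd", data[1:])
    if type_code != WKB_POINT:
        raise ValueError(f"expected a Point (type 1), got type {type_code}")
    return x, y


wkb = point_to_wkb(1.0, 2.0)
assert point_from_wkb(wkb) == (1.0, 2.0)
```

Anything beyond this, such as converting to WKT or decoding polygons, is where a library like Shapely would come in, which is why the JSON single-value path is left unimplemented.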

@SaymV changed the title from "Feat/geospatial types" to "Add Geometry & Geography Types" on Dec 23, 2025
@SaymV changed the title from "Add Geometry & Geography Types" to "feat: Add Geometry & Geography Types" on Dec 23, 2025