Summary
Move classes that are exclusively used by Iceberg (not used by any non-@IcebergApi Comet code) from common into the new iceberg-api module, added in #3240. This will improve the separation of concerns and make the Iceberg API boundary clearer.
Classes to Move
The following 4 classes are only referenced by other @IcebergApi classes and exist purely for Iceberg integration:
| Class |
Current Location |
Purpose |
ParquetColumnSpec |
common/src/main/java/org/apache/comet/parquet/ |
Bridge class to pass column descriptors between Comet and Iceberg (since Iceberg shades Parquet) |
RowGroupReader |
common/src/main/java/org/apache/comet/parquet/ |
PageReadStore implementation for row group reading |
WrappedInputFile |
common/src/main/java/org/apache/comet/parquet/ |
Wraps Iceberg's InputFile to implement Parquet's InputFile interface |
AbstractCometSchemaImporter |
common/src/main/java/org/apache/arrow/c/ |
Base class for schema import functionality |
Important Considerations
Packaging/Shading Concerns
⚠️ This change requires careful attention to packaging and shading:
-
Iceberg shades Parquet - ParquetColumnSpec exists specifically because Iceberg uses shaded Parquet classes. Any module structure changes must preserve this compatibility.
-
JAR structure - Ensure the classes remain accessible to Iceberg after the move. The final packaged JAR must include these classes in a location Iceberg can access.
-
Maven shade plugin configuration - Review and update shade plugin configuration if needed to ensure:
- Classes are included in the shaded JAR
- Package relocations don't break Iceberg's access
- No duplicate classes across modules
-
Dependency direction - The iceberg-public-api module currently only contains tests. If we move source classes there, we need to:
- Update module dependencies
- Ensure
common module can still depend on these classes if needed
- Consider if a separate
iceberg-api module (non-test) is more appropriate
Testing Requirements
Alternative Approaches
- Keep in
common, just document - Leave classes in common but document they're Iceberg-specific
- New
iceberg-api source module - Create a separate source module (not test-only) for Iceberg API classes
- Subpackage organization - Move to
org.apache.comet.iceberg.* package within common
Related
Summary
Move classes that are exclusively used by Iceberg (not used by any non-
@IcebergApiComet code) fromcommoninto the newiceberg-apimodule, added in #3240. This will improve the separation of concerns and make the Iceberg API boundary clearer.Classes to Move
The following 4 classes are only referenced by other
@IcebergApiclasses and exist purely for Iceberg integration:ParquetColumnSpeccommon/src/main/java/org/apache/comet/parquet/RowGroupReadercommon/src/main/java/org/apache/comet/parquet/WrappedInputFilecommon/src/main/java/org/apache/comet/parquet/AbstractCometSchemaImportercommon/src/main/java/org/apache/arrow/c/Important Considerations
Packaging/Shading Concerns
Iceberg shades Parquet -
ParquetColumnSpecexists specifically because Iceberg uses shaded Parquet classes. Any module structure changes must preserve this compatibility.JAR structure - Ensure the classes remain accessible to Iceberg after the move. The final packaged JAR must include these classes in a location Iceberg can access.
Maven shade plugin configuration - Review and update shade plugin configuration if needed to ensure:
Dependency direction - The
iceberg-public-apimodule currently only contains tests. If we move source classes there, we need to:commonmodule can still depend on these classes if needediceberg-apimodule (non-test) is more appropriateTesting Requirements
iceberg-public-apimodule testsAlternative Approaches
common, just document - Leave classes incommonbut document they're Iceberg-specificiceberg-apisource module - Create a separate source module (not test-only) for Iceberg API classesorg.apache.comet.iceberg.*package withincommonRelated
iceberg-public-apitest module@IcebergApiannotation