-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Description
Feature Request / Improvement
Query Engine
Flink
Feature Request / Improvement
The Flink Sink implementation (IcebergSink and related classes) currently uses package-private visibility for most internal classes, constructors, and accessors. This makes it impractical for downstream connector implementations to compose custom sink pipelines that build on Iceberg's existing infrastructure.
Use Case / Motivation
Downstream Flink integrations often need to:
-
Add custom metadata to committables — e.g. watermark information, lineage data, or
application-specific tracking that flows from writers through to committers. Currently there is no
extension point onIcebergCommittablefor this. -
Compose custom sink topologies — e.g. wrapping
IcebergSinkWriterorIcebergCommitter
with custom metrics, logging, or error handling. Even composition (delegation/wrapping) requires
being able to reference these types, which is impossible when they are package-private. -
Extend
IcebergSinkto overridecreateWriter()orcreateCommitter()with custom
implementations while reusing the base builder logic and data distribution.
Without these extension points, downstream projects must copy large portions of the sink code,
creating maintenance burden and version skew.
Proposed Changes
Part 1: CommittableMetadata framework (new composition-based extension point)
CommittableMetadata— marker interface for custom metadata on committablesCommittableMetadataSerializer— serializer interface for metadataCommittableMetadataRegistry— global registry allowing downstream to register a serializerIcebergCommittable— adds optional@Nullable metadatafield (backward-compatible: existing
constructors chain withmetadata = null)IcebergCommittableSerializer— writes boolean flag + delegates to registered serializer
This is a pure composition pattern — downstream registers a serializer, attaches metadata via the
existing constructors, and the pipeline carries it through transparently.
Part 2: Access modifier changes (enabling downstream composition)
Widen visibility of key sink pipeline classes (IcebergCommittable, IcebergCommitter,
IcebergSinkWriter, IcebergWriteAggregator, etc.) from package-private to public, and add
protected accessors on IcebergSink so that downstream implementations can reference, wrap, or
extend these types.
All changes are additive. No existing behavior changes. No new dependencies.
Willingness to contribute
- I can contribute this improvement/feature independently
- I would be willing to contribute this improvement/feature with guidance from the Iceberg community
- I cannot contribute this improvement/feature at this time