Skip to content

feat(iceberg): Add snapshot utils to scan ancestors#2285

Open
CTTY wants to merge 3 commits intoapache:mainfrom
CTTY:ctty/ancestors
Open

feat(iceberg): Add snapshot utils to scan ancestors#2285
CTTY wants to merge 3 commits intoapache:mainfrom
CTTY:ctty/ancestors

Conversation

@CTTY
Copy link
Collaborator

@CTTY CTTY commented Mar 24, 2026

Which issue does this PR close?

What changes are included in this PR?

  • Add Ancestors to help scan past snapshots
  • Moved existing util to the new utils mod

Are these changes tested?

Yes

@CTTY CTTY marked this pull request as ready for review March 24, 2026 23:28
Copy link
Contributor

@blackmwk blackmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CTTY for this pr!

}

/// Iterate starting from `snapshot` (inclusive) to the root snapshot.
pub fn ancestors_of(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make this part of TableMetadata?

Copy link
Collaborator Author

@CTTY CTTY Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think putting this in TableMetadata is also fine. This code structure is to echo what we have in Java: https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java

}

/// Iterate starting from `snapshot` (inclusive) to `oldest_snapshot_id` (exclusive).
pub fn ancestors_between(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be put into Ancestors struct.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you help me understand what are the benefits of doing so?

I think Ancestors here are internal details to help us iterate thru snapshots. the helper functions can be the convenient entry points


use crate::spec::{SnapshotRef, TableMetadataRef};

struct Ancestors {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we may have more conditions to build iterator, it's more idiomatic to implement IntoIterator for Ancestors, rather than Iterator.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understand, do you mean something like this?

let iter = Ancestors::new()
    .until(oldest_snapshot_id)
    .into_iter();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add helpers to scan snapshot ancestors

2 participants