Skip to content

fix(compaction): prune fully-deleted fragments from the fragment-reuse index#7378

Open
xuanyu-z wants to merge 1 commit into
lance-format:mainfrom
xuanyu-z:fix/fri-prune-fully-deleted-fragment
Open

fix(compaction): prune fully-deleted fragments from the fragment-reuse index#7378
xuanyu-z wants to merge 1 commit into
lance-format:mainfrom
xuanyu-z:fix/fri-prune-fully-deleted-fragment

Conversation

@xuanyu-z

@xuanyu-z xuanyu-z commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Fixes #7374.

Problem

When compaction is deferred (defer_index_remap: true), index entries are not physically remapped at compaction time; instead the fragment-reuse index (FRI) records old→new row-address maps and rewrites addresses on the fly via remap_row_id.

If a fragment is fully deleted before the compaction, apply_deletions removes it from the manifest entirely (no deletion file). It therefore never enters a compaction task and is absent from every rewrite group's old_fragments. An index that still covered that fragment keeps its stale entries, and because no group references the fragment, remap_row_id passes those addresses straight through — resurfacing as dangling references to a fragment that no longer exists.

The corruption is latent (effective_fragment_bitmap masks the dead fragment at query time) but surfaces once another rewrite touches the affected coverage.

Fix

Track such orphaned fragments explicitly and prune their rows in the FRI:

  • New additive proto field FragmentReuseIndexDetails.Version.removed_fragments (field 4) and FragReuseVersion.removed_frags.
  • At commit_compaction (deferred branch), compute the orphaned set as covered − live_fragments, where covered is the union of every non-system index's coverage (read post-FRI-remap via load_indices, so fragments rewritten by earlier deferred compactions are already excluded) and live_fragments is the pre-commit fragment set. The difference is exactly the fragments an index still points at that have vanished from the manifest.
  • remap_row_id maps any row whose final address (after chaining through all versions) lands in a removed_fragments fragment to None.

Why track by fragment ID rather than per-row address

A fully-deleted fragment is gone from the manifest, so its physical_rows/address range can no longer be enumerated to build per-address addr → None entries. Since the entire fragment is gone, fragment-ID granularity is exact.

Why the prune is applied after chaining, not before

A fragment can be removed at any generation. A row originally in fragment A may have been rewritten by an earlier deferred compaction into fragment B, which is then deleted. The orphaned fragment (B) shows up only at the final mapped address, not the input one. A removed fragment is always terminal (nothing compacts a manifest-removed fragment forward) and fragment IDs are never reused, so checking the final address never prunes a live row.

Coverage is intentionally left intact

remap_fragment_bitmap does not strip orphaned fragments from index coverage. Their rows are already pruned from index data by remap_row_id, and a nonexistent fragment is masked at query time by effective_fragment_bitmap. Stripping coverage instead makes the planner treat the dead fragment's row range as an unindexed flat-scan fallback, which fails on take.

Backward compatibility

removed_fragments is an additive proto field. The FRI is persisted as protobuf; an older index decodes the absent field as an empty list, so remap_row_id short-circuits and behaves exactly as before. No migration is required.

Tests

The corruption is latent end-to-end (create_restricted_deletion_mask blocks missing fragments and effective_fragment_bitmap masks dead ones), so the core cases assert at the FRI level (remap_row_id must return None) via a shared assert_fri_prunes helper, and keep faithful end-to-end checks where a non-empty projection actually exercises the take. Shared helpers (dataset_with_ivf_pq, dataset_with_btree, assert_fri_prunes, assert_knn_no_ghosts) keep the cases small; FRI-level-only cases use a cheap scalar BTREE index (no vector training).

  • test_defer_index_remap_fully_deleted_fragment — the Vector index can become corrupted when compaction is deferred #7374 repro: a fully-deleted fragment, deferred-compacted. Asserts the FRI prunes it and KNN returns no ghosts during the FRI window, then again after the physical remap + FRI trim that closes the window.
  • test_defer_index_remap_deleted_merged_fragment — a fragment merged by one deferred compaction then fully deleted before a second; pruned at its post-merge ID while survivors are kept. (Fails if the prune is applied to the input address instead of the final one.)
  • test_defer_index_remap_multiple_fully_deleted_fragments — several whole-fragment deletions before one compaction; verifies removed_frags is a union.
  • test_defer_index_remap_survivor_deleted_across_versions — a fragment left untouched by an earlier deferred compaction and then deleted (pruned via pass-through, not chaining).
  • test_defer_index_remap_then_delete_during_window — reverse ordering: compaction then delete during the FRI window; the new fragment's deletion vector is honored.

@github-actions

Copy link
Copy Markdown
Contributor

Important

This PR touches the Lance format specification.

Substantive changes to the format specification — the .proto definitions
and the spec docs under docs/src/format/ — require a PMC vote before merge.
Minor edits such as typo fixes, wording, or formatting are excluded; use your
judgment.

If this is a meaningful format change:

  • Start a vote following the Lance community voting process.
    Format specification modifications need 3 binding +1 votes (excluding the
    proposer), held on GitHub Discussions, with a minimum voting period of 1 week.
  • Once the vote passes, link the completed vote in this PR. It should not be
    merged until the vote is linked.

@github-actions github-actions Bot added bug Something isn't working A-format On-disk format: protos and format spec docs and removed bug Something isn't working labels Jun 19, 2026
@codecov

codecov Bot commented Jun 19, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 97.89474% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/dataset/optimize.rs 97.72% 1 Missing and 5 partials ⚠️

📢 Thoughts on this report? Let us know!

@xuanyu-z xuanyu-z force-pushed the fix/fri-prune-fully-deleted-fragment branch from 311eeef to 16c4c9a Compare June 21, 2026 21:03
@github-actions github-actions Bot added the bug Something isn't working label Jun 21, 2026
@xuanyu-z xuanyu-z force-pushed the fix/fri-prune-fully-deleted-fragment branch 2 times, most recently from abff2f1 to d603bc4 Compare June 21, 2026 21:26
@xuanyu-z xuanyu-z marked this pull request as ready for review June 21, 2026 21:27
@xuanyu-z xuanyu-z force-pushed the fix/fri-prune-fully-deleted-fragment branch from d603bc4 to 536997b Compare June 22, 2026 00:52
…e index

A fragment that has all its rows deleted is removed from the manifest at
delete time, so it never enters a compaction task and is absent from every
fragment-reuse-index (FRI) group's old_frags. With deferred compaction
(defer_index_remap=true) the inline remap is skipped, so
FragReuseIndex::remap_row_id passed those addresses through unchanged
instead of pruning them. An index that still covered the removed fragment
then kept dangling references to it, surfacing as 'take received reference to
fragment that does not exist' errors / ghost results after a later physical
remap.

Record the orphaned fragment IDs (covered by an index but no longer present
in the dataset, and not part of any rewrite group) on the FRI version and
prune those addresses to None in remap_row_id, fixing both auto-remap at load
and physical remap. Index coverage bitmaps are left intact (masked at query
time by effective_fragment_bitmap); only the row data is pruned. The new proto
field is additive and backward-compatible.

Adds a regression test.

Fixes lance-format#7374
@xuanyu-z xuanyu-z force-pushed the fix/fri-prune-fully-deleted-fragment branch from 536997b to 216112c Compare June 22, 2026 01:06
@xuanyu-z

Copy link
Copy Markdown
Contributor Author

@claude review once

@westonpace westonpace left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch for this error. I have one minor question but this is good otherwise.

Comment on lines +223 to +225
/// Cache of `details.removed_frag_bitmap()`; rebuilt in `new`, so skipped.
#[serde(skip)]
removed_frags: RoaringBitmap,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably unrelated to this PR but why do we store the FRI as a collection of versions in the first place? Why not merge them all together at write time?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the git history and code comments and my understandings, the main benefit is versioning allows the system to trim the FRI as different indices catch up so the size of the FRI is not unbounded before a full index job kicks in.

Then versioning also makes FRI ~append only which inherit all the consistency benefits

/// Top-200 KNN projecting `i` (the projection forces a `take`, so a stale
/// row address would error). Asserts no result id falls in `ghost_range`
/// and at least one live row is returned.
async fn assert_knn_no_ghosts(dataset: &Dataset, ghost_range: std::ops::Range<i32>) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ghosts is a good word for the phenomenon 👻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-format On-disk format: protos and format spec docs bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Vector index can become corrupted when compaction is deferred

2 participants