
fix(index): use range block max for fts conjunction #7387

Open
BubbleCal wants to merge 5 commits into main from yang/fts-and-range-block-max
BubbleCal (Contributor) commented Jun 22, 2026
Bug Fix

What is the bug?

FTS AND/conjunction block-max pruning could only ask a posting for its current block max score. When the lead posting defines a wider up_to window, another posting can have a higher block max later in that same window, so using only its current block can understate the safe upper bound for Lucene-style getMaxScore(upTo).
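To make the gap concrete, here is a minimal sketch that compares the two bounds on a toy posting. `BlockMeta`, `current_block_max`, and `range_block_max` are hypothetical names invented for this illustration and are not taken from the PR or from the lance-index API; the range version is a plain linear scan, not the PR's implementation.

```rust
/// Hypothetical per-block metadata: the last doc id stored in the block
/// and the maximum score any document in that block can reach.
#[derive(Clone, Copy, Debug)]
struct BlockMeta {
    last_doc: u32,
    max_score: f32,
}

/// Upper bound taken only from the block the cursor is currently on.
fn current_block_max(blocks: &[BlockMeta], cur: usize) -> f32 {
    blocks[cur].max_score
}

/// Upper bound over every block from `cur` through the block containing `up_to`.
fn range_block_max(blocks: &[BlockMeta], cur: usize, up_to: u32) -> f32 {
    let mut best = f32::NEG_INFINITY;
    for block in &blocks[cur..] {
        best = best.max(block.max_score);
        if block.last_doc >= up_to {
            break; // this block contains `up_to`, so the window is covered
        }
    }
    best
}
```

With blocks ending at doc ids 9, 19, and 29 and block max scores 1.0, 3.5, and 0.5, a cursor on the first block and `up_to = 15` gives a current-block bound of 1.0 but a range bound of 3.5, because doc 15 falls in the second block.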

What issues or incorrect behavior does the bug cause?

The understated upper bound can make and_advance_target skip a lead block window even though a later document in that window could still beat the current top-k threshold. For exact BM25 search, pruning must use a safe upper bound so possible top-k documents are not dropped.
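The skip decision can be sketched as a comparison between the summed per-posting upper bounds and the current top-k threshold. `can_skip_window` is a hypothetical helper written for this illustration; the real `and_advance_target` logic and its exact comparison (`<=` versus `<`) are not shown in this PR description, so treat both as assumptions.

```rust
/// Returns true when the summed upper bounds cannot exceed `threshold`,
/// meaning no document in the window can enter the top-k.
/// Assumption: a document must score strictly above `threshold` to qualify.
fn can_skip_window(upper_bounds: &[f32], threshold: f32) -> bool {
    let bound: f32 = upper_bounds.iter().sum();
    bound <= threshold
}
```

For a lead bound of 2.0 and a threshold of 4.0, a follower bound of 1.0 (its current block only) sums to 3.0 and the window is skipped, while a follower bound of 3.5 (the max over the whole window) sums to 5.5 and the window is kept. The first decision is the unsafe one the PR describes.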

How does this PR fix the problem?

This adds a query-time BlockMaxWindow to compressed posting iterators. The window lazily maintains a monotonic deque of block max scores over [current shallow block, block containing up_to]. AND/conjunction now lets the lead posting choose up_to and asks each follower for a range max that safely covers that same up_to. Plain postings still fall back to their existing list-level upper bound. This does not change the index format or posting-list build path.
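The monotonic-deque idea can be sketched as follows. This is an illustration under stated assumptions, not the PR's `BlockMaxWindow`: `BlockInfo`, `BlockMaxWindowSketch`, and `range_max` are hypothetical names, block metadata is assumed to be available as a slice, and the cursor's block index is assumed to be non-decreasing across calls.

```rust
use std::collections::VecDeque;

/// Hypothetical per-block metadata (last doc id and block max score).
#[derive(Clone, Copy, Debug)]
struct BlockInfo {
    last_doc: u32,
    max_score: f32,
}

/// Sliding-window maximum over block max scores.
#[derive(Default)]
struct BlockMaxWindowSketch {
    /// (block index, block max score); scores strictly decrease front to back.
    deque: VecDeque<(usize, f32)>,
    /// Index of the first block not yet pushed into the deque.
    next_block: usize,
}

impl BlockMaxWindowSketch {
    /// Max score over [cur_block, block containing up_to], or None when
    /// `cur_block` is past the end of the posting.
    fn range_max(&mut self, blocks: &[BlockInfo], cur_block: usize, up_to: u32) -> Option<f32> {
        // Drop blocks the cursor has already moved past.
        while matches!(self.deque.front(), Some(&(idx, _)) if idx < cur_block) {
            self.deque.pop_front();
        }
        self.next_block = self.next_block.max(cur_block);

        // Lazily extend the right edge through the block containing `up_to`.
        while self.next_block < blocks.len()
            && (self.next_block == cur_block || blocks[self.next_block - 1].last_doc < up_to)
        {
            let score = blocks[self.next_block].max_score;
            // Smaller or equal scores at the back can never be the max again.
            while matches!(self.deque.back(), Some(&(_, s)) if s <= score) {
                self.deque.pop_back();
            }
            self.deque.push_back((self.next_block, score));
            self.next_block += 1;
        }
        self.deque.front().map(|&(_, score)| score)
    }
}
```

Each block is pushed and popped at most once, so the total work over a query is linear in the number of blocks visited. If a later call passes a smaller `up_to`, this sketch keeps the wider window it already built; the result is then a looser bound but still a safe one.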

Tests

  • cargo fmt --all --check
  • git diff --check
  • CARGO_TARGET_DIR=/tmp/lance-target-fts-and-rangemax-main cargo test -p lance-index scalar::inverted::wand::tests -- --nocapture
  • CARGO_TARGET_DIR=/tmp/lance-target-fts-and-rangemax-clippy cargo clippy --all --tests --benches -- -D warnings

github-actions bot added the A-index (Vector index, linalg, tokenizer) and bug (Something isn't working) labels Jun 22, 2026
codecov bot commented Jun 22, 2026
Codecov Report

❌ Patch coverage is 90.84746% with 27 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance-index/src/scalar/inverted/wand.rs | 90.84% | 24 Missing and 3 partials ⚠️ |


github-actions bot added the A-python (Python bindings) label Jun 22, 2026
BubbleCal marked this pull request as ready for review June 22, 2026 16:39
