Tighten subset ordering contract coverage by Fieldnote-Echo · Pull Request #179 · Fieldnote-Echo/ordvec

Fieldnote-Echo · 2026-06-04T06:14:47Z

Summary

make RankQuant and two-stage fuzz targets assert the full score-descending/doc-ID-ascending ordering contract
move the RankQuant SIMD centre offset into TopK insertion/eviction so boundary retention uses the same visible score key as final output
add the k=1 offset/rounding-collapse eviction regression
strengthen two-stage regression coverage so full-candidate subset rerank preserves full RankQuant IDs and ordered scores
document duplicate-candidate behavior consistently across Rust, Python, C ABI/header, Go, and rank-mode docs
add a Python RankQuant duplicate-candidate regression and remove stale sentinel-padding wording from the Python subset docstring

Review notes

Review remediation found the earlier finalize-only offset approach was insufficient for TopK boundary eviction. The branch now applies the query-constant offset before TopK retention decisions and keeps final sorting on the already-visible score key.

Subset/two-stage duplicate-candidate assertions intentionally allow equal row IDs where duplicate candidates are documented; full-index fuzz assertions require strict doc-id increase on score ties.

Validation

cargo fmt --check
cbindgen ordvec-ffi --config ordvec-ffi/cbindgen.toml --output ordvec-ffi/include/ordvec.h --verify
cargo test -p ordvec topk_score_offset_is_part_of_eviction_key
cargo test -p ordvec --test determinism_contract
cargo test -p ordvec --test index two_stage
cargo test -p ordvec
cargo test -p ordvec-ffi
cargo clippy --manifest-path fuzz/Cargo.toml --all-targets -- -D warnings
env GOCACHE=/tmp/ordvec-go-build go test -count=1 ./... from ordvec-go
VIRTUAL_ENV=/home/ndspence/GitHub/ordvec/ordvec-python/.venv PATH=/home/ndspence/GitHub/ordvec/ordvec-python/.venv/bin:$PATH pytest ordvec-python/tests/test_rank_quant.py -q
git diff --check

codecov · 2026-06-04T06:15:42Z

Codecov Report

❌ Patch coverage is 92.30769% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/quant.rs	80.00%	2 Missing ⚠️

📢 Thoughts on this report? Let us know!

gemini-code-assist

Code Review

This pull request documents and tests the behavior of duplicate candidate IDs in subset searches, clarifying that duplicate candidates are scored independently and can produce duplicate hits. It updates documentation across C, Go, Python, and Rust interfaces, and adds assertions in tests and fuzz targets to verify score-descending and document-ID-ascending ordering. A review comment suggests strengthening the tie-breaker assertion in the full-index search fuzz target to require strictly ascending document IDs, as full searches should not return duplicate IDs.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Fieldnote-Echo · 2026-06-04T14:02:18Z

CI follow-up after the first run:

The signbitmap_rankquant_twostage fuzz smoke found a real visible-ordering issue: the SIMD RankQuant asymmetric path was adding the dropped centre offset after TopK finalization, so two distinct internal scores could round-collapse into equal visible scores after the offset and violate the public (score desc, row_id asc) order.
Fixed by applying the query-constant offset inside TopK finalization before the final visible-score sort.
Added regressions for the low-level offset/tie behavior and the fuzz-shaped two-stage case.

Local validation after the fix:

cargo test -p ordvec topk_offset_sort_uses_visible_score_ties
cargo test -p ordvec two_stage
cargo test -p ordvec --test determinism_contract
cargo clippy --manifest-path fuzz/Cargo.toml --all-targets -- -D warnings
cargo test -p ordvec
git diff --check

The AVX512 failure on the same run still appears to be the Intel SDE download 403, not a code regression.

qodo-code-review · 2026-06-04T14:41:39Z

Review Summary by Qodo

Fix visible RankQuant tie ordering after score offsets and strengthen contract coverage

🐞 Bug fix ✨ Enhancement

Walkthroughs

Description

• Fix visible RankQuant tie ordering after score offsets applied
• Strengthen subset ordering contract coverage with full score-desc/doc-ID-asc assertions
• Document duplicate-candidate behavior consistently across all language bindings
• Add regression test for centre-offset visible-score tie ordering

Diagram

flowchart LR
  A["Score offset applied<br/>before finalize"] -->|"finalize_with_score_offset_into"| B["Visible-score ties<br/>ordered by row ID"]
  B -->|"full contract"| C["Score-desc/<br/>doc-ID-asc ordering"]
  D["Fuzz targets &<br/>regression tests"] -->|"assert full ordering"| C
  E["Documentation<br/>across bindings"] -->|"clarify duplicate<br/>candidate behavior"| F["Consistent contract<br/>specification"]

File Changes

1. fuzz/fuzz_targets/search_rankquant.rs ✨ Enhancement +15/-11

Tighten RankQuant fuzz target ordering assertions

fuzz/fuzz_targets/search_rankquant.rs

2. fuzz/fuzz_targets/signbitmap_rankquant_twostage.rs ✨ Enhancement +16/-2

Add score-desc/doc-ID-asc ordering check to two-stage fuzz

fuzz/fuzz_targets/signbitmap_rankquant_twostage.rs

3. src/quant.rs 🐞 Bug fix +14/-24

Apply score offset before finalize sort for correct tie ordering

src/quant.rs

View more (10)

4. src/util.rs ✨ Enhancement +31/-1

Add finalize_with_score_offset_into for visible-score tie ordering

src/util.rs

5. tests/index/two_stage.rs 🧪 Tests +45/-8

Add regression test for centre-offset visible-score ties

tests/index/two_stage.rs

6. ordvec-python/src/lib.rs 📝 Documentation +8/-3

Document duplicate-candidate behavior in Python docstring

ordvec-python/src/lib.rs

7. ordvec-python/tests/test_rank_quant.py 🧪 Tests +17/-2

Add duplicate-candidate regression test and fix docstring

ordvec-python/tests/test_rank_quant.py

8. ordvec-ffi/src/lib.rs 📝 Documentation +7/-0

Document duplicate-candidate behavior in C ABI comments

ordvec-ffi/src/lib.rs

9. ordvec-ffi/include/ordvec.h 📝 Documentation +9/-0

Document duplicate-candidate behavior in C header

ordvec-ffi/include/ordvec.h

10. ordvec-go/doc.go 📝 Documentation +4/-0

Document duplicate-candidate behavior in Go package docs

ordvec-go/doc.go

11. ordvec-go/ordvec.go 📝 Documentation +4/-0

Document duplicate-candidate behavior in Go SearchOptions

ordvec-go/ordvec.go

12. ordvec-go/README.md 📝 Documentation +4/-0

Document duplicate-candidate behavior in Go README

ordvec-go/README.md

13. docs/RANK_MODES.md 📝 Documentation +3/-0

Document duplicate-candidate behavior in rank modes docs

docs/RANK_MODES.md

qodo-code-review · 2026-06-04T14:41:40Z

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0)

1. Offset not used for TopK 🐞 Bug ≡ Correctness

Description

TopK::finalize_with_score_offset_into applies score_offset only during the final sort/output,
while TopK::maybe_insert/eviction decisions are still made on pre-offset scores. This means the
returned *set* of top-k results is not guaranteed to be the top-k under the documented
(visible_score desc, tie_key asc) order when the offset changes float rounding (collapsing
distinct scores into equal visible scores).

Code

src/util.rs[R480-495]

+    /// Drain into `out_scores` / `out_indices`, applying a query-constant score
+    /// offset before the final `(score desc, tie_key asc)` ordering.
+    ///
+    /// SIMD RankQuant asymmetric kernels drop a query-constant centre term from
+    /// the hot loop and re-apply it at finalize time. Adding that offset can
+    /// collapse two distinct finite `f32` scores into one visible output score,
+    /// so the public tie order must be computed after the offset is applied.
+    pub(crate) fn finalize_with_score_offset_into(
+        &self,
+        out_scores: &mut [f32],
+        out_indices: &mut [i64],
+        score_offset: f32,
+    ) {
        debug_assert_eq!(out_scores.len(), out_indices.len());
        for s in out_scores.iter_mut() {
            *s = f32::NEG_INFINITY;

Evidence

The code sorts/output on offset-applied scores, but selection/eviction is still based on raw scores;
this breaks the guarantee that returned results are the top-k under the same ordering key used for
output.

src/util.rs[435-447]
src/util.rs[480-531]
src/quant.rs[400-410]
src/quant.rs[651-660]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`TopK::finalize_with_score_offset_into` changes the ordering key at finalize time by sorting on `(s + score_offset, tie_key, slot)`.

However, `TopK::maybe_insert` and `recompute_worst` still decide what to keep/evict using the *pre-offset* score. When `score_offset` causes rounding-collapse (two distinct `s` values map to the same visible `s + score_offset`), the set of retained entries can differ from the true top-k under the published visible-score order.

### Issue Context
This path is used by RankQuant’s asymmetric SIMD “centre-drop” kernels, where the score shift is applied at the end of the scan.

### Fix Focus Areas
- src/util.rs[421-470]
- src/util.rs[472-532]
- src/quant.rs[341-410]
- src/quant.rs[584-673]

### Implementation direction
- Ensure the *same* score key used for final ordering is also used for keep/evict decisions.
 - Option A (simplest): when `centre_drop_used`, add the query-constant `offset` to each candidate score **before** calling `TopK::maybe_insert` (i.e., store visible scores in `TopK` so selection and ordering match).
 - Option B: extend `TopK` to carry an optional `score_offset` and apply it consistently in comparisons (`better` check and `recompute_worst`) as well as finalize.
- Add a regression test demonstrating boundary behavior (e.g., `k=1`, two scores that become equal after offset, and different tie keys) so selection respects the visible-score tie policy, not just ordering of already-kept items.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. Order asserts ignore total_cmp 🐞 Bug ⚙ Maintainability

Description

New test/fuzz helpers assert score order using < and ==, but production result ordering is
defined using f32::total_cmp (a different total order for some finite edge cases like -0.0 vs
0.0). This can let contract tests drift from actual ordering semantics or become brittle on edge
float values.

Code

tests/index/two_stage.rs[R22-31]

+fn assert_score_then_id_order(scores: &[f32], ids: &[i64]) {
+    for slot in 1..scores.len() {
+        let prev = (scores[slot - 1], ids[slot - 1]);
+        let cur = (scores[slot], ids[slot]);
+        assert!(
+            cur.0 < prev.0 || (cur.0 == prev.0 && cur.1 >= prev.1),
+            "results violate score-desc/doc-id-asc order at slots {} and {slot}",
+            slot - 1,
+        );
+    }

Evidence
The assertions use numeric </==, while the production sort uses total_cmp; aligning them
avoids contract drift on float edge cases.
tests/index/two_stage.rs[22-31]
fuzz/fuzz_targets/search_rankquant.rs[112-121]
fuzz/fuzz_targets/signbitmap_rankquant_twostage.rs[36-45]
src/util.rs[514-524]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Several new “score-desc/doc-id-asc” assertion helpers use `cur_score < prev_score || (cur_score == prev_score && ...)`.

The implementation’s ordering uses `f32::total_cmp` for the score key, which is not identical to `<`/`==` for all finite float values (notably `-0.0` vs `0.0`). To keep contract checks aligned with the library’s actual ordering definition, the asserts should mirror the same comparator.

### Issue Context
`TopK::finalize_with_score_offset_into` sorts by `b.0.total_cmp(&a.0)` and then tie key / slot.

### Fix Focus Areas
- tests/index/two_stage.rs[22-32]
- fuzz/fuzz_targets/search_rankquant.rs[112-121]
- fuzz/fuzz_targets/signbitmap_rankquant_twostage.rs[36-46]
- src/util.rs[514-524]

### Implementation direction
- Replace `<`/`==` logic with `total_cmp`-based logic, e.g.:
 - Ensure `prev_score.total_cmp(&cur_score) != Ordering::Less` (descending)
 - If `prev_score.total_cmp(&cur_score) == Ordering::Equal`, assert `prev_id <= cur_id` (doc-id ascending)
- This keeps tests/fuzz targets checking the exact same ordering contract as the production sort.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Fieldnote-Echo · 2026-06-04T14:42:25Z

Remediated the TopK offset boundary issue in e2ffc7f.

What changed:

Moved the RankQuant SIMD centre offset into TopK insertion/eviction via set_score_offset, so the retained top-k set is selected using the same visible score key that final output exposes.
Removed the finalize-only offset path.
Added the k=1 boundary regression where f32 rounding collapses visible scores and the lower global row key must win before eviction.
Tightened the full-index fuzz ordering assertion to require strict doc-id increase on score ties; subset/two-stage duplicate-candidate assertions remain permissive where duplicates are part of the documented contract.

Validation:

cargo test -p ordvec topk_score_offset_is_part_of_eviction_key
cargo test -p ordvec --test index two_stage
cargo test -p ordvec --test determinism_contract
cargo test -p ordvec
cargo clippy --manifest-path fuzz/Cargo.toml --all-targets -- -D warnings
git diff --check

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3ec4897bae

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

gemini-code-assist Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread fuzz/fuzz_targets/search_rankquant.rs Outdated

Fieldnote-Echo mentioned this pull request Jun 4, 2026

[codex] Guard release compatibility drift #180

Open

Fieldnote-Echo marked this pull request as ready for review June 4, 2026 14:41

Fieldnote-Echo requested a review from project-navi-bot as a code owner June 4, 2026 14:41

chatgpt-codex-connector Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread src/util.rs Outdated

Fieldnote-Echo force-pushed the codex/manifest-cache-invariant-matrix branch from 3064191 to cf7ae3e Compare June 4, 2026 15:51

Fieldnote-Echo force-pushed the codex/determinism-fuzz-contract branch from e2ffc7f to e34e012 Compare June 4, 2026 15:51

Fieldnote-Echo force-pushed the codex/manifest-cache-invariant-matrix branch from cf7ae3e to f92d771 Compare June 4, 2026 16:44

Fieldnote-Echo added 3 commits June 4, 2026 11:44

Tighten subset ordering contract coverage

68cf532

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Fix visible RankQuant tie ordering after score offsets

14392b3

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Fix RankQuant offset eviction ordering

b0e91a9

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Fieldnote-Echo force-pushed the codex/determinism-fuzz-contract branch from e34e012 to b0e91a9 Compare June 4, 2026 16:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Tighten subset ordering contract coverage#179

Tighten subset ordering contract coverage#179
Fieldnote-Echo wants to merge 3 commits into
codex/manifest-cache-invariant-matrixfrom
codex/determinism-fuzz-contract

Fieldnote-Echo commented Jun 4, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Jun 4, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Fieldnote-Echo commented Jun 4, 2026

Uh oh!

qodo-code-review Bot commented Jun 4, 2026

Uh oh!

qodo-code-review Bot commented Jun 4, 2026 •

edited

Loading

Uh oh!

Fieldnote-Echo commented Jun 4, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Fieldnote-Echo commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Review notes

Validation

Uh oh!

codecov Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Fieldnote-Echo commented Jun 4, 2026

Uh oh!

qodo-code-review Bot commented Jun 4, 2026

Review Summary by Qodo

Walkthroughs

File Changes

Uh oh!

qodo-code-review Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Uh oh!

Fieldnote-Echo commented Jun 4, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Fieldnote-Echo commented Jun 4, 2026 •

edited

Loading

codecov Bot commented Jun 4, 2026 •

edited

Loading

qodo-code-review Bot commented Jun 4, 2026 •

edited

Loading