diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9cf1abf..6dec955 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Documentation
+
+- Reframed bitmap-overlap docs around the checked Lean proof spine: query
+  symmetry, quotient sufficiency, finite threshold optimality, and idealized
+  hypergeometric calibration, while preserving the real-encoder caveats.
+
 ## [0.2.0] - 2026-05-26
 
 First public release on crates.io / PyPI — the crate was not published before
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 598d86b..fcab97d 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -14,6 +14,10 @@ Contributions to the code, the docs, and the paper are all welcome.
   reproducible `examples/bench_rank` synthetic run. Real-corpus numbers are
   user-runnable and reported in the paper — please don't add unverifiable
   performance claims.
+- **Keep theory claims bounded.** Cite theorem names or formalization docs for
+  proof-backed statements, and preserve the finite in-model vs real-encoder
+  caveat. The Lean bitmap theorem proves a constant-weight overlap admission
+  model under explicit assumptions; it is not a blanket retrieval guarantee.
 - **MSRV is Rust 1.89.** Don't use newer standard-library or language APIs.
 - **Stable surface.** The persistence file magics (`.tvr` / `.tvrq` /
   `.tvbm` / `.tvsb`) and the public method names
diff --git a/README.md b/README.md
index ab838a5..f5d1683 100644
--- a/README.md
+++ b/README.md
@@ -66,35 +66,42 @@ Two further paths, for callers who need them:
   latency-critical callers know it exists.
 - **`MultiBucketBitmap`** *(behind `--features experimental`)* — the
   multi-bucket bilinear-overlap probe behind the research-side decomposition;
-  a scaffold for the theory, not a production path.
+  an algebraic scaffold, not the top-bucket theorem surface or a production
+  path.
 
-## The bitmap prefilter has a closed-form null
+## The bitmap prefilter has a checked finite model
 
 The `Bitmap` prefilter scores candidates by `popcount(Q AND D)` over each
-document's fixed-size top-bucket set. Two *unrelated* documents — modelled as
-independent uniform top-bucket sets — overlap **hypergeometrically**,
-`H(dim, n_top, n_top)`, with expected overlap `n_top² / dim` (e.g. 16 at
-`dim = 256`, `n_top = 64`). So the filter's **selectivity** — how often an
-unrelated document clears a given overlap threshold — is closed-form and
-data-independent, not a tuned cutoff. (Whether *true* neighbours clear the bar
-is empirical; this is an exact candidate-generation null, not a
-retrieval-optimality theorem.) Two pieces of this are separately machine-checked in Lean 4, both `sorry`-free
-on Lean's standard axiom base (`propext`, `Classical.choice`, `Quot.sound`):
+document's fixed-size top-bucket set. In the idealized uniform constant-weight
+null, two unrelated `n_top`-active bitmaps in `dim` coordinates overlap
+**hypergeometrically**, `H(dim, n_top, n_top)`, with expected overlap
+`n_top² / dim` (e.g. 16 at `dim = 256`, `n_top = 64`). That makes the null
+selectivity of an overlap cutoff closed-form.
+
+The current proof story is stronger than a closed-form null alone. Two pieces
+are machine-checked in Lean 4, both `sorry`-free on Lean's standard axiom base
+(`propext`, `Classical.choice`, `Quot.sound`):
 
 - the **ordinal invariance** on which the rank transform rests — that a vector's
   sorting permutation is unchanged by any strictly monotone reparametrisation
   of its coordinates — in
   [`takens-formalization`](https://github.com/Project-Navi/takens-formalization)
   (theorem `isOrdinalPatternOf_comp_strictMono`); and
-- the **shape of the bitmap candidate filter** — that under a finite
-  monotone-likelihood-ratio overlap-tilt model, an overlap-count *threshold*
-  (the popcount cutoff) is the Bayes-optimal deterministic admission rule, and
-  the uniform constant-weight null gives that threshold exactly the
-  hypergeometric upper tail — in
+- the **finite constant-weight bitmap admission model** — symmetry makes
+  literal overlap the canonical query-preserving invariant, quotient
+  sufficiency reduces the decision to that evidence, a finite overlap-tilt
+  signal model makes an overlap-count threshold Bayes-optimal among
+  deterministic admission rules, and the uniform constant-weight bitmap null
+  assigns that same threshold event exactly the hypergeometric upper tail — in
   [`ordvec-formalization`](https://github.com/Fieldnote-Echo/ordvec-formalization)
   (theorem `exists_uniformBitmapOverlapTail_finiteBayesRisk_le_and_hypergeomTail`).
-  This is an *in-model* result: it fixes the optimal rule's shape under the
-  stated assumptions, not a claim that any given corpus obeys the model.
+
+This is an *in-model* result. It proves the rule shape and the idealized finite
+null under explicit quotient, symmetry, and monotone-overlap assumptions. It
+does not prove that real encoders satisfy those assumptions, that the textbook
+hypergeometric is every deployment corpus's null, or that ordinal quotients are
+representation-complete. Whether true neighbours clear a cutoff remains an
+empirical contract to measure.
 
 Details in [`docs/RANK_MODES.md`](docs/RANK_MODES.md).
 
@@ -149,6 +156,10 @@ Wheels target CPython 3.10+ (abi3); to build from source instead, see
 - **Index-file trust model:**
   [`docs/INDEX_PROVENANCE.md`](docs/INDEX_PROVENANCE.md),
   [`THREAT_MODEL.md`](THREAT_MODEL.md)
+- **Formal proof spine:** [`ordvec-formalization`](https://github.com/Fieldnote-Echo/ordvec-formalization),
+  including its [`proof-spine`](https://github.com/Fieldnote-Echo/ordvec-formalization/blob/main/docs/proof-spine.md),
+  [`theorem-map`](https://github.com/Fieldnote-Echo/ordvec-formalization/blob/main/docs/theorem-map.md),
+  and [`reviewer brief`](https://github.com/Fieldnote-Echo/ordvec-formalization/blob/main/docs/reviewer-brief.md).
 - **API docs:** <https://docs.rs/ordvec>
 - **Paper (OrdVec / RankQuant):** _link TBD — see
   [Research collaboration](#research-collaboration)._
@@ -236,8 +247,10 @@ Collaboration we're actively seeking:
 
 - **Real-corpus evaluation** — running the modes against public corpora
   (GloVe, MTEB / BEIR, OpenAI embedding dumps) beyond the synthetic benchmark.
-- **Theory** — formalising the hypergeometric candidate-generation null and
-  the rank-cosine invariants.
+- **Theory** — extending and independently auditing the `sorry`-free Lean
+  formalization, especially the finite bitmap proof spine, rank-cosine
+  invariants, and empirical diagnostics for when real encoders meet or violate
+  the model assumptions.
 - **Independent reproduction** — re-running the benchmark on other hardware
   and reporting the numbers.
 
diff --git a/ROADMAP.md b/ROADMAP.md
index ec35520..e964d3a 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -57,5 +57,6 @@ into more hosts rather than forcing callers to adapt to it.
 
 ## Out of scope here
 
-Benchmarks and the intellectual framing live in the OrdVec / RankQuant paper;
-this roadmap concerns the crate as an embeddable building block.
+Benchmarks and the intellectual framing live in the OrdVec / RankQuant paper
+and the companion `ordvec-formalization` Lean repository; this roadmap concerns
+the crate as an embeddable building block.
diff --git a/docs/RANK_MODES.md b/docs/RANK_MODES.md
index 8381e17..296ec3d 100644
--- a/docs/RANK_MODES.md
+++ b/docs/RANK_MODES.md
@@ -128,23 +128,29 @@ don't have a hypergeometric null because they don't have fixed
 bucket cardinalities — their score distribution depends on the
 unknown embedding distribution.
 
-**A research program this suggests — now partly machine-checked.** The
-chain — representation → statistic → retrieval theorem → systems
-implementation — now has a checked link in the middle. Under a finite
-monotone-likelihood-ratio overlap-tilt model (the formal cousin of a
-shared-latent-support model, where relevant documents have elevated
-coordinates on a query-specific support set `S_q`), the top-bucket
-overlap statistic is monotone in the likelihood ratio for relevance,
-and an overlap-count threshold — the popcount cutoff — is the
-Bayes-optimal deterministic admission rule, with the uniform
-constant-weight null assigning that threshold exactly the
-hypergeometric upper tail. That is proved `sorry`-free in
-[`ordvec-formalization`](https://github.com/Fieldnote-Echo/ordvec-formalization)
-(`exists_uniformBitmapOverlapTail_finiteBayesRisk_le_and_hypergeomTail`).
-It is an *in-model* result: it fixes the shape of the optimal rule
-under the stated assumptions, not a claim that any given corpus obeys
-the model — whether real neighbours clear the bar stays empirical,
-which is what the bench and the paper measure.
+**Checked finite model: symmetry, quotient sufficiency, threshold,
+calibration.** The proof chain now has a larger machine-checked middle
+than the implementation docs used to claim. In
+[`ordvec-formalization`](https://github.com/Fieldnote-Echo/ordvec-formalization),
+Lean proves that literal bitmap overlap is the canonical invariant
+under query-preserving coordinate relabelings; finite quotient
+sufficiency reduces the admission decision to ordered overlap
+evidence when the likelihood ratio factors through it; a finite
+monotone-likelihood-ratio overlap-tilt model makes an overlap-count
+threshold Bayes-optimal among deterministic admission rules; and the
+uniform constant-weight bitmap null gives that same threshold event
+the exact hypergeometric upper tail. The headline theorem is
+`exists_uniformBitmapOverlapTail_finiteBayesRisk_le_and_hypergeomTail`;
+the proof path is summarized in the formalization repo's
+`docs/proof-spine.md`, with theorem names in `docs/theorem-map.md`.
+
+It is still an *in-model* result. The theorem is about literal
+constant-weight bitmaps, finite deterministic admission, and explicit
+symmetry / quotient / monotone-overlap assumptions. It does not prove
+that real encoders satisfy those assumptions, that the textbook
+hypergeometric is every deployment corpus's null, or that ordinal
+quotients are representation-complete. Whether real neighbours clear a
+cutoff stays empirical, which is what the bench and the paper measure.
 
 The systems consequence is what the bench measures: at a moderate M
 the bitmap probe captures most of exact RankQuant's top-10 neighbours,
@@ -512,3 +518,7 @@ multi-seed stability is your call.
    retrieval still works after the removal is the interesting result:
    on the corpora tested, those components were carrying less than the
    dense-quantization literature assumes.
+6. **Formal boundary.** The Lean result supports the constant-weight
+   bitmap overlap admission model and its idealized null calibration.
+   It does not replace real-corpus recall, null-fit, or monotonicity
+   checks for a deployed encoder.
diff --git a/examples/bench_rank.rs b/examples/bench_rank.rs
index e8c76b0..ff6e102 100644
--- a/examples/bench_rank.rs
+++ b/examples/bench_rank.rs
@@ -18,6 +18,11 @@
 //! - recall@10 against FP32 brute-force cosine ground truth
 //! - candidate-recall (CR) for the two-stage modes (ANN probe quality)
 //!
+//! The bitmap rows also serve as an empirical complement to the Lean
+//! constant-weight overlap model: they measure whether this synthetic or
+//! user-supplied corpus behaves like the monotone-overlap regime assumed by the
+//! theorem.
+//!
 //! DETERMINISM. The corpus, queries, and ground truth are fully seeded,
 //! so every QUALITY column (R@10, CR, bytes/vec, total MiB, ns/dim) is
 //! bit-identical across runs on the same machine. Only the wall-clock
diff --git a/ordvec-python/README.md b/ordvec-python/README.md
index 24649a4..a76707c 100644
--- a/ordvec-python/README.md
+++ b/ordvec-python/README.md
@@ -21,8 +21,20 @@ scores, ids = q.search_asymmetric(np.random.randn(8, 1024).astype(np.float32), k
 |-------|---------|
 | `Rank` | Full-precision rank vectors (u16 per coordinate). |
 | `RankQuant` | Bucketed ranks, `bits` ∈ {1, 2, 4}; symmetric + asymmetric (float-query LUT) scoring. |
-| `Bitmap` | Top-bucket bitmap per document; `popcount(Q AND D)` candidate scoring. |
-| `SignBitmap` | Sign bitmap for sign-cosine candidate generation. |
+| `Bitmap` | Constant-weight top-bucket bitmap per document; `popcount(Q AND D)` candidate scoring. |
+| `SignBitmap` | Sign bitmap for sign-cosine candidate generation; separate from the constant-weight bitmap theorem. |
+
+## Theory and calibration
+
+`Bitmap` exposes the constant-weight top-bucket overlap statistic formalized in
+[`ordvec-formalization`](https://github.com/Fieldnote-Echo/ordvec-formalization).
+In that finite Lean model, literal bitmap overlap is the query-preserving
+quotient statistic, an overlap threshold is Bayes-optimal under explicit
+monotone-overlap assumptions, and the idealized uniform constant-weight null
+calibrates that threshold by the hypergeometric upper tail.
+
+This is not a deployment guarantee for every encoder or corpus. Real-corpus
+recall, monotonicity, and null fit remain empirical diagnostics.
 
 ## Installation
 
diff --git a/ordvec-python/pyproject.toml b/ordvec-python/pyproject.toml
index defc182..25794ac 100644
--- a/ordvec-python/pyproject.toml
+++ b/ordvec-python/pyproject.toml
@@ -49,6 +49,7 @@ dependencies = [
 Homepage = "https://github.com/Fieldnote-Echo/ordvec"
 Repository = "https://github.com/Fieldnote-Echo/ordvec"
 Issues = "https://github.com/Fieldnote-Echo/ordvec/issues"
+Formalization = "https://github.com/Fieldnote-Echo/ordvec-formalization"
 
 [tool.maturin]
 manifest-path = "Cargo.toml"
diff --git a/ordvec-python/python/ordvec/__init__.py b/ordvec-python/python/ordvec/__init__.py
index 73d5ce7..7c2b9d9 100644
--- a/ordvec-python/python/ordvec/__init__.py
+++ b/ordvec-python/python/ordvec/__init__.py
@@ -15,6 +15,14 @@
 read/write functions are reached through the classes' ``write()`` / ``load()``
 methods rather than exposed as standalone free functions.
 
+``Bitmap`` exposes the constant-weight top-bucket overlap statistic formalized
+in the companion ``ordvec-formalization`` Lean repo: under explicit finite
+symmetry, quotient, and monotone-overlap assumptions, an overlap threshold is
+Bayes-optimal and the idealized uniform constant-weight null gives that
+threshold the hypergeometric upper tail. This is an in-model candidate-admission
+claim, not a guarantee that real encoders or deployment corpora satisfy those
+assumptions.
+
 The ``*Index`` names are back-compat aliases for the pre-0.2 turbovec-python
 rank-mode classes; they are kept only to ease script migration and are not part
 of the documented surface — new code should use the OrdVec ontology names above.
diff --git a/ordvec-python/src/lib.rs b/ordvec-python/src/lib.rs
index f21a1aa..31472b7 100644
--- a/ordvec-python/src/lib.rs
+++ b/ordvec-python/src/lib.rs
@@ -1,8 +1,8 @@
 //! Python bindings for [`ordvec`](https://github.com/Fieldnote-Echo/ordvec) — the
 //! training-free ordinal & sign vector-quantization crate.
 //!
-//! Exposes the four retrieval types under the OrdVec ontology — [`Rank`],
-//! [`RankQuant`], [`Bitmap`], [`SignBitmap`] — as a single abi3 extension module
+//! Exposes the four retrieval types under the OrdVec ontology — `Rank`,
+//! `RankQuant`, `Bitmap`, `SignBitmap` — as a single abi3 extension module
 //! (`_ordvec`) wrapped by the `ordvec` Python package.
 //!
 //! The core crate is aliased as `ordvec_core` throughout, so the Rust namespace
@@ -487,7 +487,7 @@ impl Rank {
             .map_err(|e| pyo3::exceptions::PyIOError::new_err(e.to_string()))
     }
 
-    /// Load a `Rank` index from a `.tvr` file previously written by [`Rank.write`].
+    /// Load a `Rank` index from a `.tvr` file previously written by [`Rank::write`].
     ///
     /// `path` is forwarded to the filesystem unmodified — no `..` / traversal
     /// sanitisation — so treat it as trusted input (see the module docstring).
@@ -664,7 +664,7 @@ impl RankQuant {
             .map_err(|e| pyo3::exceptions::PyIOError::new_err(e.to_string()))
     }
 
-    /// Load a `RankQuant` index from a `.tvrq` file written by [`RankQuant.write`].
+    /// Load a `RankQuant` index from a `.tvrq` file written by [`RankQuant::write`].
     ///
     /// `path` is forwarded to the filesystem unmodified — no `..` / traversal
     /// sanitisation — so treat it as trusted input (see the module docstring).
@@ -694,6 +694,10 @@ impl RankQuant {
     /// filled are returned as ``-1``. Uses the same AVX-512 → AVX2 → scalar
     /// dispatch as ``search_asymmetric``.
     ///
+    /// If the shortlist came from [`Bitmap`], this is the exact RankQuant
+    /// rerank stage over that survivor set; it does not itself apply or
+    /// calibrate a bitmap overlap threshold.
+    ///
     /// ``candidates`` may be a 1-D array of any integer dtype — the ``uint32``
     /// emitted by ``top_m_candidates``/``top_m_candidates_batched`` or a plain
     /// ``int64`` index array (``np.arange``, ``np.where(...)[0]``, fancy-index
@@ -762,6 +766,8 @@ impl Bitmap {
     /// Construct a top-bucket bitmap index. `dim` must be a positive multiple of
     /// 64; `n_top` (how many coordinates per document are flagged "top", e.g.
     /// dim/4 for the b=2-equivalent top quarter) must satisfy `0 < n_top < dim`.
+    /// Every stored document and query bitmap has exactly `n_top` active
+    /// coordinates, matching the constant-weight overlap model.
     #[new]
     fn new(dim: usize, n_top: usize) -> PyResult<Self> {
         if dim == 0 || !dim.is_multiple_of(64) {
@@ -821,7 +827,9 @@ impl Bitmap {
     }
 
     /// Bitmap-overlap search: returns the top-`k` doc indices by
-    /// popcount(query_top AND doc_top).
+    /// popcount(query_top AND doc_top). This uses the overlap statistic from
+    /// the finite threshold theorem but returns a top-k ranking, not a
+    /// calibrated threshold-admission set.
     fn search<'py>(
         &self,
         py: Python<'py>,
@@ -848,7 +856,9 @@ impl Bitmap {
 
     /// Return top-`m` candidate doc IDs for a single query as a 1-D `uint32`
     /// array. Used as the candidate generator for two-stage retrieval (bitmap
-    /// probe → exact RankQuant rerank).
+    /// probe → exact RankQuant rerank). This is a fixed-budget shortlist over
+    /// the formal overlap statistic; the theorem's admission rule is an
+    /// explicit overlap threshold.
     fn top_m_candidates<'py>(
         &self,
         py: Python<'py>,
@@ -868,7 +878,8 @@ impl Bitmap {
 
     /// Build the query-side top-`n_top` bitmap from an FP32 query, returned as a
     /// 1-D `uint64` array of `dim / 64` words (the doc-side packing). Pairs with
-    /// [`Bitmap.body_overlap_scores_subset`] for staged rescoring.
+    /// [`Bitmap::body_overlap_scores_subset`] for staged rescoring. The returned
+    /// bitmap has exactly `n_top` active coordinates.
     fn build_query_bitmap_fp32<'py>(
         &self,
         py: Python<'py>,
@@ -889,6 +900,8 @@ impl Bitmap {
     /// dim)` f32 array; returns a 2-D `uint32` array of shape `(batch, m_eff)`
     /// where `m_eff = min(m, len(index))`. The column count is `m_eff`
     /// regardless of `batch`, so an empty `(0, dim)` input returns `(0, m_eff)`.
+    /// Each row has the same fixed-budget semantics as
+    /// [`Bitmap::top_m_candidates`].
     fn top_m_candidates_batched<'py>(
         &self,
         py: Python<'py>,
@@ -928,11 +941,12 @@ impl Bitmap {
     }
 
     /// Chunked batched candidate generation: like
-    /// [`Bitmap.top_m_candidates_batched`] but processes `queries` in groups of
+    /// [`Bitmap::top_m_candidates_batched`] but processes `queries` in groups of
     /// `batch_size` rows in parallel — use when the full query workload is
     /// larger than one batch fits efficiently in cache. `queries` is a 2-D `(n,
     /// dim)` f32 array; returns a 2-D `uint32` array `(n, m_eff)`. `batch_size`
-    /// must be > 0.
+    /// must be > 0. Each row has the same fixed-budget semantics as
+    /// [`Bitmap::top_m_candidates`].
     fn top_m_candidates_batched_chunked<'py>(
         &self,
         py: Python<'py>,
@@ -991,9 +1005,11 @@ impl Bitmap {
 
     /// Compute bitmap-overlap scores for a subset of doc IDs against a pre-built
     /// query bitmap. `q_bitmap` is a 1-D `uint64` array of `dim / 64` words
-    /// (e.g. from [`Bitmap.build_query_bitmap_fp32`]); `doc_ids` is a 1-D
+    /// (e.g. from [`Bitmap::build_query_bitmap_fp32`]); `doc_ids` is a 1-D
     /// integer array of any dtype (converted to `uint32`) whose ids must be in
     /// range. Returns a 1-D `uint32` array of overlap scores aligned to `doc_ids`.
+    /// These are the exact overlap values to compare against an explicit
+    /// calibrated cutoff when using threshold admission.
     ///
     /// `doc_ids` must additionally be sorted ascending. This is a *Python-side
     /// ergonomic policy*, not a core requirement: the Rust core accepts unsorted
@@ -1058,7 +1074,7 @@ impl Bitmap {
             .map_err(|e| pyo3::exceptions::PyIOError::new_err(e.to_string()))
     }
 
-    /// Load a `Bitmap` index from a `.tvbm` file written by [`Bitmap.write`].
+    /// Load a `Bitmap` index from a `.tvbm` file written by [`Bitmap::write`].
     ///
     /// `path` is forwarded to the filesystem unmodified — no `..` / traversal
     /// sanitisation — so treat it as trusted input (see the module docstring).
@@ -1104,7 +1120,8 @@ impl SignBitmap {
     /// 1-bit-per-coord sign-cosine retrieval substrate. `dim` must be a positive
     /// multiple of 64, at most `MAX_SIGN_BITMAP_DIM`. No `n_top` parameter — the
     /// threshold is data-independent (bit set iff coord > 0). Storage: `dim / 8`
-    /// bytes per doc (128 B at D = 1024).
+    /// bytes per doc (128 B at D = 1024). This is separate from the
+    /// constant-weight `Bitmap` theorem and its hypergeometric overlap null.
     #[new]
     fn new(dim: usize) -> PyResult<Self> {
         if dim == 0 || !dim.is_multiple_of(64) {
@@ -1297,7 +1314,7 @@ impl SignBitmap {
     }
 
     /// Load a `SignBitmap` from a `.tvsb` file previously written by
-    /// [`SignBitmap.write`]. Raises `IOError` if the file is missing, malformed,
+    /// [`SignBitmap::write`]. Raises `IOError` if the file is missing, malformed,
     /// or its payload length disagrees with the header-declared shape.
     ///
     /// `path` is forwarded to the filesystem unmodified — no `..` / traversal
diff --git a/ordvec-python/tests/test_bitmap.py b/ordvec-python/tests/test_bitmap.py
index 8cf95a3..0aee255 100644
--- a/ordvec-python/tests/test_bitmap.py
+++ b/ordvec-python/tests/test_bitmap.py
@@ -3,9 +3,9 @@
 Exercise the top-bucket bitmap candidate-generator API (add, search,
 top_m_candidates, save/load, accessors) and the two-stage flow with a
 RankQuant rerank. Algorithmic correctness (kernel-vs-scalar parity,
-hypergeometric null distribution checks, bilinear identity proofs) lives
-in the crate's Rust tests under `tests/index/`; these tests cover the
-pyo3 boundary only.
+constant-weight bitmap invariants, bilinear identity checks) lives in the
+crate's Rust tests under `tests/index/`; the formal overlap-null theorem lives
+in `ordvec-formalization`. These tests cover the pyo3 boundary only.
 """
 from __future__ import annotations
 
diff --git a/ordvec-python/tests/test_redteam_fuzz.py b/ordvec-python/tests/test_redteam_fuzz.py
index 436b624..21fba01 100644
--- a/ordvec-python/tests/test_redteam_fuzz.py
+++ b/ordvec-python/tests/test_redteam_fuzz.py
@@ -1013,7 +1013,8 @@ def test_bitmap_load_forged_n_top_io_error(tmp_path):
 
 def test_bitmap_load_corrupt_popcount_io_error(tmp_path):
     # Zero the payload → each row popcount 0 != n_top, violating the per-row
-    # popcount invariant the hypergeometric model assumes.
+    # popcount invariant the constant-weight bitmap-null / formal overlap model
+    # assumes.
     idx = Bitmap(dim=128, n_top=32)
     idx.add(unit_vectors(20, 128))
     real = str(tmp_path / "real.tvbm")
diff --git a/src/bitmap.rs b/src/bitmap.rs
index 6b9cf3c..07536a1 100644
--- a/src/bitmap.rs
+++ b/src/bitmap.rs
@@ -8,11 +8,18 @@
 //! For `dim=1024, n_top=256` this is 128 B/doc — half of RankQuant b=2's
 //! storage — and equivalent to the top bucket of RankQuant b=2.
 //!
-//! Score = `popcount(Q_bitmap AND D_bitmap) ∈ [0, n_top]`. The null
-//! distribution for a random doc is hypergeometric
-//! `H(N=dim, K=n_top, n=n_top)` with mean `n_top² / dim`, which lets
-//! callers compute a closed-form p-value for the overlap — a property
-//! magnitude quantizers don't have.
+//! Score = `popcount(Q_bitmap AND D_bitmap) ∈ [0, n_top]`. Under the
+//! idealized uniform constant-weight bitmap null, a random document's overlap
+//! is hypergeometric `H(N=dim, K=n_top, n=n_top)` with mean `n_top² / dim`,
+//! which lets callers compute a closed-form p-value for an overlap cutoff — a
+//! property magnitude quantizers don't have.
+//!
+//! This is also the implementation surface for the companion Lean
+//! formalization's finite model: literal constant-weight bitmap overlap is the
+//! query-preserving quotient statistic, an overlap-tail rule is Bayes-optimal
+//! under explicit monotone-overlap assumptions, and the same threshold event has
+//! the hypergeometric upper tail under the idealized uniform null. The theorem
+//! is in-model only; real encoders and deployment nulls remain empirical.
 //!
 //! Intended primary use: candidate generator for two-stage retrieval
 //! (bitmap probe → top-M candidates → exact RankQuant rerank).
@@ -24,6 +31,12 @@ use crate::util::{and_popcount, assert_all_finite, result_buffer_len, TopK};
 use crate::SearchResults;
 
 /// Top-bucket bitmap index for constant-composition coarse scoring.
+///
+/// The checked finite theorem attached to this surface concerns threshold
+/// admission on literal `n_top`-active bitmaps. The top-k helpers below are
+/// engineering conveniences over the same overlap statistic; calibrated
+/// threshold admission still requires choosing and validating a cutoff for the
+/// caller's corpus.
 pub struct Bitmap {
     dim: usize,
     n_top: usize,
@@ -99,7 +112,9 @@ impl Bitmap {
     /// Build the query-side bitmap from the *FP32 query directly*
     /// (top `n_top` coordinates by value). This preserves the
     /// "rich query, cheap docs" asymmetry of the rank-cosine paper:
-    /// the query side never gets rank-quantised.
+    /// the query side never gets rank-quantised. The returned query bitmap has
+    /// exactly `n_top` active coordinates, matching the constant-weight
+    /// overlap model.
     pub fn build_query_bitmap_fp32(&self, q: &[f32]) -> Vec<u64> {
         assert_eq!(q.len(), self.dim);
         assert_all_finite(q);
@@ -122,6 +137,10 @@ impl Bitmap {
     /// Search: returns the top-`k` documents by popcount-overlap with
     /// the query's top-bucket bitmap. Scores are `popcount(Q AND D)`
     /// reported as f32 (in `[0, n_top]`).
+    ///
+    /// This ranks by the same overlap statistic used by the finite threshold
+    /// theorem, but it returns a top-k list rather than applying a calibrated
+    /// admission cutoff.
     pub fn search(&self, queries: &[f32], k: usize) -> SearchResults {
         let nq = queries.len() / self.dim;
         assert_eq!(queries.len(), nq * self.dim);
@@ -172,7 +191,9 @@ impl Bitmap {
 
     /// Return the top-`m` candidate document indices for a single
     /// query, ordered by bitmap-overlap desc. Helper for two-stage
-    /// retrieval.
+    /// retrieval. This is a fixed-budget shortlist over the formal overlap
+    /// statistic; the theorem's admission rule is a threshold event, not an
+    /// `m`-sized survivor set.
     ///
     /// For large `m` this would exhibit O(N · m) behaviour through a
     /// streaming top-k buffer (each replacement triggers a linear
@@ -224,7 +245,8 @@ impl Bitmap {
     ///
     /// `queries` is a flat `batch * dim` f32 slice. Returns
     /// `Vec<Vec<u32>>` with one top-`m` set per query, sorted by
-    /// overlap descending.
+    /// overlap descending. Each row has the same fixed-budget semantics as
+    /// [`Self::top_m_candidates`].
     #[must_use = "this scans the corpus per query to generate candidates; dropping the result discards that work"]
     pub fn top_m_candidates_batched(&self, queries: &[f32], m: usize) -> Vec<Vec<u32>> {
         let dim = self.dim;
@@ -295,7 +317,8 @@ impl Bitmap {
     /// `batch_size` and runs each chunk through
     /// [`Self::top_m_candidates_batched`] in parallel via rayon. Use
     /// when the full query workload is larger than one batch fits
-    /// efficiently in L2/L3.
+    /// efficiently in L2/L3. This preserves the fixed-budget shortlist
+    /// semantics of [`Self::top_m_candidates`].
     #[must_use = "this scans the corpus per query to generate candidates; dropping the result discards that work"]
     pub fn top_m_candidates_batched_chunked(
         &self,
@@ -343,7 +366,9 @@ impl Bitmap {
     /// index addresses at most `u32::MAX` documents.
     ///
     /// Public surface to support staged-pipeline callers that need to
-    /// rescore a small survivor set under the exact body overlap.
+    /// rescore a small survivor set under the exact body overlap. The returned
+    /// overlaps are also the values to compare against an explicit calibrated
+    /// cutoff when using threshold admission.
     pub fn body_overlap_scores_subset(&self, q_bitmap: &[u64], doc_ids: &[u32], out: &mut [u32]) {
         let qpv = self.qwords_per_vec;
         assert_eq!(q_bitmap.len(), qpv);
diff --git a/src/fastscan.rs b/src/fastscan.rs
index a39b8de..5149bd5 100644
--- a/src/fastscan.rs
+++ b/src/fastscan.rs
@@ -21,6 +21,8 @@
 //! `pub(crate)`: production callers should reach for
 //! [`RankQuant::search_asymmetric`](crate::RankQuant::search_asymmetric),
 //! whose AVX-512 → AVX2 → scalar dispatch is the maintained surface.
+//! This latency path is not part of the constant-weight bitmap overlap
+//! calibration theorem.
 //!
 //! # Provenance
 //!
diff --git a/src/lib.rs b/src/lib.rs
index 3972dec..24dd8b4 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -20,6 +20,15 @@
 //!   coordinate, set when the coordinate is positive) for sign-cosine
 //!   candidate generation.
 //!
+//! The [`Bitmap`] candidate score is the implementation surface with the
+//! strongest formal story: in the companion Lean formalization, literal
+//! constant-weight bitmap overlap is the query-preserving quotient statistic,
+//! an overlap threshold is Bayes-optimal under an explicit finite
+//! monotone-overlap signal model, and the idealized uniform constant-weight null
+//! calibrates that threshold by the hypergeometric upper tail. This is a finite
+//! in-model theorem, not a claim that real encoders automatically satisfy the
+//! quotient, symmetry, or null assumptions.
+//!
 //! ```no_run
 //! use ordvec::{Rank, RankQuant};
 //!
@@ -67,8 +76,9 @@ pub use sign_bitmap::SignBitmap;
 pub use quant::search_asymmetric_byte_lut;
 
 // `MultiBucketBitmap` underwrites the bilinear bucket-overlap
-// decomposition but is not stable public API. It is reachable only with
-// the `experimental` feature; the default surface excludes it.
+// decomposition but is not the constant-weight top-bucket theorem surface and
+// is not stable public API. It is reachable only with the `experimental`
+// feature; the default surface excludes it.
 #[cfg(feature = "experimental")]
 pub use multi_bucket::MultiBucketBitmap;
 
diff --git a/src/multi_bucket.rs b/src/multi_bucket.rs
index 878fb18..0b4a2d9 100644
--- a/src/multi_bucket.rs
+++ b/src/multi_bucket.rs
@@ -8,7 +8,7 @@
 //! score(q, d) = Σ_{a, b} W[a, b] · |Q_a ∩ D_b|
 //! ```
 //!
-//! for arbitrary weights `W[2^bits][2^bits]` is the formal object the
+//! for arbitrary weights `W[2^bits][2^bits]` is the algebraic object the
 //! scoring decomposes into. For **outer-product weights**
 //! `W[a, b] = (a − c)(b − c)` with `c = (2^bits − 1) / 2` this is
 //! algebraically identical to the symmetric RankQuant per-coord score
@@ -25,6 +25,10 @@
 //! bilinear decomposition empirically and serve as the reference for
 //! **truncated** weight matrices (top-k buckets only, diagonal-only,
 //! banded) which are the principled candidate-generation primitives.
+//!
+//! The companion Lean theorem currently attaches to the literal top-bucket
+//! constant-weight [`crate::Bitmap`] overlap statistic and its threshold tail,
+//! not to arbitrary bilinear weights over all buckets.
 
 use rayon::prelude::*;
 
diff --git a/src/quant.rs b/src/quant.rs
index 2d9562d..0e4a2ff 100644
--- a/src/quant.rs
+++ b/src/quant.rs
@@ -549,6 +549,10 @@ impl RankQuant {
     /// probe), but the copy grows linearly in `M`. For very large `M`
     /// (e.g. misuse via FFI), a full [`Self::search_asymmetric`] may be
     /// cheaper; a gather-free in-place scan is tracked for the FFI work.
+    ///
+    /// If the candidate list came from [`crate::Bitmap`], this method reranks
+    /// that shortlist exactly under RankQuant; it does not itself carry the
+    /// bitmap threshold-calibration guarantee.
     pub fn search_asymmetric_subset(
         &self,
         query: &[f32],
diff --git a/src/rank_io.rs b/src/rank_io.rs
index d34d010..32be790 100644
--- a/src/rank_io.rs
+++ b/src/rank_io.rs
@@ -541,9 +541,10 @@ pub(crate) fn load_bitmap(path: impl AsRef<Path>) -> io::Result<(usize, usize, u
     let bitmaps = read_le_vec(&mut f, payload_bytes / 8, u64::from_le_bytes)?;
     // Constant-composition invariant: every document bitmap must have exactly
     // `n_top` bits set (it flags the document's top `n_top` coordinates). The
-    // hypergeometric null model and the documented `[0, n_top]` score range
-    // both assume this, so a forged row with valid shape but a different
-    // popcount would break both. Verify per-row popcount at the boundary.
+    // idealized uniform constant-weight hypergeometric null model and the
+    // documented `[0, n_top]` score range both assume this, so a forged row
+    // with valid shape but a different popcount would break both. Verify
+    // per-row popcount at the boundary.
     for (row_idx, row) in bitmaps.chunks_exact(qpv).enumerate() {
         let pop: u32 = row.iter().map(|w| w.count_ones()).sum();
         if pop as usize != n_top {
diff --git a/src/sign_bitmap.rs b/src/sign_bitmap.rs
index 2f5435a..8efd63a 100644
--- a/src/sign_bitmap.rs
+++ b/src/sign_bitmap.rs
@@ -17,6 +17,10 @@
 //! candidate selector takes top-M docs by **lowest** Hamming
 //! (= **highest** agreement).
 //!
+//! This is a separate sign-agreement primitive. It is not a constant-weight
+//! bitmap space and is not covered by [`crate::Bitmap`]'s hypergeometric
+//! overlap-tail theorem.
+//!
 //! Kernel architecture mirrors [`crate::Bitmap`] (single-query
 //! and CHUNK=8 batched hot+tail paths under AVX-512 VPOPCNTDQ). The
 //! only material difference is `_mm512_xor_si512` in place of