Context
Follow-up to #62 (deps-as-text surface) — thread #61. #62 added an opt-in deps surface to ir.strategy.Package. Measured on the 231-package corpus (all-MiniLM, hybrid) it produces real, correctly-targeted dependency signal — allude 83→28 (depends on meshed), xcosmo 103→40 (depends on cosmograph), while a true non-dependent stays flat (au 22→23). But as an equal-weighted surface in hybrid RRF the lift mostly lands in the rank 20–50 band, so aggregate nDCG@20 moves only +0.023 (embeddings) / +0.00 (graphs), and a few tail items jitter by 1–4 ranks.
Problem
The deps surface is a high-precision, low-noise signal (exact library tokens), but RRF fuses every surface-hit equally, so a single dep-token match is averaged in with prose chunks rather than boosted. ir/retrieve.py already filters/branches on surface_kind (~line 106), so the hook exists — what's missing is a per-surface-kind weight at fusion.
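To make the dilution concrete, here is a small arithmetic sketch of plain reciprocal rank fusion. It assumes the conventional constant k = 60 and a per-artifact collapse that sums contributions; neither is confirmed for ir, and the two artifacts are hypothetical.

```python
# Plain RRF: every ranked list a hit appears in adds 1 / (k + rank).
# ASSUMPTION: k = 60 and sum-collapse per artifact; ir may differ.
K = 60


def rrf_contribution(rank: int, k: int = K) -> float:
    """Score one hit adds to its artifact, given its 1-based rank in one list."""
    return 1.0 / (k + rank)


# Hypothetical artifact A: rank 1 on the deps surface, absent everywhere else.
deps_only = rrf_contribution(1)

# Hypothetical artifact B: rank 15 on each of two prose surfaces.
two_prose = rrf_contribution(15) + rrf_contribution(15)

print(f"deps-only rank-1 hit:    {deps_only:.4f}")   # 0.0164
print(f"two mid-rank prose hits: {two_prose:.4f}")   # 0.0267
```

Under these assumptions an exact dependency match at rank 1 still loses to two middling prose hits, which is consistent with the lift landing in the 20–50 band.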
Proposal (single-shot retrieval seam → ir, per #38)
Add an optional surface_weights: Mapping[str, float] | None (e.g. {"deps": 2.0}) to ir.search / fuse_hits, applied as a multiplier on a surface-hit's fused contribution (RRF rank-weight or blend score) before the per-artifact collapse. Default None = today's equal weighting (progressive disclosure). Keep it embedder-agnostic and offline.
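A minimal sketch of what the weighted fusion could look like. The function name fuse_hits_sketch, the (artifact_id, surface_kind) hit shape, k = 60, and the sum-collapse are all assumptions for illustration, not the actual ir.search / fuse_hits code.

```python
from collections import defaultdict
from typing import Mapping, Optional, Sequence, Tuple

# ASSUMPTION: a hit is (artifact_id, surface_kind); the real ir hit type may differ.
Hit = Tuple[str, str]


def fuse_hits_sketch(
    ranked_lists: Sequence[Sequence[Hit]],
    surface_weights: Optional[Mapping[str, float]] = None,
    k: int = 60,
) -> list:
    """Weighted RRF: scale each hit's rank contribution by its surface weight,
    then collapse per artifact by summing (an assumed collapse rule)."""
    weights = surface_weights or {}  # None means today's equal weighting
    scores = defaultdict(float)
    for hits in ranked_lists:
        for rank, (artifact_id, surface_kind) in enumerate(hits, start=1):
            w = weights.get(surface_kind, 1.0)  # unlisted kinds keep weight 1
            scores[artifact_id] += w / (k + rank)
    # Highest score first; ties broken by artifact id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical rankings: pkg_a wins on prose, pkg_b has the exact deps match.
lists = [
    [("pkg_a", "readme"), ("pkg_b", "readme")],
    [("pkg_a", "docstring")],
    [("pkg_b", "deps")],
]
print(fuse_hits_sketch(lists)[0][0])                 # pkg_a
print(fuse_hits_sketch(lists, {"deps": 2.0})[0][0])  # pkg_b
```

Passing None or an empty mapping reproduces the unweighted ordering, so existing callers would see no change.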
Experiment
Reuse the #66 harness + the private benchmark. Sweep surface_weights={"deps": w} for w ∈ {1, 1.5, 2, 3} via compare_indexings and pick the w that maximizes hard-positive recall@20 on the dep-revealing set (allude, xcosmo, chromadol, http_cosmo_prep, lexis, unbox) without raising the distractor fp_rate (esp. au/creek/strand must stay put).
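The selection rule can be sketched as below. SweepResult, recall_at_k and pick_weight are hypothetical helpers, and the sketch deliberately does not call compare_indexings, whose signature is not given here; it only consumes whatever per-weight numbers the harness reports.

```python
from typing import Iterable, NamedTuple


class SweepResult(NamedTuple):
    """One row of the sweep (hypothetical container)."""
    weight: float
    recall_at_20: float
    fp_rate: float


def recall_at_k(ranks: Iterable[int], k: int = 20) -> float:
    """Fraction of hard positives whose 1-based rank falls inside the top k."""
    ranks = list(ranks)
    return sum(r <= k for r in ranks) / len(ranks) if ranks else 0.0


def pick_weight(results: Iterable[SweepResult], baseline_weight: float = 1.0) -> SweepResult:
    """Maximize recall@20 among weights whose distractor fp_rate does not
    exceed the equal-weight baseline; ties go to the smaller weight."""
    results = list(results)
    baseline = next(r for r in results if r.weight == baseline_weight)
    admissible = [r for r in results if r.fp_rate <= baseline.fp_rate]
    return max(admissible, key=lambda r: (r.recall_at_20, -r.weight))


# Ranks from the #62 equal-weight run: allude at 28, xcosmo at 40.
print(recall_at_k([28, 40]))  # 0.0
```

Because the baseline w = 1 is always admissible, pick_weight falls back to today's behaviour when every heavier weight raises the distractor fp_rate.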
Success metric
allude/xcosmo (and ideally chromadol/http_cosmo_prep) into top-20 → hard-positive recall@20 up materially vs the #62 equal-weight result.
.regressions() empty on the distractor set; aggregate nDCG@20 ≥ the #62 number.
Package(embed_deps=True) on by default and rebuild the live packages corpus.
Part of #61.
https://claude.ai/code/session_01D229oNHVN1drd1mdbQL5MV