search: Cohere reranking (pro tier) regresses MRR on the retrieval eval #322

Description

@DevanshuNEU

Finding (dogfood F-017)

Surfaced while calibrating the retrieval-quality eval harness (#312, PR #321) against the OCI repo index over the 10-query ground-truth set.

Cohere reranking is net-negative on rank quality on this set:

| Tier | recall@10 | MRR |
| --- | --- | --- |
| Free (core ranker) | 0.80 | 0.80 |
| Pro (Cohere rerank) | 0.85 | 0.658 |

Reranking buys +0.05 recall@10 (pulls one extra expected file into the top 10) but drops MRR 0.80 -> 0.658 by demoting rank-1 hits.
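To make the trade-off concrete, here is a minimal sketch of how the two metrics react differently to the same reordering. The function names, file paths, and the single example query are hypothetical, and the definitions (recall@10 as the fraction of expected files in the top 10, MRR as the mean of 1/rank of the first expected file) are the standard ones, assumed rather than copied from the eval harness.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], expected: Set[str], k: int = 10) -> float:
    """Fraction of expected files that appear in the top k results."""
    if not expected:
        return 0.0
    return len(set(ranked[:k]) & expected) / len(expected)


def reciprocal_rank(ranked: Sequence[str], expected: Set[str]) -> float:
    """1 / rank of the first expected file, or 0.0 if none is returned."""
    for position, path in enumerate(ranked, start=1):
        if path in expected:
            return 1.0 / position
    return 0.0


def mean(values: Sequence[float]) -> float:
    """Average over queries; MRR is mean(reciprocal_rank) across the set."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical single query: two expected files.
expected = {"auth/session.py", "auth/tokens.py"}

# Core ranker: one expected file at rank 1, the other missing from the list.
core = ["auth/session.py", "docs/auth.md", "tests/test_auth.py"]

# Reranked: both expected files present, but the first one drops to rank 3.
reranked = ["docs/auth.md", "tests/test_auth.py", "auth/session.py", "auth/tokens.py"]

core_recall = recall_at_k(core, expected)
core_rr = reciprocal_rank(core, expected)
reranked_recall = recall_at_k(reranked, expected)
reranked_rr = reciprocal_rank(reranked, expected)
```

In this illustrative query, recall rises because the second expected file enters the list, while the reciprocal rank falls because the first expected file is no longer at rank 1. That is the same pattern the aggregate numbers above show.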

Why this matters

For an agent consumer, the rank-1 hit carries most of the value: it reads top results first and pays per token, so a strong top hit is worth more than marginally better deep-list coverage. Trading MRR for recall is the wrong trade here, yet reranking is the pro-tier default. A paying user can get worse top-rank quality than the free tier.

Suggested investigation

  • Inspect the per-query breakdown (eval results/ JSON) to identify exactly which queries the reranker demotes.
  • Options to evaluate:
    • Only rerank when the core ranker's top hit is low-confidence (conditional rerank).
    • Blend the rerank score with the original rank instead of a full reorder.
    • Reconsider whether Cohere rerank should be the pro-tier default on small codebases at all.
  • Re-run python -m evals after any change and compare against the calibrated baseline (free recall@10 0.80 / MRR 0.80).
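The first two options could be prototyped roughly as follows. This is a sketch under stated assumptions, not the project's implementation: `conditional_rerank`, `blend_ranks`, the `confidence_threshold` and `core_weight` parameters, and the idea that the core ranker exposes a comparable top-hit score are all hypothetical. The reranker is passed in as a plain callable so no particular Cohere API shape is assumed.

```python
from typing import Callable, List, Sequence


def conditional_rerank(
    ranked: Sequence[str],
    top_score: float,
    rerank: Callable[[Sequence[str]], List[str]],
    confidence_threshold: float = 0.5,  # hypothetical cut-off, needs tuning
) -> List[str]:
    """Keep the core order when its top hit looks confident; otherwise rerank."""
    if top_score >= confidence_threshold:
        return list(ranked)
    return rerank(ranked)


def blend_ranks(
    core_ranked: Sequence[str],
    reranked: Sequence[str],
    core_weight: float = 0.5,  # hypothetical; 1.0 keeps the core order as-is
) -> List[str]:
    """Order items by a weighted sum of reciprocal ranks from both orderings."""
    core_pos = {path: i for i, path in enumerate(core_ranked, start=1)}
    rerank_pos = {path: i for i, path in enumerate(reranked, start=1)}

    def score(path: str) -> float:
        blended = core_weight / core_pos[path]
        if path in rerank_pos:
            blended += (1.0 - core_weight) / rerank_pos[path]
        return blended

    # sorted() is stable, so ties fall back to the core ranker's order.
    return sorted(core_ranked, key=score, reverse=True)


# Stand-in reranker for illustration only: it simply reverses the list.
def fake_rerank(items: Sequence[str]) -> List[str]:
    return list(reversed(items))


confident = conditional_rerank(["a.py", "b.py"], top_score=0.9, rerank=fake_rerank)
uncertain = conditional_rerank(["a.py", "b.py"], top_score=0.1, rerank=fake_rerank)
blended = blend_ranks(["a.py", "b.py", "c.py"], ["c.py", "b.py", "a.py"], core_weight=0.7)
```

Either variant would still need to be checked with `python -m evals` against the calibrated baseline. The threshold and weight here are placeholders, not tuned values.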
