Chroma backend search score is metric-blind and not comparable across backends

`ChromaCollection.search` (`vd/backends/chroma.py`) converts every distance to a score with:

```python
score = 1.0 / (1.0 + distance)
```

regardless of the collection's distance metric. The inline comment even notes that, for cosine, ChromaDB returns `1 - cosine`, so for a cosine collection the reported score is `1 / (2 - cosine_sim)`. That value is monotonic in the similarity but non-standard and hard to interpret: it only spans `[1/3, 1]` as `cosine_sim` goes from `-1` to `1`. The code never reads the collection's actual configured metric (`hnsw:space`).
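
To make the mapping concrete, here is a small standalone sketch of the transform quoted above applied to cosine distances. The helper names `current_chroma_score` and `cosine_distance` are hypothetical and exist only for this illustration; they are not part of `vd`.

```python
def current_chroma_score(distance: float) -> float:
    """The metric-blind transform quoted above: 1 / (1 + distance)."""
    return 1.0 / (1.0 + distance)


def cosine_distance(cosine_sim: float) -> float:
    """Cosine distance as described in the inline comment: 1 - cosine."""
    return 1.0 - cosine_sim


# Walk a few cosine similarities through the current transform.
for cosine_sim in (1.0, 0.5, 0.0, -1.0):
    score = current_chroma_score(cosine_distance(cosine_sim))
    print(f"cosine_sim={cosine_sim:+.1f} -> score={score:.3f}")
# cosine_sim=+1.0 -> score=1.000
# cosine_sim=+0.5 -> score=0.667
# cosine_sim=+0.0 -> score=0.500
# cosine_sim=-1.0 -> score=0.333
```

The ordering is unchanged, but orthogonal vectors score `0.5` and opposite vectors still score `0.333`, which is why the number does not line up with a cosine similarity from another backend.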

**Impact:** ranking is preserved (the transform is monotonic), so results are not mis-ordered. But:
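- Scores are **not comparable across backends**: the `memory` backend returns cosine similarity directly in `[-1, 1]`.
- Scores are **not interpretable**: a score of `0.5` corresponds to `cosine_sim = 0` in a cosine collection but to a distance of `1` under any other metric, so the same number means different things.
- `vd`'s own `reciprocal_rank_fusion` / `deduplicate_results` / `multi_query_search` helpers and `ef`'s `SearchHit.score` all consume these values, so the inconsistency propagates downstream.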

**Proposal:**
- [ ] Read the collection's configured metric and convert to a single documented canonical score (suggest: cosine similarity in `[-1, 1]`, or a documented `similarity = 1 - distance` for cosine / `-distance` for L2).
- [ ] Document the score semantics in the `Collection.search` contract in `base.py`, so every backend (memory, chroma, future) agrees on what `score` means.
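
A minimal sketch of what the first checkbox could look like, under stated assumptions: the function names `configured_space` and `distance_to_score` are hypothetical, the metric is assumed to be available as an `hnsw:space` entry in a metadata mapping, and a missing entry is assumed to mean `l2`. The two mappings are the ones suggested above (`1 - distance` for cosine, `-distance` for L2); any other metric is rejected rather than guessed at.

```python
from typing import Mapping, Optional


def configured_space(metadata: Optional[Mapping[str, object]]) -> str:
    """Return the configured metric name.

    Assumption: the metric lives under the "hnsw:space" key of the
    collection metadata, and a missing key means "l2".
    """
    return str((metadata or {}).get("hnsw:space", "l2"))


def distance_to_score(distance: float, space: str) -> float:
    """Convert a backend distance to the canonical score proposed above."""
    if space == "cosine":
        # Cosine distance is 1 - cosine_sim, so this recovers cosine_sim.
        return 1.0 - distance
    if space == "l2":
        # Higher is better: negate so that closer vectors score higher.
        return -distance
    # Fail loudly instead of inventing semantics for an unhandled metric.
    raise ValueError(f"no documented score mapping for metric {space!r}")


space = configured_space({"hnsw:space": "cosine"})
print(distance_to_score(0.25, space))  # 0.75
```

With this shape, a cosine collection would report the same `[-1, 1]` cosine similarity as the `memory` backend, and the L2 case keeps "higher is better" without pretending to be a similarity.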

