fix: deterministic package repo tie-break #4263
Signed-off-by: anilb <epipav@gmail.com>
Your PR title doesn't contain a Jira issue key. Please add one for better traceability.
Pull request overview
This PR aims to make “winner repo” selection deterministic when a package maps to multiple repositories with tied confidence, so Postgres-derived API responses and Tinybird-enriched health/scorecard data don’t fluctuate run-to-run.
Changes:
- Postgres: adds `repo_id DESC` as a deterministic tie-breaker to several `LIMIT 1` repo-selection laterals in `services/libs/data-access-layer/src/osspckgs/api.ts`.
- Tinybird: adds `repoId` as a final component in the `argMax` tuple for per-package repo selection in `ossPackages_enriched.pipe`.
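
To illustrate why the extra sort key matters, here is a minimal TypeScript sketch, not the code from this PR. It emulates a `LIMIT 1` pick over tied rows. The `PackageRepo` shape, the string-typed `repoId`, and both function names are assumptions made for the example. Note that JavaScript string comparison may not match the database's ordering for the real `repo_id` type.

```typescript
// Hypothetical row shape: the real package_repos schema may differ.
interface PackageRepo {
  repoId: string; // assumed to be a string for this sketch
  confidence: number;
}

// Emulates `ORDER BY confidence DESC LIMIT 1`: on a tie, the winner is
// whichever tied row happens to be scanned first.
function pickByConfidenceOnly(rows: PackageRepo[]): PackageRepo | undefined {
  let best: PackageRepo | undefined;
  for (const row of rows) {
    if (best === undefined || row.confidence > best.confidence) best = row;
  }
  return best;
}

// Emulates `ORDER BY confidence DESC, repo_id DESC LIMIT 1`: ties on
// confidence are broken by the larger repoId, so scan order is irrelevant.
function pickWithTieBreak(rows: PackageRepo[]): PackageRepo | undefined {
  let best: PackageRepo | undefined;
  for (const row of rows) {
    if (
      best === undefined ||
      row.confidence > best.confidence ||
      (row.confidence === best.confidence && row.repoId > best.repoId)
    ) {
      best = row;
    }
  }
  return best;
}

const repoA: PackageRepo = { repoId: "repo-a", confidence: 0.9 };
const repoB: PackageRepo = { repoId: "repo-b", confidence: 0.9 };

// Same two rows, two different scan orders.
const unstable = [
  pickByConfidenceOnly([repoA, repoB])?.repoId,
  pickByConfidenceOnly([repoB, repoA])?.repoId,
];
const stable = [
  pickWithTieBreak([repoA, repoB])?.repoId,
  pickWithTieBreak([repoB, repoA])?.repoId,
];
```

With confidence as the only key, the result follows the scan order (`unstable` holds two different ids). With the tie-break, both scan orders return the same id.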
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| services/libs/data-access-layer/src/osspckgs/api.ts | Adds deterministic ordering for selecting a single repo per package in multiple Postgres queries. |
| services/libs/tinybird/pipes/ossPackages_enriched.pipe | Adds a final tie-break key to Tinybird argMax selection to make repo choice deterministic under ties. |
      JOIN repos r ON r.id = pr.repo_id
      WHERE pr.package_id = p.id
    - ORDER BY pr.confidence DESC
    + ORDER BY pr.confidence DESC, pr.repo_id DESC

      JOIN repos r ON r.id = pr.repo_id
      WHERE pr.package_id = p.id
    - ORDER BY pr.confidence DESC
    + ORDER BY pr.confidence DESC, pr.repo_id DESC

      FROM package_repos pr2
      WHERE pr2.package_id = p.id
    - ORDER BY pr2.confidence DESC, (pr2.source = 'declared') DESC
    + ORDER BY pr2.confidence DESC, (pr2.source = 'declared') DESC, pr2.repo_id DESC

      JOIN repos r ON r.id = pr.repo_id
      WHERE pr.package_id = p.id
    - ORDER BY pr.confidence DESC
    + ORDER BY pr.confidence DESC, pr.repo_id DESC
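
The three-key ordering in the `pr2` hunk can be read as a comparator. The following TypeScript sketch mirrors `confidence DESC, (source = 'declared') DESC, repo_id DESC` in memory. It is an illustration only: the `PackageRepoRow` shape, the numeric `repoId`, the `"inferred"` source value, and the function names are assumptions that do not appear in the PR.

```typescript
// Hypothetical row shape: column types are assumed, not taken from the schema.
interface PackageRepoRow {
  repoId: number; // assumed numeric for this sketch
  confidence: number;
  source: string; // "declared" is from the diff; other values are made up
}

// Mirrors: ORDER BY confidence DESC, (source = 'declared') DESC, repo_id DESC
function compareForDetail(x: PackageRepoRow, y: PackageRepoRow): number {
  // 1. Higher confidence sorts first.
  if (x.confidence !== y.confidence) return y.confidence - x.confidence;
  // 2. Among equal confidence, 'declared' rows sort first (true > false).
  const xDeclared = x.source === "declared" ? 1 : 0;
  const yDeclared = y.source === "declared" ? 1 : 0;
  if (xDeclared !== yDeclared) return yDeclared - xDeclared;
  // 3. Final tie-break: larger repoId sorts first.
  return y.repoId - x.repoId;
}

// Emulates LIMIT 1 after the ordering above.
function pickDetailRepo(rows: PackageRepoRow[]): PackageRepoRow | undefined {
  // Copy before sorting so the caller's array is not mutated.
  return rows.slice().sort(compareForDetail)[0];
}

const candidates: PackageRepoRow[] = [
  { repoId: 7, confidence: 0.8, source: "inferred" },
  { repoId: 3, confidence: 0.8, source: "declared" },
  { repoId: 5, confidence: 0.8, source: "declared" },
  { repoId: 9, confidence: 0.5, source: "declared" },
];

// Rows 3 and 5 tie on confidence and on being declared; repoId 5 wins.
const detailWinner = pickDetailRepo(candidates);
```

Because the last key is unique per candidate row for a given package (assuming one row per repo), the comparator never returns 0 for two distinct rows, and that is what makes the pick deterministic.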
Note
Low Risk
Narrow query-ordering change; may shift which repo is shown when ties exist, but does not alter auth, writes, or broader business logic.
Overview
When a package maps to multiple repos with the same confidence, which repo wins was undefined, so scorecard-driven health, filters, and detail views could flip between runs.
Postgres (`osspckgs/api.ts`): Every lateral that picks one `package_repos` row for scorecard (status counts, package list, scatter) now sorts by `confidence DESC` then `repo_id DESC`. Package detail keeps its existing `declared` preference and adds `repo_id DESC` after that.
Tinybird (`ossPackages_enriched.pipe`): The per-package `argMax(repoId, …)` tie-break tuple now includes `repoId` after confidence and `verifiedAt`, matching the same deterministic rule for enriched health scores.
Reviewed by Cursor Bugbot for commit ecab2d3.
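
For the Tinybird side, the tuple passed to `argMax` is compared element by element, so appending `repoId` only matters when confidence and `verifiedAt` are both equal. The TypeScript sketch below emulates that lexicographic comparison. It is not the pipe's SQL: the `RepoCandidate` shape, the epoch-seconds `verifiedAt`, the string `repoId`, and the helper names are assumptions.

```typescript
// Hypothetical candidate shape: types are assumed for illustration.
interface RepoCandidate {
  repoId: string;
  confidence: number;
  verifiedAt: number; // assumed epoch seconds
}

// Lexicographic comparison of (confidence, verifiedAt, repoId):
// the first differing element decides.
function isBetter(x: RepoCandidate, y: RepoCandidate): boolean {
  if (x.confidence !== y.confidence) return x.confidence > y.confidence;
  if (x.verifiedAt !== y.verifiedAt) return x.verifiedAt > y.verifiedAt;
  return x.repoId > y.repoId;
}

// Emulates argMax(repoId, (confidence, verifiedAt, repoId)) for one package.
function argMaxRepoId(rows: RepoCandidate[]): string | undefined {
  let best: RepoCandidate | undefined;
  for (const row of rows) {
    if (best === undefined || isBetter(row, best)) best = row;
  }
  return best?.repoId;
}

const r1: RepoCandidate = { repoId: "r1", confidence: 0.9, verifiedAt: 100 };
const r2: RepoCandidate = { repoId: "r2", confidence: 0.9, verifiedAt: 100 };
const r0: RepoCandidate = { repoId: "r0", confidence: 0.9, verifiedAt: 200 };

// r1 and r2 tie on the first two elements, so repoId decides: "r2".
const tiedWinner = argMaxRepoId([r1, r2]);
// r0 has a later verifiedAt, so it wins before repoId is consulted: "r0".
const laterVerifiedWinner = argMaxRepoId([r1, r2, r0]);
```

In this sketch `repoId` acts purely as a last resort. Any difference in confidence or `verifiedAt` still decides the winner first, which matches the summary's description of where `repoId` sits in the tuple.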