- Load `ragopt.yaml`.
- Load benchmark dataset.
- For each candidate:
  - run generation over each query
  - compute per-case metrics
  - aggregate metrics
  - apply hard constraints
- Rank candidates by weighted score.
- Persist JSON artifact + markdown report.
- Optionally post markdown to GitHub PR.
- UI loads run artifact JSON for interactive inspection.
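
The evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the actual code in `ragopt/engine.py`: the function name `evaluate_candidates`, the `score_case` callback, and the `min_thresholds` form of hard constraints are all assumptions made for the example, as is the choice to rank failing candidates last rather than drop them.

```python
import json
from statistics import mean
from typing import Callable, Dict, List

Metrics = Dict[str, float]


def evaluate_candidates(
    candidates: Dict[str, Callable[[str], str]],  # hypothetical: name -> generate(query)
    queries: List[str],
    score_case: Callable[[str, str], Metrics],    # hypothetical: (query, answer) -> metrics
    weights: Metrics,
    min_thresholds: Metrics,                      # assumed constraint form: metric >= threshold
) -> List[dict]:
    results = []
    for name, generate in candidates.items():
        # Run generation over each query and compute per-case metrics.
        per_case = [score_case(q, generate(q)) for q in queries]
        # Aggregate by averaging each metric across cases.
        aggregated = {m: mean(case[m] for case in per_case) for m in per_case[0]}
        # Apply hard constraints to the aggregated metrics.
        passed = all(aggregated[m] >= t for m, t in min_thresholds.items())
        # Weighted score used for ranking.
        score = sum(w * aggregated[m] for m, w in weights.items())
        results.append(
            {"name": name, "metrics": aggregated, "passed": passed, "score": score}
        )
    # Passing candidates first, then descending weighted score.
    results.sort(key=lambda r: (not r["passed"], -r["score"]))
    return results


def to_artifact(results: List[dict]) -> str:
    # Serialize the ranked results as a JSON string (the artifact layout is assumed).
    return json.dumps({"candidates": results}, indent=2)
```

Keeping the constraint check separate from the score means a candidate cannot buy its way past a hard limit with a high weighted score.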

- `ragopt/models.py`: schema and result types
- `ragopt/config.py`: config and dataset loading
- `ragopt/adapters.py`: generation provider abstraction
- `ragopt/metrics.py`: metric and scoring functions
- `ragopt/engine.py`: orchestration and comparison
- `ragopt/reporting.py`: markdown outputs
- `ragopt/github.py`: PR comment helper
- `ragopt/cli.py`: public command interface
- `ui/`: React + TypeScript dashboard for viewing run artifacts

- Add providers in `adapters.py`.
- Add new metrics in `metrics.py` and wire into engine.
- Add policy checks in `engine.py` compare/run paths.
- Connect `ui/` to a live backend API instead of file upload only.