6 changes: 3 additions & 3 deletions src/pages/LeaderboardPage.tsx
@@ -288,22 +288,22 @@ const LeaderboardPage: React.FC = () => {
<h3>Arena Score</h3>
<p>
A composite measure balancing accuracy and cost using a weighted harmonic mean.
- Higher scores indicate routers that achieve the best accuracycost trade-off.
+ Higher scores indicate routers that achieve the best accuracy-cost trade-off.
</p>
</div>

<div className="metric-card">
<h3>Cost Ratio Score</h3>
<p>
- Evaluates routing efficiency relative to an oracle.
+ Evaluates the cost of the router's choices relative to an oracle that always selects the cheapest correct model.

medium

For better terminological consistency, consider using 'optimal model' here. This term is introduced and defined in the 'Optimality Score' description below, and using it here would make the metric explanations more cohesive.

Suggested change
- Evaluates the cost of the router's choices relative to an oracle that always selects the cheapest correct model.
+ Evaluates the cost of the router's choices relative to an oracle that always selects the optimal model.

Routers with higher scores achieve comparable accuracy at lower inference cost.
</p>
</div>

<div className="metric-card">
<h3>Optimality Score</h3>
<p>
- Measures how often a router selects the cheapest correct model.
+ Measures how often a router selects the optimal model (i.e., the model answers the question correctly with the lowest cost).

medium

The definition of 'optimal model' is clear, but it could be more concise to improve readability. Using a shorter phrase would keep the parenthetical explanation brief and impactful.

Suggested change
- Measures how often a router selects the optimal model (i.e., the model answers the question correctly with the lowest cost).
+ Measures how often a router selects the optimal model (i.e., the cheapest correct model).

Higher values reflect closer alignment to cost-optimal routing behavior.
</p>
</div>
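
For reviewers checking the wording against the metric definitions, here is a minimal sketch of what "optimal model" implies when read as "cheapest correct model". Nothing below comes from the repository: the `ModelOutcome` and `QueryRecord` shapes and the three function names are hypothetical, and skipping queries that no model answers correctly is an assumption. The cost ratio is one plausible reading of "relative to an oracle", not necessarily the formula the leaderboard uses.

```typescript
// Hypothetical data shapes; the PR does not show the leaderboard's real schema.
interface ModelOutcome {
  model: string;
  cost: number;     // inference cost for this query, in any consistent unit
  correct: boolean; // whether this model answered the query correctly
}

interface QueryRecord {
  outcomes: ModelOutcome[]; // every candidate model evaluated on this query
  chosen: string;           // name of the model the router picked
}

// "Optimal model" as the card defines it: the cheapest correct model.
// Returns undefined when no candidate answers the query correctly.
function optimalModel(q: QueryRecord): ModelOutcome | undefined {
  let best: ModelOutcome | undefined = undefined;
  for (const o of q.outcomes) {
    if (o.correct && (best === undefined || o.cost < best.cost)) best = o;
  }
  return best;
}

// Fraction of queries on which the router's pick is correct and as cheap as
// the optimal model. Assumption: queries with no correct model are skipped.
function optimalityScore(queries: QueryRecord[]): number {
  let hits = 0;
  let counted = 0;
  for (const q of queries) {
    const best = optimalModel(q);
    if (best === undefined) continue;
    counted += 1;
    const picked = q.outcomes.find((o) => o.model === q.chosen);
    if (picked !== undefined && picked.correct && picked.cost <= best.cost) {
      hits += 1;
    }
  }
  return counted === 0 ? 0 : hits / counted;
}

// Assumed reading of the cost comparison: total oracle cost divided by total
// router cost, so a cheaper router gets a higher value.
function oracleCostRatio(queries: QueryRecord[]): number {
  let oracleCost = 0;
  let routerCost = 0;
  for (const q of queries) {
    const best = optimalModel(q);
    const picked = q.outcomes.find((o) => o.model === q.chosen);
    if (best === undefined || picked === undefined) continue;
    oracleCost += best.cost;
    routerCost += picked.cost;
  }
  return routerCost === 0 ? 0 : oracleCost / routerCost;
}

// Two queries with identical candidates; the router picks the cheap model
// once and the expensive model once.
const candidates: ModelOutcome[] = [
  { model: "small", cost: 1, correct: true },
  { model: "large", cost: 10, correct: true },
];
const sample: QueryRecord[] = [
  { chosen: "small", outcomes: candidates },
  { chosen: "large", outcomes: candidates },
];

console.log(optimalityScore(sample)); // 0.5
console.log(oracleCostRatio(sample)); // 2 / 11
```

Under this reading, "the optimal model" and "the cheapest correct model" name the same thing, so the two suggested changes are consistent with each other as long as the definition appears in at least one of the cards.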