Upgrade to Opus on distributional / reform questions in _select_chat_model #83

@vahid-ahmadi

Symptom (live test 2026-05-28)

The decile-impact reform question was routed to Haiku 4.5 (visible in the stream: "model": "claude-haiku-4-5"). Haiku is fast per turn but semantically weaker on API derivation: it needed many guesses at the reform shape and never converged within the iteration budget. Opus would likely have converged in 2–4 iterations.

Diagnosis

_select_chat_model (backend/routes/chatbot.py:153) routes by message length / token count. That heuristic doesn't capture cognitive difficulty — distributional simulation questions are short to ask but reasoning-heavy.

Fix

Add a keyword/intent check to upgrade to Opus when the latest user message contains any of:

  • decile, distributional, winners, losers
  • reform, increase the X, change the X by
  • 1pp, 1 percentage point, 1%, points
  • marginal rate, effective rate (paired with by/from)
  • poverty, inequality, gini

Tune the list against real usage rather than guessing — start small, watch which queries the agent fumbles, expand.
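
A minimal sketch of what such a keyword check could look like, assuming a plain regex match on the latest user message is acceptable. The names `_OPUS_PATTERNS` and `wants_opus` are hypothetical and do not exist in `backend/routes/chatbot.py`. The patterns transcribe the bullets above, with two assumptions: "1pp" / "1%" are generalised to any number, and "paired with by/from" is read as "by" or "from" appearing later in the same line.

```python
import re

# Hypothetical starting list, one pattern per trigger from the bullets above.
# Meant to be tuned against real usage, not treated as final.
_OPUS_PATTERNS = [
    r"\b(?:deciles?|distributional|winners|losers)\b",
    r"\breforms?\b",
    r"\bincrease the\b",
    r"\bchange the\b.+\bby\b",
    r"\b\d+(?:\.\d+)?\s*pp\b",                   # "1pp" (assumed: any number)
    r"\b\d+(?:\.\d+)?\s*percentage points?\b",   # "1 percentage point"
    r"\d+(?:\.\d+)?\s*%",                        # "1%"
    r"\bpoints\b",
    # Assumption: "paired with by/from" means by/from follows the phrase.
    r"\b(?:marginal|effective) rate\b.*\b(?:by|from)\b",
    r"\b(?:poverty|inequality|gini)\b",
]

# Compiled once at import time so the per-message check stays cheap
# (no extra LLM call, just one regex search).
_OPUS_RE = re.compile(
    "|".join(f"(?:{p})" for p in _OPUS_PATTERNS),
    re.IGNORECASE,
)


def wants_opus(message: str) -> bool:
    """Return True if the message contains any of the upgrade triggers."""
    return _OPUS_RE.search(message) is not None
```

For example, `wants_opus("What is the decile impact of raising the basic rate by 1pp?")` matches on "decile", while `wants_opus("What is the effective rate?")` does not, because "effective rate" is not followed by "by" or "from".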

Constraints

  • Don't change pricing visibility (the cost UI already shows per-message £).
  • Cache hits are still preferred where possible; keep the routing decision cheap (no extra LLM call for routing).
  • The upgrade should compose with Plan mode and Charts mode (no override conflict).

Open questions

  • Should charts_mode=True also bias toward Opus? Charts often imply distributional analysis. Probably yes.
  • Should Plan mode always upgrade? Plan turns generate clarifying questions, low cognitive load — probably stay on Haiku.
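
One possible precedence, sketched under the tentative answers above (Charts mode biases toward Opus, Plan mode does not upgrade). Everything here is hypothetical: `choose_model` and its parameters are illustrative and not the real `_select_chat_model` signature, `length_based_choice` stands in for whatever the existing length / token heuristic returns, and `OPUS` is a placeholder because the Opus model id is not stated in this issue.

```python
HAIKU = "claude-haiku-4-5"  # model id seen in the stream in this issue
OPUS = "OPUS_MODEL_ID"      # placeholder: real Opus id is not given here


def choose_model(
    *,
    length_based_choice: str,
    keyword_hit: bool,
    plan_mode: bool = False,
    charts_mode: bool = False,
) -> str:
    """Layer the upgrade on top of the existing length-based choice."""
    # Plan turns only generate clarifying questions, so skip the upgrade
    # and keep whatever the existing heuristic picked.
    if plan_mode:
        return length_based_choice
    # Reform / distributional keywords, or Charts mode, bias toward Opus.
    if keyword_hit or charts_mode:
        return OPUS
    # Otherwise the existing heuristic is left untouched.
    return length_based_choice
```

Checking Plan mode first gives a single fixed order of precedence, which is one way to avoid an override conflict between the modes.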
