Background
The colleague-eval merge moved the colleague model and reasoning effort into the scenario config (chat.model / chat.reasoningEffort in lib/scenarios.json), which is read by both app/api/chat/route.ts and the validation pipeline. A few model IDs remain hardcoded:
- app/api/writing-support/route.ts:145: gpt-5.2 (the AI writing suggestions, not the colleague)
- scripts/scenario_design/simulate.ts: participant simulator uses gpt-4o
- scripts/scenario_design/judge.ts: LLM judge uses gpt-4o
Ask
Decide whether these should also be config-driven (and where the config should live), then parameterize them so model choices are centralized rather than scattered as literals.
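One possible shape for the centralized config, as a minimal sketch rather than a proposal for the final design. Everything named here (ModelRole, ModelConfig, DEFAULT_MODELS, resolveModel) is hypothetical and does not exist in the repo; only the model IDs are taken from the literals listed above. Where the defaults live (lib/scenarios.json or a separate file) is the open question this issue asks about, so the sketch keeps them inline.

```typescript
// Hypothetical central registry of model choices. None of these names exist
// in the repo; they only illustrate replacing scattered literals.
type ModelRole = "writingSupport" | "participantSimulator" | "judge";

interface ModelConfig {
  model: string;
  // Optional, mirroring chat.reasoningEffort in the scenario config.
  // The allowed values here are an assumption.
  reasoningEffort?: "low" | "medium" | "high";
}

// Defaults match the currently hardcoded literals.
const DEFAULT_MODELS: Record<ModelRole, ModelConfig> = {
  writingSupport: { model: "gpt-5.2" },
  participantSimulator: { model: "gpt-4o" },
  judge: { model: "gpt-4o" },
};

type ModelOverrides = Partial<Record<ModelRole, Partial<ModelConfig>>>;

// Merge per-role overrides (e.g. parsed from a JSON config) over the defaults.
function resolveModel(role: ModelRole, overrides: ModelOverrides = {}): ModelConfig {
  return { ...DEFAULT_MODELS[role], ...(overrides[role] ?? {}) };
}
```

Under this sketch, each call site would ask for its role, for example resolveModel("judge"), instead of embedding a model ID, and an override object loaded from config could change a single role without touching code.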
Notes
- Deferred from the colleague-eval merge (branch merge-colleague-evals) per review; lower priority than the colleague model.