Add canonical reform recipes to cached library reference #88
Open
vahid-ahmadi wants to merge 1 commit into
In a recent live session a user asked for the distributional effect of
raising the income tax basic rate by 1pp. The agent ran `run_python` 15+
times trying to construct the reform, every guess producing wrong results
(revenue dropping when the rate went up) because
`Parameters(income_tax={"uk_brackets": [...]})` REPLACES the full
schedule — passing a single-bracket list zeroed out higher and additional
rates. The run never converged (262s of wasted compute).
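The failure mode can be reproduced without the library. The sketch below is a toy marginal-rate calculator, not PolicyEngine code: `Bracket`, `tax_due`, and the threshold figures are hypothetical stand-ins chosen for illustration. It only shows why a one-element replacement schedule collects less tax than the full schedule carrying the same +1pp change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bracket:
    threshold: float  # taxable income at which this rate starts to apply
    rate: float       # marginal rate charged above the threshold


def tax_due(taxable_income: float, brackets: list) -> float:
    """Marginal-rate tax over a schedule sorted by ascending threshold."""
    total = 0.0
    for i, bracket in enumerate(brackets):
        # The band ends where the next bracket starts (or never, for the last one).
        upper = brackets[i + 1].threshold if i + 1 < len(brackets) else float("inf")
        band = min(taxable_income, upper) - bracket.threshold
        if band > 0:
            total += band * bracket.rate
    return total


# Illustrative baseline schedule (assumed figures, not read from the library).
BASELINE = [Bracket(0, 0.20), Bracket(37_700, 0.40), Bracket(125_140, 0.45)]

# Wrong: a single-bracket list replaces the schedule, so the 40% and 45% bands vanish.
BASIC_ONLY = [Bracket(0, 0.21)]

# Right: restate every bracket and change only the basic rate.
FULL_REFORM = [Bracket(0, 0.21), Bracket(37_700, 0.40), Bracket(125_140, 0.45)]

income = 100_000
baseline_tax = tax_due(income, BASELINE)
basic_only_tax = tax_due(income, BASIC_ONLY)
full_reform_tax = tax_due(income, FULL_REFORM)
```

With these assumed figures, `basic_only_tax` comes out below `baseline_tax` even though the rate went up, while `full_reform_tax` comes out above it. That matches the "revenue dropping when the rate went up" symptom.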
Add a "Reform recipes" section to the cached library reference produced
by scripts/build_reference.py. Each recipe is exact, copy-pasteable
Python the agent can adapt verbatim:
1. Basic rate +1pp via the full `uk_brackets` list (the failure case) — VERIFIED
2. Personal allowance change — VERIFIED
3. NI primary threshold change — unverified, marked for follow-up
4. Child benefit uprating — unverified, marked for follow-up
5. Marriage allowance toggle — unverified, marked for follow-up
Recipes 1–2 are verified against backend/tests/test_agent_tools.py
(`uk_brackets` and `personal_allowance` are exercised there). Recipes
3–5 use the conventional PolicyEngine UK naming from
`_build_compiled_policy` in backend/agent_tools.py and reference the
Parameters JSON schema dump (which lives in the same reference file) as
the authoritative source if a field name is rejected.
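The schema fallback could also be checked mechanically. A minimal sketch, assuming the schema dump is a Pydantic-style JSON schema dict with nested models under `$defs` and referenced by direct `$ref` pointers; `resolve` and `unknown_fields` are hypothetical helpers, and `SCHEMA` is a tiny stand-in rather than the real Parameters schema.

```python
from typing import Optional


def resolve(node: dict, root: dict) -> dict:
    """Follow a local '#/$defs/Name' reference if the node carries one."""
    ref = node.get("$ref")
    if ref is None:
        return node
    target = root
    for part in ref.removeprefix("#/").split("/"):
        target = target[part]
    return target


def unknown_fields(reform: dict, schema: dict,
                   root: Optional[dict] = None, path: str = "") -> list:
    """Return dotted names of reform keys that the schema does not declare."""
    root = schema if root is None else root
    properties = resolve(schema, root).get("properties")
    if properties is None:
        # Free-form object: nothing to check at this level.
        return []
    missing = []
    for key, value in reform.items():
        dotted = f"{path}.{key}" if path else key
        if key not in properties:
            missing.append(dotted)
        elif isinstance(value, dict):
            missing.extend(unknown_fields(value, properties[key], root, dotted))
    return missing


# Stand-in schema for illustration only (assumed shape).
SCHEMA = {
    "properties": {"income_tax": {"$ref": "#/$defs/IncomeTax"}},
    "$defs": {
        "IncomeTax": {"properties": {"uk_brackets": {"type": "array"}}},
    },
}

ok = unknown_fields({"income_tax": {"uk_brackets": []}}, SCHEMA)
bad = unknown_fields({"income_tax": {"uk_bracket": []}, "not_a_field": {}}, SCHEMA)
```

The sketch does not handle `anyOf` wrappers (for example optional nested models), so a real check would need to extend `resolve` accordingly.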
The section is inserted between Public API and Parameters JSON schema in
`render()` so it lands inside the existing cached `REFERENCE_DOC` block
in `_build_system_blocks` — one-time cache invalidation on this merge,
then cached again on every subsequent request.
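The placement can be pictured as follows. This is a sketch, not the code in the PR: the section constants hold placeholder text, and `render_sketch` and `build_system_blocks_sketch` are hypothetical stand-ins for `render()` and `_build_system_blocks`, whose real bodies and signatures are not shown here. Only the section order and the `cache_control` marker are taken from the description.

```python
PUBLIC_API = "## Public API\n(placeholder)\n"
REFORM_RECIPES = "## Reform recipes\n(placeholder)\n"
PARAMETERS_SCHEMA = "## Parameters JSON schema\n(placeholder)\n"


def render_sketch() -> str:
    """Join the reference sections, with the recipes between API and schema."""
    return "\n".join([PUBLIC_API, REFORM_RECIPES, PARAMETERS_SCHEMA])


def build_system_blocks_sketch(instructions: str, reference_doc: str) -> list:
    """Return system blocks; only the reference block carries the cache marker."""
    return [
        {"type": "text", "text": instructions},
        {
            "type": "text",
            "text": reference_doc,
            "cache_control": {"type": "ephemeral"},
        },
    ]


REFERENCE_DOC = render_sketch()
blocks = build_system_blocks_sketch("(system instructions placeholder)", REFERENCE_DOC)
```

Because the recipes become part of the text of the cached block, the block's content changes once when this merges, which is consistent with the one-time invalidation noted above.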
Closes #85
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
A recent live test failed dramatically: a user asked for the distributional effect of raising the income tax basic rate by 1pp, and the agent ran `run_python` 15+ times trying to construct the reform, with every guess producing wrong results (revenue dropping when the rate went up). Root cause: `Parameters(income_tax={"uk_brackets": [...]})` REPLACES the full schedule, so passing only the basic-rate band silently zeroed out the higher and additional rates. The run never converged (262s wasted).
This PR adds a "Reform recipes" section to the cached library reference produced by `scripts/build_reference.py`, so the agent sees five copy-pasteable reform constructions inside the prompt-cached reference document.
Recipes
1. Basic rate +1pp via the full `uk_brackets` list (the actual failure case): VERIFIED against `backend/tests/test_agent_tools.py::test_uk_brackets_reform`.
2. Personal allowance change: VERIFIED against `backend/tests/test_agent_tools.py::test_valid_reform`.
3. NI primary threshold change: unverified, marked for follow-up.
4. Child benefit uprating: `amount_for_first_child` / `amount_for_additional_child`, recipe carries an inline verify note.
5. Marriage allowance toggle: `income_tax`, recipe carries an inline verify note.
The Parameters JSON schema (already dumped in this reference) is the authoritative fallback when a field name is rejected; the recipes explicitly point the agent at it.
Where it lands
- `backend/scripts/build_reference.py`: new `REFORM_RECIPES` constant inserted into `render()` between the Public API and Parameters JSON schema sections.
- `backend/reference.md` (generated at deploy time): picked up by `_build_system_blocks` in `backend/routes/chatbot.py` inside the existing `cache_control={"type":"ephemeral"}` block. One-time cache invalidation on merge, then cached again on subsequent requests.
Constraints respected
- No change to the `_build_system_blocks` signature or logic.
Test plan
- `python -c "import ast; ast.parse(open('backend/scripts/build_reference.py').read())"` (already passes locally).
- Run `scripts/build_reference.py` in the deploy image and confirm `reference.md` contains the "Reform recipes" section.
- Check the recipe field names against `pe.Parameters.model_json_schema()` and, if any field names are wrong, follow up with a fix-up PR.
Closes #85
🤖 Generated with Claude Code