Child slot labels should be youngest-first, not oldest-first #50

@MaxGhenis

Problem

scenarios.py:1160-1163 (US) and the analogous UK code at scenarios.py:1337-1357 sort household members by descending age before assigning child1, child2, ... labels:

household.sort_values(
    by=["is_tax_unit_head", "is_tax_unit_spouse", "age", "person_id"],
    ascending=[False, False, False, True],  # age=False → DESCENDING
)

So child1 is the oldest child in the household, child2 the next oldest, and so on. This inverts the natural reading for benefit programs that target young children: WIC (age < 5), Head Start (3–4), Early Head Start (under 3), Medicaid/CHIP eligibility (varies by age and income), and school-meal eligibility.
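To make the current labeling concrete, here is a minimal pandas-free sketch that mirrors the same key order with Python's `sorted`. The child ages are those of scenario 053; the head's age, the `person_id` values, and the rule "a child is anyone who is neither head nor spouse" are assumptions for illustration, not taken from scenarios.py.

```python
# Hypothetical household: child ages follow scenario 053, everything else is made up.
household = [
    {"person_id": 2, "age": 0, "is_tax_unit_head": False, "is_tax_unit_spouse": False},
    {"person_id": 1, "age": 34, "is_tax_unit_head": True, "is_tax_unit_spouse": False},
    {"person_id": 3, "age": 7, "is_tax_unit_head": False, "is_tax_unit_spouse": False},
    {"person_id": 4, "age": 5, "is_tax_unit_head": False, "is_tax_unit_spouse": False},
]


def oldest_first_key(person):
    # Mirrors ascending=[False, False, False, True]:
    # head first, then spouse, then age descending, then person_id ascending.
    return (
        not person["is_tax_unit_head"],
        not person["is_tax_unit_spouse"],
        -person["age"],
        person["person_id"],
    )


ordered = sorted(household, key=oldest_first_key)

# Assumption: children are the rows that are neither head nor spouse.
children = [
    p for p in ordered
    if not p["is_tax_unit_head"] and not p["is_tax_unit_spouse"]
]
labels = {f"child{i}": p["age"] for i, p in enumerate(children, start=1)}
print(labels)
```

Under these assumptions the infant, the only WIC-age child in the household, ends up in the last slot.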

Concrete evidence

Across the 10 US multi-child benchmark scenarios:

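| Scenario | child1 age | child2 age | child3 age | ... |
|---|---|---|---|---|
| 008 | 4 | 0 | | |
| 030 | 16 | 14 | 11 | |
| 053 | 7 | 5 | 0 | |
| 055 | 17 | 14 | 12 | 10, 5, 1 |
| 061 | 7 | 3 | | |
| 089 | 7 | 3 | | |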

WIC eligibility (age < 5) by child slot in the sample:

  • child1_wic_eligible: 1 case (only scenario 008)
  • child2_wic_eligible: 3 cases (008, 061, 089)
  • child3_wic_eligible: 1 case (053)
  • child6_wic_eligible: 1 case (055)

The household-impact weight share follows the same shape — child2_wic_eligible weight (0.0017) is ~2× child1_wic_eligible (0.0009). That's the inverse of what users will expect when reading "Child 1 WIC eligibility" in the prompt or the program breakdown table.
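The per-slot WIC counts listed above can be reproduced from the child ages in the table. The sketch below is illustrative only: it covers just the six scenarios shown, uses the age < 5 threshold stated in this issue, and the function and variable names are hypothetical.

```python
from collections import Counter

# Child ages per scenario, in the current oldest-first slot order (from the table).
scenario_child_ages = {
    "008": [4, 0],
    "030": [16, 14, 11],
    "053": [7, 5, 0],
    "055": [17, 14, 12, 10, 5, 1],
    "061": [7, 3],
    "089": [7, 3],
}

WIC_AGE_LIMIT = 5  # eligibility threshold used in this issue: age < 5


def wic_cases_by_slot(ages_by_scenario):
    """Count scenarios in which each childN slot holds a WIC-age child."""
    counts = Counter()
    for ages in ages_by_scenario.values():
        for slot, age in enumerate(ages, start=1):
            if age < WIC_AGE_LIMIT:
                counts[f"child{slot}_wic_eligible"] += 1
    return dict(counts)


cases = wic_cases_by_slot(scenario_child_ages)
print(cases)
```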

Why this matters

  • Prompt readability: a reader expects "Child 1" to be the household's most prominent young-child case for these programs, not the oldest sibling who's usually outside the eligibility age band.
  • Person-level slot weights: programs that key on young children (WIC, EHS) get most of their weight on child2/child3 rather than child1, distorting the program breakdown table.
  • Failure-mode attribution: when a model misses WIC eligibility on the youngest child, the failure currently surfaces under child2_wic_eligible etc., not under the "primary" label.

Proposed fix

Switch the age sort to ascending, so the youngest child comes first. Head and spouse rows still sort ahead of everyone else; the children then follow from youngest to oldest, so child1 is the youngest. Apply the same change in the UK path.

- ascending=[False, False, False, True],
+ ascending=[False, False, True, True],
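A rough before/after sketch of what this one-flag change does to the labels, using the child ages from scenario 055. It is a stdlib stand-in for the pandas sort, restricted to the child rows; the `person_id` values and the helper name `child_labels` are made up.

```python
def child_labels(children, youngest_first):
    """children: list of (person_id, age) tuples. Returns {label: age}."""
    sign = 1 if youngest_first else -1
    # Age in the chosen direction, then person_id ascending as the tie-break,
    # matching the last two sort keys in the diff above.
    ordered = sorted(children, key=lambda c: (sign * c[1], c[0]))
    return {f"child{i}": age for i, (_, age) in enumerate(ordered, start=1)}


# Scenario 055 child ages; person_id values are hypothetical.
children_055 = [(3, 17), (4, 14), (5, 12), (6, 10), (7, 5), (8, 1)]

before = child_labels(children_055, youngest_first=False)
after = child_labels(children_055, youngest_first=True)
print(before)
print(after)
```

With youngest-first ordering, the one-year-old moves from child6 to child1.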

This is a content change that reassigns many prompt-visible labels (child1_* will describe the child that was previously in the last childN_* slot), so it will:

  • Invalidate the frozen paper snapshot's reproducibility against re-generated scenarios
  • Shift household-impact weights between childN slots
  • Require updating model outputs that referenced specific child slots by index (audit notes, case annotations)

If the frozen paper snapshot needs to stay byte-identical, this should land alongside a new snapshot date and a paper note that the convention changed.
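For updating audit notes and case annotations that cite a slot by index, a per-household old-to-new label map may help. The sketch below assumes the `person_id` tie-break stays ascending, as in the proposed diff; the helper name `slot_remap` and the `person_id` values are hypothetical. The map is not always a plain reversal: siblings of the same age keep their relative order.

```python
def slot_remap(children):
    """children: list of (person_id, age) tuples.

    Returns {old_label: new_label} for the switch from oldest-first
    to youngest-first slot assignment.
    """
    old_order = sorted(children, key=lambda c: (-c[1], c[0]))  # current behavior
    new_order = sorted(children, key=lambda c: (c[1], c[0]))   # proposed behavior
    new_slot = {pid: i for i, (pid, _) in enumerate(new_order, start=1)}
    return {
        f"child{i}": f"child{new_slot[pid]}"
        for i, (pid, _) in enumerate(old_order, start=1)
    }


# Scenario 053 ages (7, 5, 0) with made-up person_ids: a plain reversal.
remap_053 = slot_remap([(10, 7), (11, 5), (12, 0)])

# Hypothetical household with same-age twins: not a plain reversal.
remap_twins = slot_remap([(20, 3), (21, 3), (22, 8)])

print(remap_053)
print(remap_twins)
```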

Out of scope
