Document that Census TAX_ID is replaced by our tax-unit construction#1154

Closed
MaxGhenis wants to merge 2 commits into PolicyEngine:main from MaxGhenis:claude/document-census-tax-id-replacement

@MaxGhenis
Contributor

What

Add an explanatory comment in CensusCPS._create_tax_unit_table documenting that the raw Census ASEC TAX_ID is intentionally replaced by our own construct_tax_units() assignment, and that the original Census value is retained as CENSUS_TAX_ID.

Why

Reading the method cold, the line person["TAX_ID"] = constructed_person["TAX_ID"].values looks like it might be silently clobbering the Census filing-unit grouping. It isn't; the overwrite is deliberate:

  • We construct tax units ourselves (default mode "policyengine", applying PE filing/dependency rules) rather than trusting the documented Census TAX_ID.
  • The original is preserved as CENSUS_TAX_ID because it's the ground-truth baseline our construction is validated against (validation/cps_tax_unit_validation.py) and a required raw-schema column (_validate_raw_cps_schema in cps.py).
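The overwrite-and-preserve pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in CensusCPS._create_tax_unit_table: it uses plain dicts of lists in place of DataFrames so it runs without dependencies, and construct_tax_units_stub, replace_tax_id, and the sample IDs are all hypothetical stand-ins.

```python
# Hypothetical person table: column name -> list of values (one entry per person).
person = {
    "PERSON_ID": [1, 2, 3, 4],
    "TAX_ID": [10, 10, 10, 20],  # raw Census filing-unit grouping
}


def construct_tax_units_stub(table):
    """Toy placeholder for construct_tax_units(); the real rules are not shown here."""
    # Assumed result for illustration: person 3 is split into a separate unit.
    return {"PERSON_ID": list(table["PERSON_ID"]), "TAX_ID": [1, 1, 2, 3]}


def replace_tax_id(table, constructed):
    """Keep the raw Census value as CENSUS_TAX_ID, then overwrite TAX_ID."""
    out = {name: list(values) for name, values in table.items()}
    # Preserve the original first, so the overwrite below cannot lose it.
    out["CENSUS_TAX_ID"] = list(table["TAX_ID"])
    # Deliberate replacement: downstream code sees the constructed tax units.
    out["TAX_ID"] = list(constructed["TAX_ID"])
    return out


result = replace_tax_id(person, construct_tax_units_stub(person))
```

The ordering is the point of the sketch: the copy into CENSUS_TAX_ID happens before TAX_ID is reassigned, so both groupings stay available side by side.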

So CENSUS_TAX_ID should neither be dropped (it would break validation + the schema guard) nor renamed (the UPPER_SNAKE matches the raw-Census-column convention). The only thing missing was a comment making the intent legible at the call site.
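To make the "would break validation + the schema guard" point concrete, here is a rough sketch of the two kinds of checks that depend on the column. It is an assumption-laden illustration, not the contents of _validate_raw_cps_schema or validation/cps_tax_unit_validation.py: REQUIRED_RAW_COLUMNS, validate_raw_schema, and same_grouping are hypothetical names.

```python
# Hypothetical required-column set; the real schema guard may list other columns.
REQUIRED_RAW_COLUMNS = {"TAX_ID", "CENSUS_TAX_ID"}


def validate_raw_schema(columns):
    """Raise if any required raw column is absent (sketch of a schema guard)."""
    missing = sorted(REQUIRED_RAW_COLUMNS - set(columns))
    if missing:
        raise KeyError(f"Missing required raw CPS columns: {missing}")


def same_grouping(ids_a, ids_b):
    """Return True if two ID lists group people identically, ignoring the labels."""
    a_to_b = {}
    b_to_a = {}
    for a, b in zip(ids_a, ids_b):
        # Each ID on one side must map to exactly one ID on the other side.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True
```

Dropping or renaming CENSUS_TAX_ID would make a guard like validate_raw_schema raise, and a comparison like same_grouping would have no Census baseline to compare against.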

Comment-only change; no behavior change.

Tests

  • ruff check + ruff format --check pass on the changed file.
  • No logic touched.

🤖 Generated with Claude Code

MaxGhenis and others added 2 commits May 30, 2026 06:48
_create_tax_unit_table overwrites the raw Census ASEC TAX_ID with our own
construct_tax_units() assignment, retaining the original as CENSUS_TAX_ID. Add a
comment explaining the why so the intent (we build tax units ourselves; the
Census value is kept only as the validation baseline + required raw-schema
column) is legible at the call site.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@MaxGhenis
Contributor Author

Refiling from a canonical branch (not a fork) so CI's check-fork gate runs. Replaced by a same-repo PR.
