
Adopt microunit for tax-unit reconstruction (scoped, behavior-preserving; part of #113) #114

Merged: MaxGhenis merged 3 commits into main from wire-microunit on May 31, 2026

@MaxGhenis
Contributor

What

Adds microunit as a dependency and introduces a delegation seam in the PolicyEngine tax-unit reconstruction path. USPipeline._build_policyengine_tax_units_via_microunit calls microunit.construct_tax_units when the person frame carries microunit's raw CPS input columns (PH_SEQ, A_LINENO, A_MARITL, A_SPOUSE, PEPAR1, PEPAR2, A_EXPRRP), and returns None otherwise.
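The column guard described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the names `MICROUNIT_REQUIRED_COLUMNS`, `has_microunit_columns`, and `build_via_microunit` are hypothetical, and the `construct` callback stands in for the call to `microunit.construct_tax_units`, whose signature is not shown here. Only the seven column names come from the description.

```python
from typing import Callable, Iterable, Optional

# Raw CPS input columns named in the PR description.
MICROUNIT_REQUIRED_COLUMNS = frozenset(
    {"PH_SEQ", "A_LINENO", "A_MARITL", "A_SPOUSE", "PEPAR1", "PEPAR2", "A_EXPRRP"}
)


def has_microunit_columns(columns: Iterable[str]) -> bool:
    """Return True only when every required raw CPS column is present."""
    return MICROUNIT_REQUIRED_COLUMNS.issubset(columns)


def build_via_microunit(
    columns: Iterable[str],
    construct: Callable[[], dict],
) -> Optional[dict]:
    """Hypothetical stand-in for the delegation method.

    Calls `construct` when the guard passes and returns None otherwise,
    so the caller can fall through to another path.
    """
    if not has_microunit_columns(columns):
        return None
    return construct()
```

A frame that has lost any one of the seven columns fails the guard, which is why a reconstruction-stage frame without the spouse/parent pointers would never reach the constructor.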

Part of #113. The us-data side of the same adoption is PolicyEngine/policyengine-us-data#1157.

Scope — what this does and does NOT do

A scoped, behavior-preserving first step, not the full ~1,500-line replacement #113 describes.

  • ✅ Adds the dependency (pinned pre-PyPI: microunit @ git+…@d3eccbb) + lockfile.
  • ✅ Adds the delegation method and wires it into _build_policyengine_tax_units_from_role_flags (tried first; falls through to the legacy reconstruction on None).
  • ✅ Preserves the authoritative-ID path (#112, "Restore eCPS export and entity ID parity") untouched; it is never routed through microunit.
  • ❌ Does not delete the legacy role-flag reconstruction; it stays as the fallback.
  • ❌ Does not change any output on today's data. microplex's reconstruction-stage frame collapses relationship_to_head into a 0/1/2/3 coding and drops the spouse/parent pointer columns, so the column guard is never satisfied in production and the delegation always returns None today. It is inert until a follow-up threads CPS columns through to entity construction.
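The "tried first; falls through to the legacy reconstruction on None" wiring from the list above amounts to a small dispatch pattern. The sketch below assumes nothing about the real method bodies: `try_then_fallback`, `inert_delegate`, and `legacy_reconstruction` are hypothetical names used only for illustration.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def try_then_fallback(
    delegate: Callable[[], Optional[T]],
    legacy: Callable[[], T],
) -> T:
    """Run `delegate` first; run `legacy` only if `delegate` returns None."""
    result = delegate()
    if result is None:
        return legacy()
    return result


# Hypothetical stand-ins for the two reconstruction paths.
def inert_delegate() -> Optional[str]:
    return None  # mirrors today's production case: column guard not satisfied


def legacy_reconstruction() -> str:
    return "legacy tax units"


units = try_then_fallback(inert_delegate, legacy_reconstruction)
```

Because the delegate returns None on today's frames, `units` always comes from the legacy path, which is the sense in which the change is behavior-preserving.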

⚠️ Entity-convergence caveat (do not skip — see #113)

microunit is eCPS's tax-unit construction engine. Activating this delegation (in a later PR, once CPS columns are threaded through) makes microplex's tax units converge toward eCPS's. Any loss movement from that must be read as an entity-convergence effect — not microplex improving — and isolated with a matched-N, symmetric-refit, holdout before/after comparison on the same target surface. This PR deliberately keeps the delegation inert so it can land without touching the live benchmark.

Tests

  • tests/pipelines/test_us_microunit_delegation.py: 4 passing (exercises the real delegation path with a CPS-shaped fixture; microunit installed).
  • ruff check clean on changed files.

Reviewer notes

  • The crux is the input-schema mapping in _build_policyengine_tax_units_via_microunit: microunit keys on (PH_SEQ, A_LINENO); per-person TAX_ID/role and per-unit filing_status are mapped back onto microplex's person_id/household_id with start_tax_unit_id offsetting. Please scrutinize that mapping and the offsetting.
  • Draft on purpose. Per #113 ("Adopt microunit for tax-unit construction (guard against regression-to-eCPS)"), MP adoption is sequenced after the us-data adoption (#1157) lands and is verified in production. Do not merge ahead of that.
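For reviewers, the ID mapping and offsetting in the first note can be pictured with a toy example. This is a sketch under stated assumptions, not the PR's implementation: it joins on the `(PH_SEQ, A_LINENO)` key, assigns dense offsets in first-seen order, and uses a hypothetical `remap_tax_unit_ids` helper and made-up TAX_ID values. The real method may map positionally and handle roles and filing status as well.

```python
from typing import Dict, List, Tuple

Key = Tuple[int, int]  # (PH_SEQ, A_LINENO)


def remap_tax_unit_ids(
    persons: List[Dict[str, int]],
    assignments: Dict[Key, int],
    start_tax_unit_id: int,
) -> Dict[int, int]:
    """Map each person_id to an offset tax-unit ID.

    `assignments` holds a constructor-assigned TAX_ID per (PH_SEQ, A_LINENO).
    Each distinct TAX_ID gets a dense index in first-seen order, and the
    result is start_tax_unit_id plus that index.
    """
    dense: Dict[int, int] = {}
    result: Dict[int, int] = {}
    for person in persons:
        key = (person["PH_SEQ"], person["A_LINENO"])
        tax_id = assignments[key]
        offset = dense.setdefault(tax_id, len(dense))
        result[person["person_id"]] = start_tax_unit_id + offset
    return result


# Illustrative data only: two households, three tax units.
persons = [
    {"person_id": 10, "PH_SEQ": 1, "A_LINENO": 1},
    {"person_id": 11, "PH_SEQ": 1, "A_LINENO": 2},
    {"person_id": 12, "PH_SEQ": 1, "A_LINENO": 3},
    {"person_id": 20, "PH_SEQ": 2, "A_LINENO": 1},
]
assignments = {(1, 1): 7, (1, 2): 7, (1, 3): 9, (2, 1): 4}
mapped = remap_tax_unit_ids(persons, assignments, start_tax_unit_id=100)
```

The property worth checking in review is that offset IDs stay unique across batches: two people share an output ID only if they shared a TAX_ID, and no output ID falls below `start_tax_unit_id`.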

🤖 Generated with Claude Code

Add microunit as a dependency and route the reconstruction-from-scratch
tax-unit path through microunit.construct_tax_units when the person frame
carries microunit's raw CPS input columns (PH_SEQ, A_LINENO, A_MARITL,
A_SPOUSE, PEPAR1, PEPAR2, A_EXPRRP). When those columns are absent -- the
current production case, since microplex's reconstruction frame collapses
relationship_to_head and drops the spouse/parent pointers -- the new
USPipeline._build_policyengine_tax_units_via_microunit returns None and the
legacy role-flag reconstruction runs unchanged. The authoritative-ID path
(#112) is never routed here.

Net effect is behavior-preserving on today's data: the delegation stays
inert until an upstream change threads CPS columns through to entity
construction. microunit IS eCPS's tax-unit engine, so activating the
delegation converges microplex's tax units toward eCPS's; any resulting
loss movement is an entity-convergence effect and must be interpreted as
such, not as a quality win (see #113).

Adds tests/pipelines/test_us_microunit_delegation.py (4 passing); ruff clean.
Implementation produced by the parallel wire-microunit agent; verified
(ruff + delegation tests) and committed here.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MaxGhenis and others added 2 commits May 30, 2026 14:53
microunit 0.1.0 is now published to PyPI, so drop the pre-PyPI
git+https commit pin in favor of a standard version constraint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…r test

- Route microunit's filing_status through _normalize_policyengine_filing_status
  so the delegated path cannot diverge from the legacy paths if microunit ever
  changes its spelling/casing (today the vocabularies already match).
- Add a regression test feeding rows out of PH_SEQ/A_LINENO order, asserting
  correct unit/role/filing assignment — locks in microunit's input-row-order
  contract that the positional TAX_ID mapping relies on.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
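The row-order regression test in the commit above guards a positional mapping. A minimal sketch of that idea, assuming a constructor that returns exactly one label per input row in input order: `toy_construct_units` and `assign_by_position` are hypothetical stand-ins and do not reflect microunit's actual API or output.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, int]
Key = Tuple[int, int]  # (PH_SEQ, A_LINENO)


def toy_construct_units(rows: Sequence[Row]) -> List[int]:
    """Hypothetical constructor: one unit label per row, in input order."""
    return [row["PH_SEQ"] * 100 for row in rows]


def assign_by_position(
    rows: Sequence[Row],
    construct: Callable[[Sequence[Row]], List[int]],
) -> Dict[Key, int]:
    """Pair each input row with the label at the same position."""
    labels = construct(rows)
    if len(labels) != len(rows):
        raise ValueError("constructor must return exactly one label per input row")
    return {
        (row["PH_SEQ"], row["A_LINENO"]): label
        for row, label in zip(rows, labels)
    }


sorted_rows = [
    {"PH_SEQ": 1, "A_LINENO": 1},
    {"PH_SEQ": 1, "A_LINENO": 2},
    {"PH_SEQ": 2, "A_LINENO": 1},
]
out_of_order_rows = list(reversed(sorted_rows))

# The per-person assignment must not depend on the order rows arrive in.
same = assign_by_position(sorted_rows, toy_construct_units) == assign_by_position(
    out_of_order_rows, toy_construct_units
)
```

If a constructor silently re-sorted its output by `(PH_SEQ, A_LINENO)`, a positional `zip` like this would mislabel people, which is the failure an out-of-order fixture is meant to catch.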
@MaxGhenis MaxGhenis marked this pull request as ready for review May 31, 2026 20:34
@MaxGhenis MaxGhenis merged commit 49f31d0 into main May 31, 2026
4 checks passed
@MaxGhenis MaxGhenis deleted the wire-microunit branch May 31, 2026 20:34
MaxGhenis added a commit that referenced this pull request May 31, 2026
Default-on high-fidelity microunit adapter replaces the CPS TAX_ID (keeps SPM units); builds on #114. filing_status stays PE-delegated. Entity-convergence toward eCPS per #113. Supersedes auto-closed #116. Follow-up: #122.