Closed
1 change: 1 addition & 0 deletions changelog.d/1154.changed.md
@@ -0,0 +1 @@
Documented in `CensusCPS._create_tax_unit_table` that the raw Census ASEC `TAX_ID` is replaced by our `construct_tax_units()` assignment, with the original retained as `CENSUS_TAX_ID` for validation.
9 changes: 9 additions & 0 deletions policyengine_us_data/datasets/cps/census_cps.py
@@ -160,13 +160,22 @@ def _create_tax_unit_table(
        person: pd.DataFrame,
        mode: str | None = None,
    ) -> pd.DataFrame:
        # The raw Census ASEC TAX_ID (a documented filing-unit grouping) is NOT
        # used as our tax unit. We build tax units ourselves with
        # construct_tax_units() (default mode "policyengine", which applies PE
        # filing/dependency rules) and overwrite TAX_ID with that assignment
        # below. The original Census value is preserved as CENSUS_TAX_ID so it
        # stays available as the ground-truth baseline our construction is
        # validated against (see validation/cps_tax_unit_validation.py) and is a
        # required raw-schema column (see _validate_raw_cps_schema in cps.py).
        person["CENSUS_TAX_ID"] = person["TAX_ID"]
        mode = mode or self.tax_unit_construction_mode
        constructed_person, tax_unit_df = construct_tax_units(
            person=person,
            year=self.time_period,
            mode=mode,
        )
        # Replace Census TAX_ID with our constructed tax-unit assignment.
        person["TAX_ID"] = constructed_person["TAX_ID"].values
        return tax_unit_df[["TAX_ID"]]
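
Because the constructed `TAX_ID` values are a fresh labelling, they cannot be compared to `CENSUS_TAX_ID` by simple equality; what can be compared is whether the two columns group the same person rows together. The following is a minimal, standard-library sketch of that idea. It is not the logic in `validation/cps_tax_unit_validation.py`, which is not shown in this diff; the helpers `group_members` and `grouping_agreement` and the toy ID lists are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def group_members(ids):
    """Return the set of groups, each a frozenset of row positions sharing an ID."""
    groups = defaultdict(set)
    for position, unit_id in enumerate(ids):
        groups[unit_id].add(position)
    return {frozenset(members) for members in groups.values()}


def grouping_agreement(census_ids, constructed_ids):
    """Share of Census units whose exact membership also appears as a constructed unit.

    Hypothetical metric for illustration; label values are ignored, only
    which rows are grouped together matters.
    """
    census_groups = group_members(census_ids)
    constructed_groups = group_members(constructed_ids)
    if not census_groups:
        return 1.0
    return len(census_groups & constructed_groups) / len(census_groups)


# Toy person rows (assumed data): the preserved Census IDs and the overwritten IDs.
census_tax_id = [101, 101, 102, 103, 103]
tax_id = [1, 1, 2, 2, 3]

# Only the first unit (rows 0 and 1) has identical membership in both groupings.
agreement = grouping_agreement(census_tax_id, tax_id)
```

Comparing memberships rather than raw values keeps the check meaningful even though the overwrite relabels every unit, which is why retaining the original column alongside the constructed one is useful.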
