Enforce PUF-clone weight-share floor in the dataset upload validator #1159
Draft
MaxGhenis wants to merge 1 commit into
The Enhanced CPS build has two guards against degraded PUF-clone behavior, but they were not equivalent:

- The generation-time guard (``enhanced_cps.validate_clone_diagnostics``) raises when the clone household weight share falls below ``MIN_PUF_CLONE_HOUSEHOLD_WEIGHT_SHARE_PCT`` (5%) or the clone-tax-vs-market-income share exceeds ``MAX_PUF_CLONE_TAXES_EXCEED_MARKET_INCOME_SHARE_PCT`` (25%).
- The upload-time guard (``upload_completed_datasets._clone_diagnostics_errors``) only checked that each metric was finite and within [0, 100].

Because the upload-time guard was weaker, a degraded artifact could still publish even though generation would have rejected it.

Import the two thresholds from ``enhanced_cps`` (rather than re-hardcoding 5.0 / 25.0) and enforce the same floor/bound in the upload validator, after the existing finite/range checks. The share fields are absent on some periods/older sidecars, so enforcement is guarded by presence and finiteness checks to preserve back-compatibility.

Fixes #1158

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixes #1158
Summary
Enforce the PUF-clone weight-share floor (and the taxes-exceed-market-income cap) in the dataset upload validator, so a degraded Enhanced CPS artifact cannot be published.
The generation-time guard in enhanced_cps.py already rejects builds where PUF-clone households fall below the 5% weight-share floor (or where clone tax pathology exceeds 25%). But the upload validator only checked that each clone metric was finite and in [0, 100]; it did not enforce the floor or the cap. That gap is why the currently published Enhanced CPS (clone share ≈ 0%, built before the clone fix) was able to ship.
Changes
- policyengine_us_data/storage/upload_completed_datasets.py: _clone_diagnostics_errors now imports MIN_PUF_CLONE_HOUSEHOLD_WEIGHT_SHARE_PCT and MAX_PUF_CLONE_TAXES_EXCEED_MARKET_INCOME_SHARE_PCT from enhanced_cps.py (single source of truth) and appends an error when clone_household_weight_share_pct < 5% or clone_taxes_exceed_market_income_share_pct > 25%. Both checks are presence- and finiteness-guarded, so older sidecars without these fields still validate (back-compat).
- tests/unit/test_upload_clone_diagnostics_floor.py: 2% share → rejected (floor); 66% tax share → rejected (cap); healthy 10%/5% → passes; missing fields → passes.
Tightens the upload gate only; healthy artifacts and older sidecars are unaffected.
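For reviewers, here is a minimal sketch of the presence- and finiteness-guarded threshold checks described in this PR. It is not the code in the diff. The function name clone_threshold_errors, the helper _is_finite_number, the flat dict shape of the diagnostics, and the error message wording are all assumptions made for illustration. The two constants are redefined locally only to keep the sketch self-contained; in the PR they are imported from enhanced_cps.

```python
import math
from typing import Any, List, Mapping

# Assumption: local copies of the thresholds quoted in this PR (5% and 25%).
# The actual change imports these from enhanced_cps instead of redefining them.
MIN_PUF_CLONE_HOUSEHOLD_WEIGHT_SHARE_PCT = 5.0
MAX_PUF_CLONE_TAXES_EXCEED_MARKET_INCOME_SHARE_PCT = 25.0


def _is_finite_number(value: Any) -> bool:
    """Return True only for real, finite numbers (hypothetical helper)."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return math.isfinite(value)


def clone_threshold_errors(diagnostics: Mapping[str, Any]) -> List[str]:
    """Collect floor/cap violations; absent or non-finite fields are skipped.

    Hypothetical stand-in for the threshold portion of the upload validator.
    Skipping absent fields is what keeps older sidecars validating.
    """
    errors: List[str] = []

    share = diagnostics.get("clone_household_weight_share_pct")
    if _is_finite_number(share) and share < MIN_PUF_CLONE_HOUSEHOLD_WEIGHT_SHARE_PCT:
        errors.append(
            f"clone_household_weight_share_pct={share} is below the floor "
            f"of {MIN_PUF_CLONE_HOUSEHOLD_WEIGHT_SHARE_PCT}"
        )

    tax_share = diagnostics.get("clone_taxes_exceed_market_income_share_pct")
    if (
        _is_finite_number(tax_share)
        and tax_share > MAX_PUF_CLONE_TAXES_EXCEED_MARKET_INCOME_SHARE_PCT
    ):
        errors.append(
            f"clone_taxes_exceed_market_income_share_pct={tax_share} exceeds "
            f"the cap of {MAX_PUF_CLONE_TAXES_EXCEED_MARKET_INCOME_SHARE_PCT}"
        )

    return errors
```

In this sketch the comparisons are strict, so a share of exactly 5% or a tax share of exactly 25% passes. Non-finite values are deliberately ignored here because the PR describes the existing finite/range checks as running first and reporting those separately.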
🤖 Generated with Claude Code