Add top-tail preserving MP-vs-eCPS sampling #156

Draft: MaxGhenis wants to merge 2 commits into main from codex/top-tail-preserving-compare-20260601

@MaxGhenis
Contributor

Summary

  • add a top_agi_preserve matched sampling mode for MP-vs-eCPS comparisons
  • preserve households with any tax unit at or above a configurable AGI threshold before random fill
  • expose --matched-top-agi-threshold and record the threshold in comparison outputs
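The preserve-then-fill behavior described in the bullets above might look roughly like the sketch below. It is illustrative only and is not the code in this PR: the function name `sample_preserving_top_agi`, its parameters, and the input shape (a mapping from household ID to the AGIs of its tax units) are hypothetical, and it ignores household weights, which the real sampler may handle differently. The fallback for the case where preserved households alone exceed the target size is also an assumption.

```python
import random
from typing import Mapping, Sequence


def sample_preserving_top_agi(
    tax_unit_agi_by_household: Mapping[int, Sequence[float]],
    target_n: int,
    top_agi_threshold: float,
    seed: int = 0,
) -> list[int]:
    """Hypothetical sketch: keep top-tail households, then fill at random."""
    rng = random.Random(seed)
    household_ids = sorted(tax_unit_agi_by_household)

    # A household is preserved if ANY of its tax units has AGI at or above
    # the threshold.
    preserved = [
        hh
        for hh in household_ids
        if any(agi >= top_agi_threshold for agi in tax_unit_agi_by_household[hh])
    ]

    # Assumption: if the preserved set alone exceeds the target size,
    # subsample it uniformly rather than exceed target_n.
    if len(preserved) >= target_n:
        return sorted(rng.sample(preserved, target_n))

    # Fill the remaining slots uniformly at random from everyone else.
    preserved_set = set(preserved)
    remainder = [hh for hh in household_ids if hh not in preserved_set]
    fill_n = min(target_n - len(preserved), len(remainder))
    return sorted(preserved + rng.sample(remainder, fill_n))


# Toy data: households 2 and 4 each contain a tax unit at or above $1M AGI.
toy_households = {
    1: [50_000.0],
    2: [30_000.0, 2_000_000.0],
    3: [80_000.0],
    4: [1_000_000.0],
    5: [10_000.0],
}
sampled = sample_preserving_top_agi(toy_households, target_n=3, top_agi_threshold=1_000_000.0)
```

In this toy run, households 2 and 4 are always in `sampled`, and the third slot is drawn at random from households 1, 3, and 5. Because preserved households are kept with certainty, a weighted comparison would need to account for their different inclusion probability; the sketch does not attempt that.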

Why

The PR #141 PUF aggregate/support-clone artifact currently shows severe top-tail placement issues, especially for capital gains and interest in high-AGI bins. Uniform matched-N thinning can randomly drop rare top-tail households, so this PR adds a diagnostic comparison mode to separate sampling noise from genuine PUF aggregate synthesis problems.

This does not change production builds or the default comparison method.
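
The sampling-noise concern can be made concrete with a small calculation. The sketch below assumes unweighted uniform sampling without replacement, which is a simplification of whatever the matched-N thinning actually does. The function names and the example counts are hypothetical and are not taken from the PR or its data.

```python
from math import prod


def prob_no_top_household_retained(total: int, top: int, sample_n: int) -> float:
    """Probability that a uniform sample of sample_n households drawn without
    replacement from `total` contains none of the `top` top-tail households.

    Equals C(total - top, sample_n) / C(total, sample_n), written as a running
    product so no huge intermediate integers are needed.
    """
    if sample_n > total - top:
        # The sample cannot fit entirely within the non-top households.
        return 0.0
    return prod((total - top - i) / (total - i) for i in range(sample_n))


def expected_top_retained(total: int, top: int, sample_n: int) -> float:
    """Expected number of top-tail households kept under uniform thinning."""
    return top * sample_n / total


# Illustrative numbers only: 20 top-tail households out of 100,000,
# thinned uniformly to 10,000.
p_none = prob_no_top_household_retained(100_000, 20, 10_000)
expected = expected_top_retained(100_000, 20, 10_000)
```

With these made-up counts, uniform thinning keeps only about 2 of the 20 top-tail households on average, and which ones survive changes from seed to seed. That is the kind of variation a preserving mode is meant to remove from the comparison.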

Validation

  • uv run ruff check src/microplex_us/pipelines/performance.py src/microplex_us/pipelines/ecps_replacement_comparison.py tests/pipelines/test_performance.py
  • uv run --extra dev python -m pytest tests/pipelines/test_performance.py::test_sample_matched_household_ids_supports_weighted_methods tests/pipelines/test_performance.py::test_write_matched_policyengine_us_baseline_dataset_preserves_top_agi_households tests/pipelines/test_ecps_replacement_comparison.py::test_sound_ecps_replacement_comparison_satisfies_gate_contract tests/pipelines/test_ecps_replacement_comparison.py::test_ecps_replacement_comparison_module_cli_help_runs -q
  • uv run --extra dev python -m pytest tests/pipelines/test_ecps_replacement_comparison.py -q

@MaxGhenis force-pushed the codex/top-tail-preserving-compare-20260601 branch from be6a74b to 3a5c0db on June 1, 2026 20:51