test(survey): validate Survey Data Support methodology + promote to Complete #558
Overall Assessment
✅ Looks good: no unmitigated P0/P1 findings.
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
…omplete

Add tests/test_methodology_survey.py (33 tests, anchored to Binder 1983 Eq. 4.7 and survey-theory.md sections 5/6) isolating the design-based TSL and replicate-weight variance identities that the broad survey suite previously covered only indirectly: the multi-stratum Bessel decomposition, the fweight (df=sum(w)-k) and aweight (unweighted-meat) structures, the exact DEFF = design_var/srs_var ratio, and the residual-scale==score-scale cross-function identity.

The core variance machinery (compute_survey_vcov / _compute_stratified_psu_meat / compute_replicate_vcov / df_survey) was read against Binder Eq. 4.7 and verified to implement the documented identities faithfully -- no code change was required.

Promote the Survey Data Support methodology-review row to Complete (the last In Progress row, so the tracker is now fully Complete).

Correct the Korn & Graubard (1990) citation venue (JASA 85(409) -> The American Statistician 44(4):270-276) and add Lumley (2004) JSS, Korn-Graubard (1990), and Solon-Haider-Wooldridge (2015) to docs/references.rst.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bcec813 to 2898b0e
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good: no unmitigated P0/P1 findings.
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Summary
- `tests/test_methodology_survey.py` (33 tests, 10 Binder-equation-anchored classes): the methodology validation suite for Survey Data Support. It isolates the design-based TSL and replicate-weight variance identities that the broad survey suite previously covered only indirectly: the multi-stratum Bessel decomposition, the fweight (`df=Σw−k`) and aweight (unweighted-meat) structures, the exact `DEFF = design_var/srs_var` ratio, and the residual-scale==score-scale cross-function identity. The other 6 core identities are equation-anchored and reference the existing direct oracles (no duplication).
- The core variance machinery (`compute_survey_vcov` / `_compute_stratified_psu_meat` / `compute_survey_if_variance` / `compute_replicate_vcov` / `df_survey`) was read against Binder (1983) Eq. 4.7 and `docs/methodology/survey-theory.md` §5/§6 and verified to implement the documented identities faithfully; no code change was required (consistent with the machine-precision R-parity scenarios already passing).
- … `docs/references.rst` (both were cited in REGISTRY/theory but absent from `references.rst`).
- Promotes `Survey Data Support` to Complete: the last In Progress methodology-review row, so the tracker is now fully Complete. The Complete entry has full Verified Components / R-parity table / Corrections Made / Deviations / cross-estimator gaps-boundary; the status row, Priority Order, and the now-stale "In Progress band" prose are swept.

Methodology references
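For readers unfamiliar with the identity being validated, the stratified Taylor-series-linearization (TSL) sandwich with its per-stratum Bessel factor can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the library's code: `stratified_psu_meat` and `tsl_vcov` are hypothetical names, there is no finite-population correction, and single-PSU strata are simply skipped (an assumption made here for brevity).

```python
import numpy as np


def stratified_psu_meat(scores, strata, psu):
    """Sum over strata h of n_h/(n_h-1) * sum_i (s_hi - sbar_h)(s_hi - sbar_h)'.

    s_hi is the total of the observation-level score rows inside PSU i of
    stratum h. Hypothetical helper, written for illustration only.
    """
    scores = np.asarray(scores, dtype=float)
    strata = np.asarray(strata)
    psu = np.asarray(psu)
    k = scores.shape[1]
    meat = np.zeros((k, k))
    for h in np.unique(strata):
        in_h = strata == h
        psus_h = np.unique(psu[in_h])
        n_h = len(psus_h)
        if n_h < 2:
            continue  # assumption: a lonely PSU contributes nothing
        # PSU totals of the scores; PSU ids may be reused across strata
        totals = np.vstack(
            [scores[in_h & (psu == p)].sum(axis=0) for p in psus_h]
        )
        centered = totals - totals.mean(axis=0)
        meat += n_h / (n_h - 1) * (centered.T @ centered)  # Bessel factor
    return meat


def tsl_vcov(X, y, w, strata, psu):
    """Weighted least squares with a design-based sandwich variance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    XtWX = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (w * y))
    resid = y - X @ beta
    scores = X * (w * resid)[:, None]  # one score row per observation
    bread = np.linalg.inv(XtWX)
    return beta, bread @ stratified_psu_meat(scores, strata, psu) @ bread
```

Two sanity checks follow directly from the formula: with an intercept-only model, unit weights, one stratum, and every observation its own PSU, the variance collapses to the textbook `s²/n`; and the meat of a multi-stratum design equals the sum of the per-stratum meats.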
… `## Survey Data Support` + the tracker Complete entry; none undocumented): `lonely_psu` default `"remove"` vs R `"fail"`; replicate factor divides by design `R` not `n_valid`; PSU-level Hall-Mammen wild bootstrap; strata-vs-no-strata non-bit-equality (RNG path).

Validation
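The fweight structure named in the summary (`df = Σw − k`) is the kind of identity that can be checked by row expansion: a frequency-weighted fit should match an unweighted fit on data where each row is repeated `w` times. The sketch below illustrates that check with a classical homoskedastic variance, chosen here only for brevity; `ols_fweight` is a hypothetical helper, and the PR does not say which variance form the suite actually exercises.

```python
import numpy as np


def ols_fweight(X, y, fw):
    """OLS with frequency weights and classical variance (hypothetical helper).

    Residual degrees of freedom are sum(fw) - k, because each row stands
    for fw identical observations.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    fw = np.asarray(fw, dtype=float)
    XtWX = X.T @ (fw[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (fw * y))
    resid = y - X @ beta
    df = fw.sum() - X.shape[1]
    sigma2 = (fw * resid**2).sum() / df
    return beta, sigma2 * np.linalg.inv(XtWX), df
```

Comparing `ols_fweight(X, y, fw)` with `ols_fweight(np.repeat(X, fw, axis=0), np.repeat(y, fw), np.ones(fw.sum()))` should give the same coefficients, variance matrix, and degrees of freedom up to floating-point error.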
- `tests/test_methodology_survey.py` (33 tests, all passing; black/ruff clean).
- `tests/test_survey_r_crossvalidation.py` + `tests/test_survey_estimator_validation.py` (48 passed, 6 skipped where real-data goldens are absent in this checkout).
- No `diff_diff/` source changed: this PR is test + documentation only.

Security / privacy