wooldridge: CI R9 P1 fix — reject cohort_trends=True + control_group='never_treated' (unidentified)

igerber · igerber · commit ebb255554d5c · 2026-05-22T21:05:25.000-04:00
diff --git a/TODO.md b/TODO.md
@@ -97,6 +97,7 @@ Deferred items from PR reviews that were not addressed before merge.
 | WooldridgeDiD: design-consistent cohort totals for `aggregate(weights="cohort_share")` on survey-weighted fits. Current impl populates `_n_g_per_cohort` from `unit.nunique()` (raw counts); composing these unweighted cohort shares with the design-weighted ATTs targets a mixed estimand inconsistent with paper W2025 Section 7's design-population cohort-share form. PR-B Stage E fail-closes the surface (raises `ValueError` when `survey_design is not None`); the follow-up implements survey-weighted unit totals per cohort and re-enables the surface. | `wooldridge.py` `_n_g_per_cohort` population, `wooldridge_results.py::aggregate` survey gate | PR-B follow-up | Medium |
 | WooldridgeDiD: unconditional inference for `aggregate(weights="cohort_share")` accounting for sampling uncertainty in the cohort shares ω̂_g / ω̂_{ge} (paper W2025 Section 7.5). Current impl fail-closes the t-stat / p-value / conf-int fields to NaN under cohort-share aggregation because the analytical SE is conditional-on-shares. Proper APE/GMM-style aggregate inference (Wooldridge 2023 Section 4 framework) re-enables full inference. | `wooldridge_results.py::aggregate` cohort_share inference branch | PR-B follow-up | Medium |
 | WooldridgeDiD: `cohort_trends=True` + `survey_design` composition. PR-B Stage E fail-closes the cross-product with `NotImplementedError` at `fit()` because the full-dummy `dg_i · t` design composed with the survey TSL variance hasn't been validated against R-parity goldens. Follow-up: validate the composition (or implement a survey-aware alternative) and re-enable the surface. | `wooldridge.py` fit guard, `wooldridge_results.py::aggregate` (if survey-aware cohort_trends variance plumbing is added) | PR-B follow-up | Low |
+| WooldridgeDiD: `cohort_trends=True` + `control_group="never_treated"` composition. PR-B Stage E (codex R9 P1 fix) fail-closes the cross-product with `NotImplementedError` at `fit()` because the OLS + never_treated branch emits ALL `(g, t)` cells as treatment-cell dummies (paper Section 4.4 placebo coverage); the appended `dg_i · t` trend columns are linearly spanned by the per-cohort sum of those cell dummies, so the Section 8 trend specification is unidentified. Follow-up: implement a separate design-matrix branch that drops the pre-treatment placebo dummies (or restricts the trend interaction to post-treatment cells) under the trend specification, then re-enable the combination. | `wooldridge.py` fit guard + `_build_interaction_matrix` redesign for the cohort_trends path | PR-B follow-up | Low |
 | WooldridgeDiD: optional *efficiency hint* (NOT a canonical-link violation per W2023 Prop 3.1) when method/outcome pairing is sub-optimal — e.g., `method="ols"` on binary data is consistent under QMLE, but `method="logit"` is typically more efficient. The original framing in this row as a "canonical link requirement" tied to Prop 3.1 was incorrect: Wooldridge (2023) Table 1 lists Gaussian/OLS for "any response" and logistic-Bernoulli for "binary OR fractional". A useful hint exists (efficiency), but should not be framed as a methodology violation. See PR #453 R1 review for the corrected reading. | `wooldridge.py` | #216 | Low |
 | WooldridgeDiD: Stata `jwdid` golden value tests — add R/Stata reference script and `TestReferenceValues` class. | `tests/test_wooldridge.py` | #216 | Medium |
 <!-- The PreTrendsPower R parity row (PR-C, 2026-05-19) and the four PR-A-tagged PreTrendsPower rows (CS/SA Σ_22 fidelity, helper `violation_weights`, custom-weight persistence, linear γ-unit MDV; resolved in PR-B 2026-05-18) are all closed — see CHANGELOG.md [Unreleased] Added/Changed/Fixed entries for the new behavior. -->
diff --git a/diff_diff/wooldridge.py b/diff_diff/wooldridge.py
@@ -590,6 +590,33 @@ def fit(
                 "wait for the deferred follow-up tracked in TODO."
             )
 
+        # 0d.ii Reject cohort_trends=True + control_group="never_treated".
+        # The OLS + never_treated branch (paper W2025 Section 4.4 / library
+        # ``_build_interaction_matrix`` ``include_pre=True``) emits ALL
+        # ``(g, t)`` cells for each treated cohort as treatment-cell
+        # indicator dummies — including pre-treatment placebo cells. For
+        # any treated cohort, ``dg_i · t = Σ_t t · 1{cohort=g, time=t}``
+        # is fully spanned by the existing per-cohort sum of cell dummies,
+        # so the appended trend columns are unidentified. The per-cohort
+        # pre-period guard above counts observed periods but doesn't
+        # catch this collinearity. Fail-close pending a redesigned
+        # design-matrix path that drops the placebo dummies (or restricts
+        # to post-treatment cells) under the trend specification — codex
+        # R9 P1 fix.
+        if self.cohort_trends and self.control_group == "never_treated":
+            raise NotImplementedError(
+                "WooldridgeDiD(cohort_trends=True, "
+                "control_group='never_treated') is not yet supported: "
+                "the OLS never_treated path emits ALL (g, t) cells (paper "
+                "W2025 Section 4.4 placebo coverage), and the appended "
+                "dg_i · t trend columns are linearly spanned by the "
+                "per-cohort sum of those cell dummies, so the Section 8 "
+                "trend specification is unidentified on this branch. Use "
+                "control_group='not_yet_treated' (the default) for "
+                "cohort_trends=True, or wait for the deferred follow-up "
+                "tracked in TODO."
+            )
+
         # 0d. Reject survey_design + non-hc1 analytical family. The survey-
         # design TSL (or replicate-weight refit) variance overrides the
         # analytical sandwich, so the requested HC2/HC2-BM/classical family
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -1573,6 +1573,7 @@ where `g(·)` is the link inverse (logistic or exp), `η_i` is the individual li
 - **Result attribute:** `WooldridgeDiDResults.cohort_trend_coefs: Dict[g → δ_g]` populated under `cohort_trends=True`; empty dict otherwise.
 - **Note:** Polynomial-trend extensions (`"quadratic"`, `"cubic"` per paper p. 2572 footnote) are NOT yet exposed — `cohort_trends` is a binary `True/False` flag for linear `dg_i · t` only.
 - **Note:** `cohort_trends=True` + `survey_design` is **NOT yet supported** (raises `NotImplementedError` at `fit()`). The full-dummy auto-route composed with the survey TSL variance has not been validated against R-parity goldens. Tracked in TODO follow-up.
+- **Note:** `cohort_trends=True` + `control_group="never_treated"` is **NOT yet supported** (raises `NotImplementedError` at `fit()`). The OLS + never_treated branch emits ALL `(g, t)` cells as treatment-cell indicator dummies (paper W2025 Section 4.4 placebo coverage); the appended `dg_i · t` trend columns are then linearly spanned by the per-cohort sum of those cell dummies, so the Section 8 trend specification is unidentified on this branch. Use `control_group="not_yet_treated"` (the default) for `cohort_trends=True`. Tracked in TODO follow-up.
 - **Note:** Identification + baseline normalization for `cohort_trend_coefs` on all-eventually-treated panels: when a never-treated cohort (`g = 0`) is present, **all** `G` treated cohorts get a `dg_i · t` interaction column and `cohort_trend_coefs[g]` reports each cohort's linear slope relative to the never-treated baseline (absorbed by time FE). When **no** never-treated cohort exists, the **last cohort's** trend column is dropped deterministically per paper W2025 Section 5.4 ("all variables in regression (5.3) involving `dT_i` get dropped"); that cohort serves as the trend baseline, and `cohort_trend_coefs` surfaces `G - 1` entries (the last cohort is absent — its slope is the baseline in deviation form). Mirrors the all-treated normalization the library applies to cohort × time interactions.
 
 ### Deviations from the paper / from R / library extensions
diff --git a/tests/test_methodology_wooldridge.py b/tests/test_methodology_wooldridge.py
@@ -1482,6 +1482,34 @@ def test_cohort_trends_true_plus_weights_cohort_share_simple_excludes_trend_colu
         assert np.isfinite(res.overall_att)
         assert np.isfinite(res.overall_se) and res.overall_se > 0
 
+    def test_cohort_trends_true_rejects_never_treated_control_group(self) -> None:
+        """CI R9 P1 fix: ``cohort_trends=True`` + ``control_group="never_treated"`` raises.
+
+        The OLS + never_treated branch emits ALL (g, t) cells as
+        treatment-cell dummies (paper W2025 Section 4.4 placebo
+        coverage). For each treated cohort g, the trend column
+        ``dg_i · t`` is fully spanned by the per-cohort sum of those
+        cell dummies, so ``dg_i · t`` is unidentified. The library
+        fail-closes the combination with ``NotImplementedError``.
+        """
+        rng = np.random.default_rng(_BASE_SEED_SECTION8 + 23)
+        panel = _make_heterogeneous_trends_panel(rng, n_per_cohort=80, sigma=0.05)
+        with pytest.raises(
+            NotImplementedError,
+            match=r"cohort_trends=True.*control_group='never_treated'.*not yet supported",
+        ):
+            WooldridgeDiD(
+                method="ols",
+                cohort_trends=True,
+                control_group="never_treated",
+            ).fit(
+                panel,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
+            )
+
     def test_cohort_trends_true_rejects_survey_design(self) -> None:
         """R5 P1 fix: ``cohort_trends=True`` + ``survey_design`` raises NotImplementedError.