
Add return_components support to R-learner #923

Open
aman-coder03 wants to merge 4 commits into uber:master from aman-coder03:feature/rlearner-return-components

aman-coder03 commented Jul 3, 2026

Contributor

Proposed changes

This PR adds return_components support to the R-Learner, bringing its API in line with the existing T- and X-Learner implementations.

Specific changes:

  • adds a return_components argument to predict() and fit_predict() for both BaseRLearner and BaseRClassifier
  • returns the nuisance components used by the R-Learner
    • yhat: outcome model predictions (E[Y|X])
    • p: propensity score estimates (E[W|X])
  • adds the same mutual exclusion guard as other meta-learners, preventing return_ci and return_components from being used together
  • fits the nuisance outcome model after cross-validation so that it can be used to retrieve components at inference time
  • adds tests covering the new return_components functionality for both predict() and fit_predict() along with the mutual exclusion behavior
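The guard and the return shape described in the list above can be sketched roughly as follows. This is a standalone illustration, not the code in this PR: `check_return_flags` and `pack_outputs` are hypothetical helper names, and the `(te, yhat, p)` ordering is taken from the usage shown later in the review.

```python
def check_return_flags(return_ci=False, return_components=False):
    """Hypothetical guard: the two flags cannot be requested together."""
    if return_ci and return_components:
        raise ValueError("return_ci and return_components cannot both be True.")


def pack_outputs(te, yhat, p, return_components=False):
    """Return te alone, or (te, yhat, p) when components are requested."""
    if return_components:
        return te, yhat, p
    return te


# Toy values standing in for model output (made up for illustration).
te = [[0.5], [0.7]]
yhat = [1.0, 2.0]
p = [0.4, 0.6]

check_return_flags(return_ci=False, return_components=True)  # passes silently
result = pack_outputs(te, yhat, p, return_components=True)
```

Keeping the guard separate from the packing step means the same check can be reused by every method that accepts both flags.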

Fixes #304

Types of changes

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc. This PR template is adopted from appium.

jeongyoonlee left a comment

Collaborator

Two blockers before merge:

B1 — the p component is the stored training propensity, not recomputed for the passed X.
In both predict() bodies, when p is None you set p = self.propensity (training-data propensity) while yhat = self.model_mu.predict(X) is computed for the passed X. The two returned components end up on different bases, and on a differently-sized X the p array length won't match yhat/te — with no error raised:

rl.fit(X[:800], treatment[:800], y[:800], p=p[:800])
te, yhat, p_hat = rl.predict(X[800:1000], return_components=True)
# te: (200, 1), yhat: len 200, p_hat: len 800

Please mirror the X-learner, which recomputes propensity for X (xlearner.py:203/:643):

if p is None:
    p = {g: self.propensity_model[g].predict(X) for g in self.t_groups}
else:
    p = self._format_p(p, self.t_groups)

Note self.propensity_model only exists when fit() ran with p=None; if the user supplied p at fit and then calls predict(p=None) there is no model to recompute from — raise a clear error there rather than returning stale training values. The current test doesn't surface this because it always passes p=p_scores and predicts on the training X of equal length.
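One way the requested behavior could look, as a minimal sketch in plain Python. Everything here is an assumption made for illustration: `resolve_propensity` and `ConstantPropensityModel` are hypothetical names, `p` is treated as a flat sequence, and the real learner would go through its own `_format_p` rather than the simple length check used below.

```python
class ConstantPropensityModel:
    """Stand-in for a fitted propensity model (hypothetical)."""

    def __init__(self, value):
        self.value = value

    def predict(self, X):
        # One propensity estimate per row of X.
        return [self.value] * len(X)


def resolve_propensity(X, p=None, propensity_model=None, t_groups=(1,)):
    """Return per-group propensities aligned with the rows of X."""
    if p is not None:
        if len(p) != len(X):
            raise ValueError(f"p has {len(p)} rows but X has {len(X)} rows.")
        return {g: list(p) for g in t_groups}
    if propensity_model is None:
        # p was supplied at fit time, so there is nothing to recompute from.
        raise ValueError(
            "No fitted propensity model is available; pass p to predict()."
        )
    return {g: propensity_model[g].predict(X) for g in t_groups}


X_new = [[0.0]] * 200  # a prediction set smaller than the training set
models = {1: ConstantPropensityModel(0.5)}
p_new = resolve_propensity(X_new, propensity_model=models)
```

With this shape, the stale-propensity case fails loudly instead of returning an 800-row array alongside 200-row predictions.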

B2 — predict(..., return_ci=True) is accepted but never implemented.
The new predict() signature adds return_ci=False, but the body only uses it for the return_ci/return_components mutual-exclusion guard — it never computes CIs. So te, lb, ub = rl.predict(X, return_ci=True) raises ValueError at the unpack site (same footgun as #886). R-learner has no per-predict bootstrap path, so rather than accept-and-ignore, please drop return_ci from predict() (keep the guard in fit_predict, which does implement it) or raise NotImplementedError when True.
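The difference between accept-and-ignore and an explicit rejection can be illustrated with two toy functions. Both are hypothetical stand-ins for `predict()`, not the PR's code, and nested lists are used in place of NumPy arrays.

```python
def predict_ignoring_ci(X, return_ci=False):
    """Mimics the reported behavior: return_ci is accepted but has no effect."""
    return [[0.0] for _ in X]  # one treatment-effect row per sample


def predict_rejecting_ci(X, return_ci=False):
    """Alternative: fail at the call site with an explicit message."""
    if return_ci:
        raise NotImplementedError(
            "return_ci is not supported in predict(); use fit_predict()."
        )
    return [[0.0] for _ in X]


X = [[0.1], [0.2], [0.3], [0.4]]

# The caller expects three outputs but receives a single four-row result,
# so the failure surfaces as an unpacking error far from its cause.
try:
    te, lb, ub = predict_ignoring_ci(X, return_ci=True)
    unpack_error = None
except ValueError as exc:
    unpack_error = exc

try:
    predict_rejecting_ci(X, return_ci=True)
    explicit_error = None
except NotImplementedError as exc:
    explicit_error = exc
```

Note that with exactly three rows the unpack in the first case would succeed and silently bind rows to `te`, `lb` and `ub`, which is a further argument for rejecting the flag outright.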

Tests: the new coverage only exercises BaseRLearner. Since BaseRClassifier and XGBRRegressor both had fit()/predict() modified, please add return_components tests for the classifier override and XGBR, plus the p=None predict path and a predict on a different-sized X (that last one guards B1).
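A length check along these lines could back the different-sized-X test. It is a sketch only: `assert_components_aligned` is a hypothetical helper, and it assumes `te`, `yhat` and `p_hat` are flat sequences or arrays supporting `len()`, which may not match the learner's actual return types (for example, per-group dicts).

```python
def assert_components_aligned(X, te, yhat, p_hat):
    """Fail if any returned component does not have one row per row of X."""
    n = len(X)
    lengths = {"te": len(te), "yhat": len(yhat), "p_hat": len(p_hat)}
    mismatched = {name: size for name, size in lengths.items() if size != n}
    assert not mismatched, f"expected {n} rows, got {mismatched}"


# Toy data reproducing the shapes from B1: 200 prediction rows, 800 stale p rows.
X_pred = [[0.0]] * 200
te = [[0.0]] * 200
yhat = [0.0] * 200
stale_p = [0.5] * 800

try:
    assert_components_aligned(X_pred, te, yhat, stale_p)
    caught = None
except AssertionError as exc:
    caught = str(exc)
```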

Non-blocking notes to follow separately.

@jeongyoonlee jeongyoonlee added the enhancement New feature or request label Jul 4, 2026
@aman-coder03 aman-coder03 requested a review from jeongyoonlee July 4, 2026 09:00

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

return_components for R-Learner

2 participants