Reimplement dropout_weights() correctly for log-space weights #92

Merged
MaxGhenis merged 1 commit into main from fix/dropout-log-space
Apr 17, 2026

@MaxGhenis

Summary

Finding #2 (HIGH) in the bug-hunt report. dropout_weights held weights in log space but set masked entries to literal 0, which is exp(0) = 1 in linear space — the opposite of dropping. It then normalised by dividing masked log-weights by their sum and multiplying by total_weight, which is not a meaningful operation on logs. On realistic survey-weight scales (hundreds to thousands, log ~6–8), masked_weights.sum() could cross zero, producing Inf/NaN that crashed the forward pass via the NaN guard in loss().
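
The first half of the bug can be reproduced with plain arithmetic. The snippet below is a minimal illustration, not the original code: the weight values and variable names are assumptions chosen to match the "hundreds to thousands" scale mentioned above.

```python
import math

# Hypothetical survey weights on the scale the description mentions.
weights = [500.0, 1200.0, 3000.0]
log_weights = [math.log(w) for w in weights]

# Old behaviour as described: "drop" the first entry by writing a literal 0
# into the log-space array.
masked_log = [0.0] + log_weights[1:]
masked_linear = [math.exp(v) for v in masked_log]

print(masked_linear[0])  # 1.0: the "dropped" unit still carries weight 1
```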

Rewritten as standard inverted dropout performed in log space: dropped entries go to -inf (exp = 0) and surviving entries are shifted by -log(1-p) so the expected linear-space sum is preserved. Hoisted to module scope so it is directly testable, and invalid probabilities now raise.
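
A framework-free sketch of the behaviour described here, using plain Python lists rather than tensors. The function name `dropout_log_weights`, its signature, and the `rng` parameter are hypothetical and chosen for illustration; the merged implementation may differ in every detail except the semantics stated above.

```python
import math
import random


def dropout_log_weights(log_weights, p, rng=random):
    """Inverted dropout on log-space weights (illustrative sketch).

    Dropped entries become -inf (linear weight 0). Survivors are shifted by
    -log(1 - p), which scales them by 1 / (1 - p) in linear space.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"dropout probability must be in [0, 1], got {p}")
    if p == 0.0:
        return list(log_weights)  # identity
    if p == 1.0:
        return [-math.inf] * len(log_weights)  # everything dropped
    shift = -math.log1p(-p)  # same value as -log(1 - p)
    return [
        -math.inf if rng.random() < p else lw + shift
        for lw in log_weights
    ]


# Monte Carlo check that the expected linear-space sum is preserved.
rng = random.Random(0)
log_w = [math.log(w) for w in (500.0, 1200.0, 3000.0)]  # assumed example weights
trials = 20000
mean_sum = sum(
    sum(math.exp(v) for v in dropout_log_weights(log_w, 0.3, rng))
    for _ in range(trials)
) / trials
print(mean_sum)  # should land close to the undropped total of 4700
```

Using `-inf` rather than a large negative constant keeps the dropped entries exactly zero after `exp`, so they contribute nothing to any linear-space sum.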

Test plan

  • Add tests/test_dropout_regression.py: p=0 identity, p=1 zeroes all linear-space weights, expected-sum preservation on realistic-scale weights (Monte Carlo tolerance), approximate drop fraction, input validation, and end-to-end smoke test that training with dropout no longer poisons the loss.
  • All existing tests pass (uv run pytest tests -x -q -> 21 passed).

🤖 Generated with Claude Code

The previous implementation held `weights` in log space but set masked
entries to literal 0, which is `exp(0) = 1` in linear space -- the
opposite of dropping. It then normalised by dividing masked log-weights
by their sum and multiplying by `total_weight`, which is not a
meaningful operation on logs. On realistic survey-weight scales
(hundreds to thousands, log ~6-8) `masked_weights.sum()` could cross
zero, producing Inf/NaN that crashed the forward pass via the NaN
guard in loss().

Rewrite the function as standard inverted dropout performed in log
space: dropped entries go to `-inf` (exp = 0) and surviving entries
are shifted by `-log(1-p)` so the expected linear-space sum is
preserved. Hoist the helper to module scope so it is directly testable
and reject out-of-range probabilities explicitly.

Adds tests/test_dropout_regression.py covering p=0 identity, p=1
zeros-all, expected-sum preservation on realistic-scale weights,
approximate drop fraction, input validation, and an end-to-end smoke
test that training with dropout on realistic-scale weights no longer
poisons the loss.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


@MaxGhenis MaxGhenis left a comment

LGTM (cannot self-approve; posting as comment).

Correct inverted-dropout semantics in log space:

  • p=0 returns input unchanged (identity).
  • p=1 returns -inf tensor → exp = 0 everywhere (all dropped).
  • Masked entries go to -inf (not 0, which was the original bug).
  • Survivors shifted by -log(1-p) so E[exp(dropout(logw))] = exp(logw) in linear space, matching standard inverted dropout.
  • p validated to [0, 1] with a clear ValueError.
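
The expectation claim in the list above follows in one line. In the derivation below, the keep indicator $m$, the log-weight $\ell$, and the linear weight $w$ are notation introduced here for illustration; they do not appear in the PR.

```latex
\begin{aligned}
m &\sim \mathrm{Bernoulli}(1-p), \qquad w = \exp(\ell), \\
\exp\bigl(\mathrm{dropout}(\ell)\bigr)
  &= m \cdot \exp\bigl(\ell - \log(1-p)\bigr) = \frac{m\,w}{1-p}, \\
\mathbb{E}\bigl[\exp(\mathrm{dropout}(\ell))\bigr]
  &= \frac{(1-p)\,w}{1-p} = w, \qquad 0 \le p < 1.
\end{aligned}
```

The case $p = 1$ is excluded because every entry is dropped and the expectation is 0 rather than $w$, which is why it is handled as a separate branch.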

Tests cover identity, full dropout, approximate drop fraction, expected-sum preservation on realistic (hundreds-to-thousands) weight scales, input validation, and end-to-end reweight() with dropout_rate > 0. Good coverage of the original bug's exact failure mode (NaN/Inf on realistic scales).

Hoisting dropout_weights to module scope is also a small structural win — now directly unit-testable.

@MaxGhenis MaxGhenis merged commit d8a478f into main Apr 17, 2026
6 checks passed
@MaxGhenis MaxGhenis deleted the fix/dropout-log-space branch April 17, 2026 16:07