Reported calibration uncertainties are systematically too narrow

Two independent bugs that both bias calibration confidence intervals downward. Users who rely on `tmpf_var`, `tmpb_var`, `tmpw_var`, or MC percentile CIs for uncertainty quantification are getting overconfident estimates.

### 1. Wrong variance propagation for fixed parameters
When a user passes `fix_gamma=(value, var)`, `fix_alpha=(value, var)`, or `fix_dalpha=(value, var)` to single- or double-ended calibration, the helper functions inflate the WLS weights by adding the parameter variance contribution to `1/w`. The formula used is:

```python
w = 1 / (1 / w + fix_gamma[1] * X_gamma)
```

This is dimensionally and statistically wrong. The propagated variance from a fixed parameter `θ` with variance `σ²_θ` is `(∂y/∂θ)² · σ²_θ`, i.e. the **Jacobian column squared** times the variance, not the column itself times the variance. The correct form is:

```python
w = 1 / (1 / w + (X_gamma ** 2) * fix_gamma[1])
```

Affected lines (all use the wrong `coef · Var` form):
- `src/dtscalibration/calibrate_utils.py:127` — single-ended `fix_gamma`
- `:136` — single-ended `fix_alpha`
- `:152-163` — single-ended `fix_dalpha`
- `:634, :661` — double-ended `fix_gamma`
- `:778, :812` — double-ended `fix_alpha`
- `:984` — double-ended (further `fix_*` branch)

**Symptom:** whenever any parameter is fixed, the inflated measurement uncertainty caused by parameter uncertainty is underestimated, so reported `tmpf_var` / `tmpb_var` are too small. Existing tests for `fix_gamma` (e.g. `tests/test_dtscalibration.py:707`) only check recovered temperatures, never the propagated variance, so this has gone undetected.

### 2. `ddof` ignores number of fitted parameters in residual variance
`src/dtscalibration/variance_helpers.py:35` (constant) and `:129` (exponential) compute residual variance as:

```python
var_I = resid.var(ddof=1)
```

The number of fitted parameters in `variance_stokes_constant_helper` is `nt + nxs` (one scale per time step, one per spatial location); in `variance_stokes_exponential_helper` it is `n_sections + nt * n_sections`. The unbiased denominator should be `N − npar`, not `N − 1`. The author was aware — the comment at `variance_helpers.py:34` says \"originally thought it was npar\" — but the choice was deliberately set to 1.

**Symptom:** the estimated Stokes-noise variance is biased downward. Calibration weights are too large, and downstream MC confidence intervals are too narrow. The `ddof` variable computed at `:75` is even dead code: it is used only to size the statsmodels design matrix at `:89` and is never used for the variance itself in the scipy branch.

**Fix:** use `np.sum(resid ** 2) / (N - npar)` with the correct `npar` for each estimator.

### Suggested tests
- **Analytic variance recovery under `fix_gamma`**: synthesize data with known `gamma`, calibrate with `fix_gamma=(gamma_true, sigma_gamma**2)` for a non-zero `sigma_gamma**2`, then assert `tmpf_var` equals the analytical expression `(T**2 / gamma)**2 * sigma_gamma**2 + measurement_term` to `rtol=1e-12`. Will fail on current code and demonstrate the wrong exponent.
- **`ddof` bias check**: with synthetic IID Gaussian residuals of known variance `sigma**2`, assert that `variance_stokes_constant_helper` recovers `sigma**2` to within the expected sampling error of an *unbiased* estimator (`std-err ≈ sigma**2 * sqrt(2 / (N - npar))`). Tighten existing `decimal=1` tolerances in `tests/test_variance_stokes.py` to expose the bias.
- **Limiting case**: with `nt=2, nxs=2` and a small synthetic dataset, the `(N-1)` vs `(N-npar)` discrepancy is large enough to verify exactly against a hand-computed expected variance.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reported calibration uncertainties are systematically too narrow #231

1. Wrong variance propagation for fixed parameters

2. `ddof` ignores number of fitted parameters in residual variance

Suggested tests

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Reported calibration uncertainties are systematically too narrow #231

Description

1. Wrong variance propagation for fixed parameters

2. ddof ignores number of fitted parameters in residual variance

Suggested tests

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

2. `ddof` ignores number of fitted parameters in residual variance