Conversation
Codecov Report
❌ Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##           master     #157      +/-   ##
==========================================
- Coverage   36.61%   35.59%   -1.03%
==========================================
  Files          30       30
  Lines        4296     4557     +261
  Branches      815      854      +39
==========================================
+ Hits         1573     1622      +49
- Misses       2555     2755     +200
- Partials      168      180      +12
```

☔ View full report in Codecov by Sentry.
Pull request overview
This PR integrates a “multipoles analysis” workflow into Vega, adding support for r–μ binning, multipole compression, and covariance corrections (Hartlap/Percival), plus updating outputs/readers to carry the extra metadata needed to reproduce the analysis.
Changes:
- Add multipole (ℓ) support end-to-end: r–μ coordinates, Legendre bin averaging, P(k)→ξ transforms, and model/data/covariance conversions.
- Add global-covariance Hartlap scaling and a Percival correction factor (propagated into FITS bestfit covariance/errors).
- Update FITS output and postprocessing readers to include correlation sizes and multipole metadata.
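The multipole compression listed above can be illustrated with a short standalone sketch (the function name, grids, and shapes here are illustrative, not Vega's actual API): each multipole ξℓ(r) is the Legendre projection of ξ(r, μ), ξℓ(r) = (2ℓ+1)/2 ∫₋₁¹ ξ(r, μ) Pℓ(μ) dμ.

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.special import eval_legendre

def multipoles(xi_rmu, mu, ells=(0, 2, 4)):
    """Compress xi(r, mu) (shape [nr, nmu]) into Legendre multipoles.

    xi_ell(r) = (2*ell + 1) / 2 * integral_{-1}^{1} xi(r, mu) P_ell(mu) dmu,
    evaluated with a trapezoidal rule on the given mu grid.
    Illustrative sketch only, not Vega's implementation.
    """
    return {
        ell: (2 * ell + 1) / 2
        * trapezoid(xi_rmu * eval_legendre(ell, mu), x=mu, axis=1)
        for ell in ells
    }

# Sanity check: a pure-quadrupole input xi(r, mu) = P_2(mu) / r^2 should
# project onto (almost) only the ell = 2 multipole.
mu = np.linspace(-1, 1, 201)
r = np.linspace(10, 180, 18)
xi = np.outer(1 / r**2, eval_legendre(2, mu))
xi_ell = multipoles(xi, mu)
print(np.allclose(xi_ell[2], 1 / r**2, atol=1e-3))  # → True
```

Projecting a known single multipole and checking that the others vanish is a convenient unit test for any binning or compression code of this kind.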
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 8 comments.
Summary per file:
| File | Description |
|---|---|
| vega/vega_interface.py | Adds global Hartlap/Percival handling and global-cov multipole conversion. |
| vega/utils.py | Adds Legendre bin helpers and Percival correction function. |
| vega/templates/parameters.ini | Updates default BAO/AP parameter defaults (epsilon, aap). |
| vega/scale_parameters.py | Adds aiso_aap parametrisation and centralises allowed parametrisations. |
| vega/power_spectrum.py | Adds optional r–μ binning mode for binning kernel G(k). |
| vega/postprocess/fit_results.py | Extends correlation output metadata and adjusts legacy FITS reading. |
| vega/pktoxi.py | Adds optional μ-smoothing for ξℓ Legendre evaluation. |
| vega/parameters/latex_names.txt | Adds LaTeX name for aap and adjusts the aiso LaTeX name. |
| vega/parameters/default_values.txt | Adjusts epsilon range and adds aap defaults. |
| vega/output.py | Writes multipole-aware model grids, stores extra header metadata, applies Percival to bestfit cov/errors. |
| vega/model.py | Threads r–μ binning/multipole logic into model evaluation and post-distortion transforms. |
| vega/minimizer.py | Stores a p-value on the minimizer for output. |
| vega/metals.py | Refactors metals computation and adds interpolation support for r–μ binning. |
| vega/data.py | Adds r–μ coordinate mode, multipole compression, and (optional) Hartlap scaling on covariance. |
| vega/correlation_item.py | Plumbs multipole configuration into correlation items. |
| vega/coordinates.py | Introduces RMuCoordinates and consolidates common masking logic in base Coordinates. |
| vega/analysis.py | Updates global-mock generation to use model/data masks in the global-cov pathway. |
```python
else:
    G[i, i] = eye_array(self.data[name].full_data_size)
G = block_array(G, format='csr')
self.global_cov = G.dot(G.dot(self.global_cov).T).T
```
The global covariance transformation for multipoles is currently `G.dot(G.dot(C).T).T`, a non-obvious way of computing `G @ C @ G.T` that can be slower and is easier to get wrong. Prefer the explicit, standard form `self.global_cov = G @ self.global_cov @ G.T` (or `G.dot(self.global_cov).dot(G.T)`) for clarity and to avoid the extra transposes.
```diff
-self.global_cov = G.dot(G.dot(self.global_cov).T).T
+# Apply standard covariance transformation: C -> G C G^T
+self.global_cov = G.dot(self.global_cov).dot(G.T)
```
Not sure if scipy sparse arrays support this; needs testing.
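A quick standalone check of that concern (toy matrices, assuming SciPy ≥ 1.8 for `csr_array`): verify the transposed form against a dense reference for G C Gᵀ. Keeping the sparse operand on the left of every product also sidesteps the unreliable `ndarray.dot(sparse)` path, which may be why the PR uses the transposed form in the first place.

```python
import numpy as np
from scipy.sparse import csr_array

rng = np.random.default_rng(0)
n = 20

# Sparse-ish transformation matrix G (dense copy kept as a reference).
Gd = rng.standard_normal((n, n)) * (rng.random((n, n)) < 0.2)
G = csr_array(Gd)

# Symmetric matrix standing in for a covariance.
C = rng.standard_normal((n, n))
C = C @ C.T

transposed_form = G.dot(G.dot(C).T).T  # form currently in the PR
reference = Gd @ C @ Gd.T              # dense G C G^T

print(np.allclose(transposed_form, reference))  # → True
```

Algebraically `G.dot(G.dot(C).T).T` equals `(G (G C)ᵀ)ᵀ = (G Cᵀ Gᵀ)ᵀ = G C Gᵀ` for any C, so the two forms agree; the open question is only which sparse/dense call pattern is safest.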
```python
def percival_correction(nsamples, nbins, nparams):
    """Percival 2014 correction of the estimated parameter covariance.

    MNRAS, Volume 439, Issue 3, p.2531-2541
    """
    a = nsamples - nbins
    denom = (a - 1) * (a - 4)
    A, B = 2.0 / denom, (a - 2.0) / denom
    return (1.0 + B * (nbins - nparams)) / (1.0 + A + B * (nparams - 1))
```
`percival_correction()` can divide by zero or return nonsensical values when `nsamples - nbins` is ≤ 4, because `denom = (a - 1) * (a - 4)`. Since this is used alongside Hartlap (which only enforces `nsamples > nbins + 2`), consider validating `nsamples > nbins + 4` here and raising a clear `ValueError` when the correction is undefined.
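A minimal sketch of the guarded version suggested above; the function body is from the PR, while the `ValueError` check is the proposed addition:

```python
def percival_correction(nsamples, nbins, nparams):
    """Percival et al. 2014 (MNRAS 439, 2531) parameter-covariance correction.

    Guarded sketch: the denominator (a - 1) * (a - 4) with
    a = nsamples - nbins vanishes or changes sign when a <= 4,
    so reject those inputs explicitly.
    """
    a = nsamples - nbins
    if a <= 4:
        raise ValueError(
            'Percival correction undefined: need nsamples - nbins > 4, '
            f'got nsamples={nsamples}, nbins={nbins}.')
    denom = (a - 1) * (a - 4)
    A, B = 2.0 / denom, (a - 2.0) / denom
    return (1.0 + B * (nbins - nparams)) / (1.0 + A + B * (nparams - 1))

# The correction tends to 1 from above when nsamples >> nbins.
print(f'{percival_correction(10000, 100, 10):.4f}')
```

With the guard in place, a mock set that is too small fails loudly instead of silently inflating or flipping the sign of the parameter covariance.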
```python
mult_matrix = np.zeros((n_out, nr * nmu))
for i, x in enumerate(np.nonzero(data_mask_ell)[0]):
    mult_matrix[mell[i], self.model_mask] = M[x, self.data_mask]
```
In _convert_to_multipoles(), the distortion-matrix remapping assumes self.data_mask and self.model_mask select the same number of elements (it assigns M[x, self.data_mask] into mult_matrix[..., self.model_mask]). If these masks differ, this will raise a shape error or silently mis-map bins. Add an explicit check that the masked sizes match (and/or build an index mapping between model/data coordinates) before doing this assignment.
```diff
-mult_matrix = np.zeros((n_out, nr * nmu))
-for i, x in enumerate(np.nonzero(data_mask_ell)[0]):
-    mult_matrix[mell[i], self.model_mask] = M[x, self.data_mask]
+# Build explicit index mappings between data/model coordinates and validate sizes
+data_mask_indices = np.flatnonzero(self.data_mask)
+model_mask_indices = np.flatnonzero(self.model_mask)
+if data_mask_indices.size != model_mask_indices.size:
+    raise ValueError(
+        "In _convert_to_multipoles, data_mask and model_mask select different "
+        f"numbers of elements (data: {data_mask_indices.size}, "
+        f"model: {model_mask_indices.size}). "
+        "Cannot safely remap distortion matrix."
+    )
+data_ell_indices = np.flatnonzero(data_mask_ell)
+if data_ell_indices.size != mell.size:
+    raise ValueError(
+        "In _convert_to_multipoles, data and model ell masks select different "
+        f"numbers of elements (data_ell: {data_ell_indices.size}, "
+        f"model_ell: {mell.size}). "
+        "Cannot safely remap distortion matrix."
+    )
+mult_matrix = np.zeros((n_out, nr * nmu))
+for row_idx, x in zip(mell, data_ell_indices):
+    mult_matrix[row_idx, model_mask_indices] = M[x, data_mask_indices]
```
```python
ndata: int
is_multipoles: bool
nell: int
```
CorrelationOutput now has new required fields (ndata, is_multipoles, nell) but not all constructor call sites were updated. In this file, read_correlations() still instantiates CorrelationOutput(...) without these args, which will raise TypeError at runtime. Either give these new fields defaults in the dataclass, or pass them everywhere CorrelationOutput is constructed (including the non-legacy reader).
```diff
-ndata: int
-is_multipoles: bool
-nell: int
+ndata: int = 0
+is_multipoles: bool = False
+nell: int = 0
```
```diff
 # TODO The corr items need to have an imposed order
-masked_fiducial = []
-for name, data in self._data.items():
-    mask = data.dist_model_coordinates.get_mask_to_other(data.data_coordinates)
-    if data.data_mask.size == fiducial_model[name].size:
-        masked_fiducial.append(fiducial_model[name])
-    elif mask.size == fiducial_model[name].size:
-        masked_fiducial.append(fiducial_model[name][mask])
-    else:
-        raise ValueError('Input fiducial has unknown size. '
-                         'It must match the data or the model.')
-masked_fiducial = np.concatenate(masked_fiducial)
+full_fiducial_model = np.concatenate([fiducial_model[name] for name in self._data])
+masked_fiducial = full_fiducial_model[full_model_mask]
+# Naim comment: unsure about the purpose of the below code
+# masked_fiducial = []
+# for name, data in self._data.items():
+#     mask = data.dist_model_coordinates.get_mask_to_other(data.data_coordinates)
+#     if data.data_mask.size == fiducial_model[name].size:
+#         masked_fiducial.append(fiducial_model[name])
+#     elif mask.size == fiducial_model[name].size:
+#         masked_fiducial.append(fiducial_model[name][mask])
+#     else:
+#         raise ValueError('Input fiducial has unknown size. '
+#                          'It must match the data or the model.')
+# masked_fiducial = np.concatenate(masked_fiducial)
```
```diff
 if forecast:
-    self.current_mc_mock = masked_fiducial[full_data_mask]
+    self.current_mc_mock = masked_fiducial
+    # self.current_mc_mock = masked_fiducial[full_data_mask]
 else:
     ran_vec = np.random.randn(full_data_mask.sum())
-    self.current_mc_mock = masked_fiducial[full_data_mask] + self._cholesky_global_cov.dot(
+    # self.current_mc_mock = masked_fiducial[full_data_mask] + self._cholesky_global_cov.dot(
+    self.current_mc_mock = masked_fiducial + self._cholesky_global_cov.dot(
         ran_vec)
```
masked_fiducial is built using full_model_mask, but the global covariance Cholesky and noise vector are built using full_data_mask. If model_mask and data_mask differ (e.g., due to distortion/model coordinate differences), this leads to shape mismatches when adding the noise term. Use full_data_mask consistently for the masked mean, or explicitly map the fiducial model from model-space to data-space before applying the global covariance.
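A toy illustration of the mismatch described above (the sizes and masks here are hypothetical, not Vega's): a fiducial mean masked in model space cannot be combined with a noise vector drawn for the data-space mask when the two masks keep a different number of bins.

```python
import numpy as np

# Hypothetical masks over a 10-bin grid: the model keeps 6 bins,
# the data keeps only 5 (e.g. after distortion/coordinate differences).
full_model_mask = np.array([True] * 6 + [False] * 4)
full_data_mask = np.array([True] * 5 + [False] * 5)

fiducial = np.arange(10, dtype=float)
masked_fiducial = fiducial[full_model_mask]   # shape (6,), model space
noise = np.zeros(full_data_mask.sum())        # shape (5,), data space

try:
    mock = masked_fiducial + noise            # (6,) + (5,) cannot broadcast
except ValueError:
    print('shape mismatch between model-masked mean and data-space noise')
```

This is why the mean should either be masked with `full_data_mask` as well, or explicitly mapped from model space to data space before adding the correlated noise.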
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This PR is a version of the multipoles analysis code. I resolved some heavy merge conflicts, and I can reproduce my analysis with this code. However, the baseline analysis should still be checked. One notable shortcoming: it is not currently possible to marginalize small scales and apply multipole compression at the same time.