
0.6.0 Rewrite#453

Open
ejolly wants to merge 645 commits into cosanlab:master from ejolly:uv-cleanup

Conversation

@ejolly (Collaborator) commented Apr 15, 2026

No description provided.

@ejolly (Collaborator, Author) commented Apr 21, 2026

Force-pushed uv-cleanup after a git filter-repo pass to strip ~120 MB of
history-only junk (old notebook snapshots with embedded outputs, stale docs
builds, joblib caches, long-removed Python files). The tree at HEAD is
byte-identical to before the rewrite — the PR diff is unchanged
(403 files, +142246/-15578), only commit SHAs are rewritten.

Reason: uv sync from a git URL clones the full repo, so the bloat made
first-time installs of the 0.6 branch very slow (~180 MB clone, now ~50 MB).

Consequence for merging: because commit SHAs and ancestry changed, this PR
can no longer be merged with a fast-forward or merge commit — please
squash-merge (which is a good fit for a "0.6.0 Rewrite" PR anyway).

ejolly added 24 commits April 21, 2026 17:22
Reverts commits f8c5079 and 288173d which incorrectly deleted
~30,000 lines of code including CI workflows, integration tests,
pipelines, and tutorials.
Changed sigma estimation from np.std(res, ddof=p) to np.sqrt(RSS/df)
where RSS = sum(residuals²) and df = n - p.

The old formula subtracted the mean of the residuals before computing the
standard deviation, which biases sigma for intercept-free regression models
where E[res] ≠ 0.

Fixes: cosanlab#287
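A minimal numpy sketch of the difference between the two estimators, assuming a simple intercept-free OLS fit (names here are illustrative, not nltools code):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 1
X = rng.normal(loc=2.0, size=(n, p))             # intercept-free design
y = X @ np.array([1.5]) + rng.normal(scale=0.5, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
res = y - X @ beta_hat                           # E[res] != 0 without an intercept

old_sigma = np.std(res, ddof=p)                  # subtracts mean(res) first: biased here
new_sigma = np.sqrt(np.sum(res**2) / (n - p))    # sqrt(RSS / df), no mean subtraction
```

Because `np.std` centers the residuals before squaring, it can only understate RSS; `new_sigma >= old_sigma` always holds, with equality only when the residuals happen to be mean-zero.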
…anlab#182, cosanlab#177)

Added comprehensive documentation for prediction/classification algorithms:
- PredictTerminal: canonical docs with all algorithms and kwargs
- MultiSubjectPipeline.predict(): full algorithm list with examples
- BrainDataPipeline.predict(): algorithm list with examples

Documented class_weight='balanced' for handling imbalanced classes
in SVM and logistic regression.

Fixes: cosanlab#182, cosanlab#177
…lab#315)

Added 'upper'/'lower' tail options to permutation test functions to ensure
consistent direction across multiple tests when using FDR correction.

Options:
- 'two' or 2: Two-tailed test (default)
- 'upper' or 1: One-tailed upper (H1: stat > 0)
- 'lower' or -1: One-tailed lower (H1: stat < 0)

The old tail=1 behavior auto-detected direction per-test based on sign,
which breaks FDR correction semantics. Use 'upper' or 'lower' for MCP.

Updated functions:
- _compute_pvalue() in inference/utils.py
- validate_tail_parameter() in _validation.py
- one_sample_permutation_test()
- two_sample_permutation_test()
- correlation_permutation_test()
- matrix_permutation_test()

Fixes: cosanlab#315
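The tail semantics above can be sketched as follows (a hypothetical stand-in, not nltools' actual `_compute_pvalue`): the direction is fixed by the caller and never auto-detected from the observed statistic's sign.

```python
import numpy as np

def permutation_pvalue(stat, null_dist, tail="two"):
    # Direction comes from the caller, so p-values across a family of
    # tests share one alternative hypothesis and FDR stays coherent.
    null_dist = np.asarray(null_dist)
    n = null_dist.size
    if tail in ("two", 2):
        return (np.sum(np.abs(null_dist) >= abs(stat)) + 1) / (n + 1)
    if tail in ("upper", 1):
        return (np.sum(null_dist >= stat) + 1) / (n + 1)
    if tail in ("lower", -1):
        return (np.sum(null_dist <= stat) + 1) / (n + 1)
    raise ValueError(f"tail must be 'two', 'upper', or 'lower', got {tail!r}")

null = np.arange(-100, 101)                          # symmetric toy null
p_upper = permutation_pvalue(90, null, tail="upper")  # small: stat in upper tail
p_lower = permutation_pvalue(90, null, tail="lower")  # large
```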
- Use `int | None` syntax for optional parameters in cv() and reduce() methods
- Replace lowercase `callable` with `Callable` from collections.abc
- Fix *args position in MDS constructor call (must come before kwargs)
- Add type ignore comments for intentional sklearn signature overrides
- Configure ty rules to ignore false positives from optional imports
Comprehensive audit covering 8 focus areas:
- API consistency (imperative shell & functional core)
- Docstring completeness (99.4% presence, 248 incomplete)
- Type annotations (41% fully typed)
- Anti-patterns (64 long methods, 27 god methods)
- Test coverage (79.8% method coverage)
- Deprecation status (5 active, 8 candidates)
- Consolidation opportunities (~145 lines savable)

Deliverables:
- 8 detailed audit reports (01-08_*.md)
- Executive summary (audit_summary.md)
- Critical findings for v0.6.0 (critical_findings.md)
- API standardization plan (api_standardization_plan.md)
- Prioritized backlog of 78 issues (improvement_backlog.md)
ejolly added 30 commits May 5, 2026 22:38
…coords

Two pre-existing test failures on uv-cleanup, both unrelated to ridge
work but found while running the v0.6 regression sweep:

1. .predict_multi() was removed in v0.6 but no migration stub was left
   on BrainData, so calls AttributeError'd instead of raising the
   documented NotImplementedError. Mirror the .regress() pattern: keep
   a stub method that points at the future Model class (per migration
   guide). Drop the unrelated Simulator scaffolding from the guard test
   — it was creating data.nii.gz/y.csv in cwd as a side effect.

2. .plot(method="slices") forced the MNI-shaped DEFAULT_SLICE_CUT_COORDS
   (z: -40..68 by 9) regardless of the data's actual extent. nilearn
   0.12 rejects out-of-bounds cut_coords with an opaque ValueError, so
   plotting a small synthetic / native-space single-slice map blew up
   the moment user code didn't override cut_coords. Fix by bounds-
   trimming our defaults to the image's world-coord bounds via the
   image affine; if nothing remains, pass cut_coords=None to let nilearn
   auto-pick. User-supplied cut_coords still pass through untouched so
   debugging stays explicit.

Both fixes verified across the ridge + braindata sweep (102 passed,
no failures).
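The bounds-trimming in fix 2 can be sketched like this (an assumed helper, not nltools' actual code): compute the image's world-coordinate z-extent from the eight voxel-grid corners via the affine, keep only in-bounds cuts, and fall back to None so nilearn auto-picks when nothing survives.

```python
import numpy as np

def trim_cut_coords(cut_coords, affine, shape):
    # World-space z-extent from the 8 corners of the voxel grid.
    corners = np.array([[i, j, k, 1.0]
                        for i in (0, shape[0] - 1)
                        for j in (0, shape[1] - 1)
                        for k in (0, shape[2] - 1)])
    world_z = (corners @ np.asarray(affine).T)[:, 2]
    lo, hi = world_z.min(), world_z.max()
    kept = [z for z in cut_coords if lo <= z <= hi]
    return kept or None          # None lets nilearn auto-pick

defaults = list(range(-40, 69, 9))     # the MNI-shaped default z cuts
small = trim_cut_coords(defaults, np.eye(4), (10, 10, 10))
```

For a small identity-affine image spanning z in [0, 9], only the z=5 cut survives; a fully out-of-bounds request returns None.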
Encodes the design in SPEC.md as module skeletons + tests. No working
implementation yet — every parallel op raises NotImplementedError; only
structural helpers (run-id generation, cache-dir resolution, _DesignContext
__getitem__, _clone) are real. Behavior tests are xfail(strict=True) so
they flip green automatically as impl lands.

Module layout (12 → 6, per spec):
  core.py       metadata coercion, mask resolution, run/step IDs
  io.py         from_bids/from_glob/from_paths/read, write, load/unload
  execution.py  _ItemTask, _DesignContext, _apply, HDF5 fit-bundle IO,
                BrainCollectionWorkerError, BUNDLE_SCHEMA_VERSION
  inference.py  group reductions, ttest/anova, ISC, perm tests, align
  __init__.py   BrainCollection facade
  pipeline.py   unchanged (legacy CV pipeline)

Tests split under nltools/tests/data/collection/ into construction,
indexing, execution, inference, predict, io. 120 pass, 22 xfail, 77 skip.

BREAKING: BrainCollection is being rewritten from scratch. Removes
FittedBrainCollection, BrainCollectionCVResult (CV results are now plain
BrainData), fit_glm/fit_from_events/fit_ridge (one .fit() instead),
to_stacked/from_stacked/to_list/to_tensor, axis= on reductions,
output=/save= on fit, checkpoint(), and the per-axis
_map_axis0/1/2 / _aggregate_axis0/1/2 handlers. Shape semantics drop
the 3D framing; mean()/std()/etc. collapse subjects only. Old test
files for the removed APIs are deleted.
`DesignMatrix.convolve()` now renames every convolved column to
`<col>_c{i}` regardless of kernel rank and drops the source column.
Previously, 1-D kernels replaced columns in place (kept the original
name) while 2-D kernels suffixed `_c0`, `_c1`, … — callers couldn't
look up convolved columns deterministically.

Also fixes a metadata-drift bug: `dm.convolved` now records the
post-suffix column names (the names that exist in the dataframe),
so vertical `.append()`'s rename map keeps metadata in sync. Before,
`.convolved` carried pre-suffix names that didn't exist in the
2-D-kernel output, and the rename silently skipped them.

Touches `nltools.datasets.load_haxby_example` (signal-injection loop
referenced un-suffixed names — a no-op skip after the change) and
its tests; updates docstrings, basics/workflow tutorials, and the
v0.6.0 migration guide.

BREAKING: column lookups by trial-type name after a `.convolve()`
chain need `_c0` appended (notably `compute_contrasts("A - B")`
becomes `compute_contrasts("A_c0 - B_c0")`).
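The naming rule can be stated as a tiny pure function (a hypothetical sketch of the convention, not the real `convolve()` implementation): every convolved column maps to `<col>_c{i}` and the source name is dropped, whether the kernel carries one basis function (1-D) or several (2-D columns).

```python
import numpy as np

def convolved_names(columns, kernel):
    # One output name per (column, basis function) pair; the unsuffixed
    # source column no longer survives, even for rank-1 kernels.
    kernel = np.asarray(kernel)
    n_basis = 1 if kernel.ndim == 1 else kernel.shape[1]
    return [f"{col}_c{i}" for col in columns for i in range(n_basis)]

convolved_names(["face", "house"], np.ones(20))
# -> ['face_c0', 'house_c0']
convolved_names(["face"], np.ones((20, 3)))
# -> ['face_c0', 'face_c1', 'face_c2']
```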
Closes the v0.5.x foot-gun where users had to round-trip through pandas
(``DesignMatrix(pd.concat([dm.to_pandas(), confounds_df], axis=1), …)``)
and then manually re-assert ``dm.convolved = […]`` because the constructor
silently dropped metadata.

* ``DesignMatrix(other_dm)`` now returns a copy carrying ``data``,
  ``sampling_freq``, ``convolved``, ``confounds``, and ``multi``. Explicit
  kwargs override inherited values. Previously raised ``TypeError``.
* ``dm.convolved`` / ``dm.confounds`` are read-only properties; assignment
  raises ``AttributeError`` pointing to the canonical paths (``convolved=``
  / ``confounds=`` constructor kwargs, or ``.append(other, axis=1)`` which
  auto-marks raw DataFrames as confounds and merges metadata).
* ``_coerce_horizontal_input`` now passes ``confounds=`` via the
  constructor instead of mutating after construction.
* Test fixtures updated to use constructor kwargs.
* Tutorial ``02_design_matrix.md`` gains a "Mixing Task Regressors with
  External Confounds" section showing the canonical pattern end-to-end.
* Tutorial ``01_glm.md`` updated for the ``_c0`` suffix on contrast
  strings (collateral from 942c9b3) so the docs build executes cleanly.
* Migration guide gets two new sections (read-only metadata,
  copy-constructor) plus checklist entries.

BREAKING: ``dm.convolved = […]`` and ``dm.confounds = […]`` now raise
``AttributeError``. Pass via the ``convolved=`` / ``confounds=`` constructor
kwargs, or use ``.append(other, axis=1)`` (raw DataFrames are auto-marked
as confounds; ``as_confounds=True`` promotes a DesignMatrix's columns).
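The copy-constructor plus read-only-metadata pattern can be sketched with a minimal self-contained class (an illustrative stand-in, not the real DesignMatrix):

```python
class DM:
    def __init__(self, data, convolved=None):
        if isinstance(data, DM):                        # copy constructor:
            convolved = (data.convolved if convolved is None
                         else convolved)                # explicit kwarg overrides
            data = data._data
        self._data = data
        self._convolved = tuple(convolved or ())

    @property
    def convolved(self):
        # Read-only: no setter defined, so `dm.convolved = [...]`
        # raises AttributeError, as in the commit above.
        return self._convolved

dm = DM([1, 2, 3], convolved=["face_c0"])
copy = DM(dm)                                 # inherits metadata
override = DM(dm, convolved=["house_c0"])     # explicit kwarg wins
```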
Drop unused *args from BrainData.decompose and analysis.decompose, add `*`
keyword-only marker, and document **kwargs as the sklearn-estimator
passthrough it already was. Also fix stale docstring example in
plotting/decomposition.py (Brain_data → BrainData, algorithm= → method=).

BREAKING: BrainData.decompose and analysis.decompose are now keyword-only
after the data arg. Positional calls beyond `self`/`bd` (which were
silently swallowed by the old `*args`) now raise TypeError.
The general algorithm → method rename row covered decompose semantically,
but readers grepping for `decompose` would miss the rename. Add explicit
mentions in five surfaces — methods/alternatives table, kwarg-rename row's
Scope column, OLD/NEW example pair, per-component change table,
backward-compat status table, and must-fix checklist — so the rename is
discoverable from any entry point a reader might use.
Add hrf_model='glover' kwarg to DesignMatrix.__init__. When the
constructor loads a BIDS events file, it now auto-convolves the
regressors with the canonical Glover HRF, matching nilearn's
make_first_level_design_matrix default. Output columns get the
canonical _c0 suffix and .convolved is populated.

Pass hrf_model=None to opt into raw boxcar regressors — useful for
PPI / FIR designs / pedagogical material that introduces convolution
as a separate step.

Implementation routes through the existing convolve() helper after
events_to_dm produces the boxcar, keeping the _c0 suffix invariant
(942c9b3) as the single source of truth. The events_to_dm helper
itself stays boxcar-only — callers that want convolution drive it
through the constructor or call .convolve() explicitly.

Tests cover default-convolved, hrf_model=None opt-out, unknown-model
rejection, and tabular-file ignore. Migration guide updated in four
places (rename table, full example block with both default and
opt-out, file-paths section, must-fix checklist).

BREAKING: DesignMatrix(events_path, run_length=N, TR=t) previously
returned boxcar regressors; now returns HRF-convolved by default.
Callers that relied on the boxcar output (e.g., to build interaction
terms before convolution) need to add hrf_model=None.
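The boxcar-to-convolved step can be sketched in plain numpy with a rough double-gamma kernel (an illustrative approximation only; nilearn's actual 'glover' kernel uses a different exact parameterization):

```python
import math
import numpy as np

def double_gamma_hrf(tr, duration=32.0):
    # Crude double-gamma shape: positive peak minus a scaled undershoot,
    # normalized to unit sum so convolution preserves regressor mass.
    t = np.arange(0.0, duration, tr)
    pdf = lambda x, a: x ** (a - 1) * np.exp(-x) / math.gamma(a)
    h = pdf(t, 6.0) - 0.35 * pdf(t, 12.0)
    return h / h.sum()

tr = 2.0
boxcar = np.zeros(60)
boxcar[5:10] = 1.0                                        # raw event regressor
face_c0 = np.convolve(boxcar, double_gamma_hrf(tr))[:60]  # what the constructor
                                                          # now produces by default
```

Passing `hrf_model=None` would keep `boxcar` as-is, which is the opt-out path described above.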
Add Polars-native column manipulation to DesignMatrix:

- DesignMatrix.with_columns(*exprs, **named_exprs) — chainable wrapper
  over pl.DataFrame.with_columns. Returns a new DM with metadata
  preserved; new columns are NOT auto-tagged as convolved or confounds
  (they're user data). Named-kwarg values that aren't pl.Expr/pl.Series
  are coerced to pl.Series (lists/ndarrays) or pl.lit (scalars) so the
  call site doesn't have to wrap every input.

- __setitem__ accepts pl.Expr in addition to scalar/list/ndarray/Series.
  Lets users write `dm['xy'] = pl.col('x') * pl.col('y')` — the
  procedural alternative to with_columns for code that prefers in-place
  assignment style.

Both close real interop gaps that previously forced users out of the
DesignMatrix surface and into raw .data manipulation: combining
multiple regressors needed pl.sum_horizontal in a with_columns call,
and elementwise interaction terms needed Expr * Expr.

Tests cover named-kwarg with Expr/Series/array/list/scalar, replacing
existing columns, immutability of the source DM, metadata preservation,
no auto-tagging of new columns, and chaining with .drop().convolve().
find_spikes (and BrainData.find_spikes) now return a DesignMatrix with
spike indicator columns pre-marked as confounds, rather than a raw
pl.DataFrame. Removes two pieces of friction in the GLM workflow:

1. Callers no longer have to wrap the result with
   DesignMatrix(spikes.iloc[:, 1:], sampling_freq=1/tr) before
   appending it to a design matrix.

2. The legacy 'TR' column (1-indexed timestamp surrogate from the
   pandas era) is dropped — row position is the time axis in the
   Polars-backed DM, so the index column is redundant.

Add TR= and sampling_freq= keyword-only args to set the returned DM's
sampling rate. Pass exactly one (raises ValueError otherwise). Without
either, the DM has sampling_freq=None — appending it to a DM that does
have sampling_freq still works.

Tests use a tiny synthetic 4D nibabel image with two injected spikes
(no Simulator dependency) so the new coverage runs in milliseconds.
The existing slow Simulator-based test is updated to assert the new
return type.

BREAKING: Code that did `find_spikes(data).iloc[:, 1:]` or
`spikes.drop('TR', axis=1)` will break — drop the index manipulation
and use the returned DesignMatrix directly via .append(spikes, axis=1).
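The exactly-one keyword contract can be sketched as a small validation helper (hypothetical, not nltools' actual code): pass exactly one of `TR=` / `sampling_freq=`, or neither, in which case the returned DM has `sampling_freq=None`.

```python
def resolve_sampling_freq(TR=None, sampling_freq=None):
    # Both given: ambiguous, refuse. Neither given: sampling rate unknown,
    # which is still appendable to a DM that does have one.
    if TR is not None and sampling_freq is not None:
        raise ValueError("pass exactly one of TR= or sampling_freq=, not both")
    if TR is not None:
        return 1.0 / TR
    return sampling_freq
```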
Add a "Worked example: PPI design" subsection to the file-paths
section showing the full end-to-end PPI flow using the new v0.6.0
idioms together: hrf_model=None opt-out, with_columns + pl.Expr
for combining motor variants and computing the interaction term,
mixed-input .append() for the seed timeseries + raw confound
DataFrames + the find_spikes-returned DesignMatrix.

The example reads close to the math: motor = sum_horizontal(motor
variants), vmpfc_motor = pl.col('vmpfc') * pl.col('motor_c0'),
and the metadata stays consistent across the chain (regressors of
interest stay out of .confounds without manual tagging).
Adds nltools.templates.list_resources(prefix=None) — companion to the
existing fetch_resource(relpath) helper. Hits the HF API once per session
(lru_cache keyed on repo_id + revision) and returns the sorted list of
files in the nltools/niftis dataset, optionally filtered by prefix.

Closes the discoverability gap that previously forced users to either
visit the HF web UI or memorize relpath strings. Together with
fetch_resource, these are the canonical entry points for loading
canonical brain images:

  >>> list_resources(prefix='masks/')
  ['masks/desikan_killiany_mni152nlin6_1mm.nii.gz',
   'masks/k50_2mm.nii.gz', 'masks/shen_268_2mm.nii.gz', ...]
  >>> path = fetch_resource('masks/k50_2mm.nii.gz')
  >>> mask = BrainData(path)            # or pass directly to nilearn

Pyodide raises a clear RuntimeError pointing users at seed_resources
(huggingface_hub isn't available in the browser).

Tests verify the listing is sorted, includes known files (k50 plus the
4 newly uploaded atlases), filters correctly by prefix, returns empty
for no-match prefixes, and round-trips through fetch_resource.
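The once-per-session caching plus prefix filter can be sketched like this (the file list and the network stand-in are invented for illustration; the real helper would query huggingface_hub):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def _repo_files(repo_id, revision):
    # Stand-in for the single Hugging Face API hit. lru_cache keys on
    # (repo_id, revision), so repeated list_resources() calls in one
    # session touch the network exactly once.
    return ("templates/mni.nii.gz", "masks/k50_2mm.nii.gz",
            "masks/shen_268_2mm.nii.gz")

def list_resources(prefix=None, repo_id="nltools/niftis", revision="main"):
    # Sorted listing, optionally filtered by a path prefix.
    files = sorted(_repo_files(repo_id, revision))
    return [f for f in files if prefix is None or f.startswith(prefix)]
```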
Adds a "Loading canonical brain images" subsection to the brain-space
configuration area, pointing users at fetch_resource + list_resources
as the canonical entry points for atlases, parcellations, ROI masks,
and templates from the nltools/niftis HF dataset.

Calls out the v0.5.1-era BrainData('https://...').to_nifti() round-trip
as an antipattern when the goal is just to feed a remote NIfTI to
nilearn — the path returned by fetch_resource drops straight into
nilearn helpers, nibabel.load, and BrainData(path) without conversion.

Notes the Pyodide caveat (list_resources requires huggingface_hub;
browser code should pre-seed via seed_resources).
…rough

Restored `BrainData.iplot()` after its removal during the v0.6.0 refactor.
The new viewer is `anywidget`-backed (replaces the old ipywidgets
implementation) and renders identically in Jupyter, marimo, VS Code, and
Jupyter Book v2 / mystmd built sites.

For 4D BrainData, the viewer auto-grows a volume-stepping slider
alongside the threshold slider. For static-doc rendering (mystmd) we
ship a `text/html` mimebundle fallback that pre-renders all volumes and
embeds a JS slider that swaps the iframe srcdoc — preserving 4D step-
through without a Python kernel in the loop.

BREAKING: `iplot(surface=True, anatomical=...)` → `iplot(view='surface',
bg_img=...)`. The `view=` kwarg is canonical with `view='ortho'`
(default). `bg_img=` matches nilearn naming. The `nltools[interactive_plots]`
extra is gone — `anywidget` is now a hard dep.
Adds a threshold panel to BrainData.iplot() with two toggleable axes:

- Value ↔ Percentile — slider units for the threshold
- Symmetric ↔ Independent — single |x| cutoff vs separate
  negative/positive cutoffs

Independent mode is implemented by Python-side masking of voxels in
(lower, upper) before passing threshold=0 to nilearn — nilearn's
view_img only supports a symmetric threshold, so we own the asymmetric
case. A 101-entry percentile lookup table is precomputed once at widget
init and synced to the client, so the Value↔Percentile toggle and
slider position updates happen entirely in JS without a Python
round-trip.
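The Python-side masking for Independent mode, and the percentile table, can be sketched as (assumed helpers, not the widget's actual code):

```python
import numpy as np

def independent_threshold(data, lower, upper):
    # Zero voxels strictly between the negative and positive cutoffs, then
    # hand nilearn threshold=0: view_img only supports a symmetric cutoff,
    # so the asymmetric case is owned here.
    out = np.asarray(data, dtype=float).copy()
    out[(out > lower) & (out < upper)] = 0.0
    return out

vals = np.array([-3.0, -1.0, 0.0, 2.0, 5.0])
masked = independent_threshold(vals, lower=-2.0, upper=3.0)
# -> array([-3., 0., 0., 0., 5.])

# A 101-entry Value<->Percentile lookup, computed once at widget init and
# synced to the client so the toggle needs no Python round-trip:
pct_table = np.percentile(np.abs(vals), np.arange(101))
```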

Also fixes three issues in the prior anywidget viewer:

1. model.save_changes() was being called before the debounced
   model.set() had run, so drag-end values never actually synced to
   Python. Now we re-render on the slider's "change" (mouse-up) event
   instead of debounced "input", and save_changes() is called inline
   with set().
2. _render_one silently clamped threshold to the per-volume vmax and
   floored it to 1e-6 — the slider lied about what was being rendered.
   Pass the user's value through unchanged.
3. Re-rendering on every drag pixel made dragging through a 100-step
   range stutter. Now drag-time updates only the JS readout; one
   nilearn render fires on release.

iplot() gains lower=, upper=, mode=, units= kwargs. threshold= kept
as a backward-compat alias for upper= in symmetric mode.

BREAKING: BrainViewerWidget.threshold / threshold_min / threshold_max /
threshold_step traitlets are gone — replaced by lower / upper /
vmax_abs / data_min / data_max / pct_table_*. Direct widget users
need to update; iplot() callers passing threshold= are unaffected.
Picks up signature/docstring changes from prior commits that didn't
re-run docs-generate (find_spikes_data TR/sampling_freq kwargs,
DesignMatrix hrf_model kwarg, with_columns method, decompose
keyword-only marker, list_resources on templates, Brain_data →
BrainData wording fixes, etc.).
… dataclass

Resolve the dartbrains MVPA-chapter migration blocker (predict() return shape
collapsed to a single accuracy BrainData with no .coef_ surfaced) by giving
BrainData.predict() the same shape as bd.fit(): a frozen result dataclass
with model-specific fields, dispatched by `method=`, and full kwargs surface
that subsumes the old fluent API.

New `Predict` dataclass (nltools/data/fitresults/) mirrors `Fit`: frozen,
all fields default to None, populated based on dispatch path, with
available()/asdict() introspection. Holds predictions, scores, mean/std,
cv_folds, weight_map (mean linear-classifier coefs across folds, projected
to voxel space), fold_weight_maps, accuracy_map (searchlight/roi),
final_estimator and final_weight_map (refit=True), aligned naming with
nilearn / Fit conventions.

Rewritten BrainData.predict():
  - method='whole_brain'|'searchlight'|'roi' selects analysis shape
  - model='svm'|'logistic'|'lda'|'ridge_classifier' (classification);
    'ridge'|'lasso'|'svr' (regression). 'ridge' is regression-only;
    classification users want 'ridge_classifier'. Pass any sklearn
    estimator/Pipeline directly for full flexibility.
  - scoring='auto' resolves via is_classifier(model) to 'accuracy' / 'r2'
  - standardize=True + reduce='pca'/n_components= per-fold preprocessing,
    weight_map back-projected through PCA to voxel space
  - refit=True also fits on full data and stores final_estimator +
    final_weight_map (for publishing a single weight map)
  - inplace=True populates result fields as attributes on self
  - n_jobs=1 default (searchlight on real brain at higher n_jobs is
    memory-heavy)
  - Linear models extract weight_map via .coef_; non-linear models
    (kernel SVM, RF) emit a UserWarning and leave weight_map=None

Removed BrainData fluent API:
  - Deleted nltools/data/braindata/pipeline.py (BrainDataPipeline,
    BrainDataCVResult)
  - Deleted BrainData.cv() method and its modeling.py helper
  - cv()/normalize()/reduce()/pipe() steps all collapse to kwargs;
    custom preprocessing chains use model=make_pipeline(...) which
    is the standard sklearn idiom
  - MultiSubjectPipeline (used by BrainCollection) is unchanged —
    multi-subject hyperalignment / SRM is where chaining still
    earns its keep

Timeseries prediction path (bd.predict(X=...) on a fitted encoding model)
keeps returning BrainData — the natural container for a voxel timeseries,
composes directly with .plot()/.standardize().

Tests: 17 in test_braindata_prediction.py (whole_brain × searchlight ×
ROI × classification/regression, model passthrough, scoring='auto', PCA
back-projection, refit, inplace, non-linear warning) + 13 in
test_predict_results.py (Predict dataclass contract). Existing fit→predict
modeling tests untouched.

Migration guide updated with new return-type semantics, model= rename,
ridge_classifier disambiguation, fluent-removal note, and a worked
sklearn-Pipeline-as-model example. dartbrains_triage.md closes the
"NEW issue" entry and unblocks the deferred Multivariate_Prediction.py
chapter.

BREAKING:
- BrainData.predict(y=) returns Predict dataclass instead of BrainData;
  result['weight_map'] → result.weight_map.
- estimator= renamed to model= (mirrors bd.fit(model=)).
- 'ridge' shortcut now means Ridge regressor; for RidgeClassifier use
  'ridge_classifier'.
- scoring='auto' default replaces hardcoded 'accuracy'.
- BrainData.cv() and the .cv().normalize().reduce().pipe().predict()
  fluent chain removed; pass cv=, standardize=, reduce='pca', and/or
  model=Pipeline(...) on bd.predict() instead.
- n_jobs default 1 (was -1) for searchlight/roi.
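The `scoring='auto'` rule above can be sketched with plain sklearn (a hypothetical helper mirroring the described dispatch, not the actual nltools code):

```python
from sklearn.base import is_classifier
from sklearn.linear_model import LogisticRegression, Ridge

def resolve_scoring(model, scoring="auto"):
    # Explicit scoring passes through untouched; 'auto' resolves from the
    # estimator type: classifiers -> 'accuracy', regressors -> 'r2'.
    if scoring != "auto":
        return scoring
    return "accuracy" if is_classifier(model) else "r2"
```

This is also why `'ridge'` is regression-only: `Ridge` resolves to `'r2'`, while `RidgeClassifier` would resolve to `'accuracy'`.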
…coding tutorial

Make BrainData.predict() match bd.fit()'s philosophy: spatial outputs are
BrainData objects so users work in brain space directly. result.weight_map.plot()
just works; .data drops down to numpy. Adds the missing per-fold/per-ROI info
the ROI runner was discarding, and a new Haxby-based decoding tutorial wired
into the docs TOC.

Predict dataclass:
- Spatial fields (weight_map, fold_weight_maps, final_weight_map, accuracy_map)
  are BrainData wrapped on the input bd.mask, not raw ndarrays.
- New roi_labels field — atlas integer IDs in the same order as score arrays.
- scores/mean_score/std_score repurposed for ROI dispatch with array shapes:
  (n_folds, n_rois) / (n_rois,) / (n_rois,). Whole-brain shapes unchanged.

Runners:
- _run_whole_brain wraps weight maps via _to_braindata(arr, bd.mask).
- _run_roi rewritten to capture per-fold scores per parcel (was previously
  discarding fold-level info), populate roi_labels, and wrap accuracy_map.
- _run_searchlight wraps accuracy_map as BrainData.
- New _resolve_standardize_for_model: when model is a sklearn Pipeline,
  auto-defaults standardize=False with a UserWarning to avoid silently
  wrapping another StandardScaler. Override with standardize=True explicitly.
- inplace=True now attaches with predict_ prefix (bd.predict_weight_map,
  bd.predict_accuracy_map, …) — mirrors fit()'s glm_*/ridge_* naming.

Tutorial (docs/tutorials/workflows/05_decoding.{py,md}):
- Replaces stale raw-sklearn version with marimo + MyST notebook walking
  through whole-brain decoding (weight map interpretation, fold stability,
  refit weights) and ROI decoding with the bundled k200 atlas (per-region
  accuracy, top-N ranking via roi_labels). Pipeline-as-model aside.
- Wired into docs/myst.yml TOC.

Tests (118 passing across affected modules):
- TestBrainDataWrapping: spatial fields are BrainData, share bd.mask.
- TestROIDispatch: repurposed score-field shapes, roi_labels ordering,
  whole-brain fields stay None on ROI dispatch.
- TestPipelineStandardizeDetect: auto-flip warning fires; explicit
  standardize=False stays silent.
- New ROI test in test_predict_results.py documenting field shapes.
- Existing inplace test updated for predict_ prefix.

BREAKING:
- Predict.weight_map / fold_weight_maps / final_weight_map / accuracy_map
  are BrainData, not ndarray. Numpy via .data.
- bd.predict(..., inplace=True) attaches with predict_ prefix
  (bd.predict_weight_map, etc.), not flat names.
- ROI dispatch repurposes scores/mean_score/std_score with array shapes
  (n_folds,n_rois)/(n_rois,) instead of leaving them None.
- model=Pipeline(...) auto-flips standardize to False (with warning);
  pass standardize=True explicitly to keep the old wrapping behavior.
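The `_resolve_standardize_for_model` auto-flip can be sketched as (a hypothetical helper following the described behavior):

```python
import warnings
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def resolve_standardize(model, standardize=None):
    # A Pipeline model likely carries its own scaler, so default to False
    # with a warning rather than silently wrapping a second StandardScaler.
    # An explicit standardize= always wins.
    if standardize is not None:
        return standardize
    if isinstance(model, Pipeline):
        warnings.warn("model is a sklearn Pipeline; defaulting standardize=False")
        return False
    return True

pipe = make_pipeline(StandardScaler(), SVC())
```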
… columns

The 01_glm tutorial broke at the contrast step (`Column 'language_c0' not
found`) because DesignMatrix(events_p, …) auto-convolves at construction
(matching nilearn's `make_first_level_design_matrix(hrf_model='glover')`
default), and the explicit `.convolve()` later in the chain blindly
appended `_c0` again, producing `language_c0_c0`. The contrast string
`"language_c0 - string_c0"` then didn't match anything in the design.

`_c0_c0` has no biological meaning — convolving an HRF-shaped signal with
another HRF produces a doubly-blurred thing that doesn't correspond to
any real neural or hemodynamic process. So `.convolve()` is now
structurally idempotent: the column-name space is well-defined, and a
column named `<col>_c{i}` always means "convolved exactly once".

- Default path (`columns=None`): convolves only experimental regressors
  not already in `dm.convolved`. If nothing is left to convolve, returns
  `dm` unchanged with a warning.
- Explicit `columns=` path: refuses any name in `dm.convolved` with a
  ValueError pointing at the only mathematically sensible workflow
  (drop the convolved column, restore the boxcar source, convolve fresh
  with the new kernel).
- Metadata (`dm.convolved`): pre-existing entries always survive in the
  dataframe and the metadata; new convolved names are appended.

Tests:
- test_convolve_is_idempotent_on_already_convolved
- test_convolve_refuses_explicit_already_convolved_column
- test_convolve_partial_with_new_event_column (preserves prior convolved
  columns when a fresh boxcar is added and re-convolved alongside)
Eliminate the parallel weight-map fields that exposed two competing
estimators of the same thing — the CV-mean of per-fold coefs and the
all-data refit — and the docs-driven 'final_weight_map is the right one
to publish' caveat that came with them.

The CV-mean of K per-fold coef vectors doesn't correspond to any actual
fitted estimator (each fold saw a different subset). The all-data refit
is a single legitimate model with all the information used. CV gives
the honest score; the refit gives the publishable map. Pick one and
stop forcing users to learn the difference.

- Predict.weight_map: now coef_ from one fit on the full (X, y).
  Always populated for linear models, no opt-in.
- Predict.estimator: renamed from final_estimator. The fitted sklearn
  object — call .predict() on new data, or read .coef_ for the same
  numbers that populate weight_map.
- Predict.final_weight_map: removed.
- predict(refit=...): kwarg removed. The +1 fit cost is negligible
  for linear models and the kwarg only existed because the field
  layout demanded a choice.
- fold_weight_maps: unchanged, still the per-fold stack for stability
  analysis. The CV-mean is one line away if anyone wants it
  (`fold_weight_maps.data.mean(axis=0)`).

Tutorial (05_decoding.{py,md}):
- Drop refit=True from the call.
- Replace the final_weight_map section with a brief 'apply trained
  model to new data' cell using result.estimator.
- Reframe the weight-map prose to call out the CV-vs-all-data-fit
  distinction honestly instead of glossing over it.
- API summary table updated.

Tests:
- TestRefit class replaced with TestAllDataRefit:
  - test_estimator_is_fitted_and_callable_on_new_data
  - test_weight_map_is_from_all_data_fit_not_cv_mean
    (regression: asserts weight_map.data == estimator.coef_)
  - test_pca_back_projection_works_for_weight_map
- test_classification_populates_expected_fields adjusted.
- test_predict_results.py: test_refit_fields → test_estimator_field.

BREAKING:
- Predict.final_weight_map field removed (folded into weight_map).
- Predict.final_estimator renamed to Predict.estimator.
- bd.predict(refit=...) kwarg removed; the all-data fit is always-on.
- bd.predict(inplace=True) now attaches bd.predict_estimator (was
  bd.predict_final_estimator) and no longer attaches predict_final_*.
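The contrast between the two estimators can be shown in plain sklearn (field names follow this commit message; the data and model are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
y = (X[:, 0] > 0).astype(int)

# Per-fold coef stack: kept for stability analysis, never averaged for you.
fold_weight_maps = np.stack([
    RidgeClassifier().fit(X[train], y[train]).coef_.ravel()
    for train, _ in KFold(n_splits=5).split(X)
])

estimator = RidgeClassifier().fit(X, y)   # the single all-data fit
weight_map = estimator.coef_.ravel()      # what Predict.weight_map now holds
cv_mean = fold_weight_maps.mean(axis=0)   # still one line away if wanted
```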
Non-overlapping ROI is structurally just whole_brain split across
disjoint feature subsets — same shape of output, just composed of
independently-trained pieces. The atlas is a label image so each voxel
belongs to exactly one parcel; per-parcel coef_ vectors slot back into
voxel space without conflict. There's no reason to leave weight_map /
fold_weight_maps as None on ROI dispatch when the data structure is
already there.

- Predict.weight_map (ROI): BrainData (1, n_voxels), per-parcel coef_
  from each parcel's all-data fit reassembled to voxel space. Voxels
  outside any parcel = NaN (matches accuracy_map convention).
- Predict.fold_weight_maps (ROI): BrainData (n_folds, n_voxels), same
  assembly per CV fold for stability analysis.
- Predict.estimator (ROI): dict[int, sklearn] keyed by atlas label.
  Each parcel's all-data fitted estimator — call .predict() on per-
  parcel new data, or read .coef_ for the same numbers in weight_map.
- Cross-parcel weight magnitudes live on different X distributions
  (different voxels seen) and are not directly comparable; within-
  parcel ranking is meaningful. Documented in the dataclass docstring,
  the BrainData.predict() docstring, and the tutorial.
- _extract_weight_map gained a quiet=True mode used by ROI's per-
  parcel loop so a non-linear-model call doesn't emit one warning per
  parcel. The runner aggregates a single warning at the call level if
  any parcel fails extraction (matches whole_brain's all-or-nothing
  contract: all weight fields collapse to None for the whole call).

Searchlight is intentionally not touched in this pass — the natural
single-number-per-voxel reduction (self-coefficient: weight the v-
centered model gave to v itself) is principled but partial, and
worth a separate design pass.

Tests:
- test_roi_populates_voxel_space_weight_map (replaces
  test_roi_leaves_whole_brain_fields_none)
- test_roi_weight_map_voxels_match_per_parcel_estimator_coef
  (regression: each voxel == its parcel estimator's matching coef)
- test_roi_non_linear_model_drops_weight_fields (all-or-nothing,
  single aggregate warning)

Tutorial (05_decoding.{py,md}):
- New "ROI weight map" cell with the cross-parcel-magnitude caveat
  called out explicitly.
- API summary table updated to list weight_map / fold_weight_maps /
  estimator under the ROI row.
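The voxel-space reassembly can be sketched as (a hypothetical helper, not the runner's actual code): each voxel belongs to exactly one parcel, so per-parcel `coef_` vectors slot in without conflict, and voxels outside any parcel stay NaN, matching the accuracy_map convention.

```python
import numpy as np

def assemble_roi_weight_map(atlas_labels, parcel_coefs):
    # atlas_labels: integer label per voxel (0 = outside every parcel).
    # parcel_coefs: dict of atlas label -> that parcel's coef_ vector.
    atlas_labels = np.asarray(atlas_labels)
    out = np.full(atlas_labels.shape, np.nan)
    for label, coefs in parcel_coefs.items():
        out[atlas_labels == label] = coefs
    return out

wmap = assemble_roi_weight_map(
    [0, 1, 1, 2, 2, 2],
    {1: [0.5, -0.5], 2: [1.0, 2.0, 3.0]},
)
# -> [nan, 0.5, -0.5, 1.0, 2.0, 3.0]
```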
k50 covers the same brain at coarser granularity — ventro-temporal cortex
still pops out cleanly for face/house decoding. Cuts the ROI runner cost
~4× (50 parcel-fits + refits vs 200) and trims the docs build pipeline
without changing what the tutorial demonstrates.
Four predict-rewrite follow-up commits (a3fe4da, 80c882b, 05b02ba,
c3b9f6e) materially changed what migrating Multivariate_Prediction.py
looks like. The original triage's bullets were written immediately after
0b953bf and are now partially outdated:

- weight_map / fold_weight_maps / accuracy_map are BrainData (not
  ndarray), so result.weight_map.plot() works directly — collapses any
  cells that wrapped them in BrainData(..., mask=...).plot().
- refit kwarg gone, final_weight_map field gone — result.weight_map is
  the all-data fit by default. Migration is one fewer knob.
- standardize=False is auto-detected for Pipeline models. Course doesn't
  need to remember to pass it.
- ROI dispatch now produces voxel-space weight_map + per-parcel
  estimator dict. Whole new "where" analysis available with one extra
  predict() call.
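The standardize auto-detection bullet can be sketched as a check on sklearn's `Pipeline.steps` (the helper name is hypothetical; the real detection logic may differ):

```python
# Sketch: if the user's model is a Pipeline that already scales, skip
# library-side standardization; otherwise standardize as before.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def needs_standardize(model) -> bool:
    steps = getattr(model, "steps", None)
    if steps is None:
        return True                      # bare estimator: standardize for it
    return not any(isinstance(est, StandardScaler) for _, est in steps)

assert needs_standardize(SVC()) is True
assert needs_standardize(make_pipeline(StandardScaler(), SVC())) is False
```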

Also points the dartbrains migration target shape at the new reference
tutorial (docs/tutorials/workflows/05_decoding.md), which exercises the
final API end-to-end on Haxby.

No course-side work done — this is just keeping the triage's "what to
do when we touch Multivariate_Prediction.py" guidance accurate.
The deferred chapter is now migrated against the post-rewrite predict API
(commits a3fe4da / 80c882b / 05b02ba / c3b9f6e). Crossed out with a
resolution note covering: 7 predict() call sites converted from
algorithm=/cv_dict={...} to model=/cv=GroupKFold + groups=, dict-key
result access switched to dataclass attributes, data.Y attribute dropped
in favor of the predict(y=...) kwarg, BrainData([BrainData(f) for f in
paths]) workaround replaced (C4 fold-in), hyperparameter passes routed
through explicit sklearn estimators, ridgeCV/lassoCV shortcuts replaced
with RidgeClassifierCV()/LassoCV() directly, and the Cross-Validation
theory section rewritten to land the subject-grouping leak lesson now
that the first predict() call does CV by default.
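The subject-grouping leak lesson boils down to sklearn's `GroupKFold`: with `groups=` set to subject IDs, no subject's trials straddle a train/test boundary. A minimal pure-sklearn sketch:

```python
# GroupKFold keeps each subject entirely in train or entirely in test,
# which is the leak the rewritten Cross-Validation section teaches.
import numpy as np
from sklearn.model_selection import GroupKFold

subjects = np.repeat([0, 1, 2, 3], 6)        # 4 subjects, 6 trials each
X = np.random.default_rng(1).normal(size=(24, 5))
y = np.tile([0, 1], 12)

for train_idx, test_idx in GroupKFold(n_splits=4).split(X, y, groups=subjects):
    # no subject appears on both sides of the split
    assert set(subjects[train_idx]).isdisjoint(subjects[test_idx])
```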

The course-side change lives in the dartbrains nested repo; not committed
here. Pre-commit verification (notebook execution + figure smoke checks
per dartbrains CLAUDE.md) still pending and tracked under that section.
…t rename

Establish a single canonical axis for spatial-scope dispatch across the
BrainData facade, shared by predict() and distance() (and reductions),
following the searchlight → ROI → whole brain framing of Jolly &
Chang, 2021, *SCAN* (https://doi.org/10.1093/scan/nsab010). Until now
predict() overloaded `method=` for both algorithm choice ('svm') and
spatial scope ('whole_brain'), which collided with the project's
canonical `method=` slot for algorithm everywhere else. v0.6.0 is the
moment to fix it.

API conventions (added to CLAUDE.md, migration-guide.md, changelog.md):

- spatial_scale: Literal['whole_brain' | 'roi' | 'searchlight']
  — new canonical kwarg, distinct from method= (algorithm).
- Companion kwargs roi_mask= and radius_mm= already canonical.

BrainData facade:

- predict(method=) -> predict(spatial_scale=). VALID_METHODS renamed
  to VALID_SPATIAL_SCALES; dispatch logic, error message, and all
  existing predict tests + docs/tutorials/api updated. method= no
  longer overloaded — reserved for algorithm.
- distance(spatial_scale=, roi_mask=, radius_mm=). 'whole_brain'
  preserves existing single-Adjacency behavior; 'roi' returns a
  stacked Adjacency (one RDM per parcel) with provenance attached.
  'searchlight' raises NotImplementedError pending a follow-up.
- mean / std / median (spatial_scale=, roi_mask=) — parcellation-
  smoothing semantics: each voxel painted with its parcel's
  reduction per image. Default 'whole_brain' preserves prior
  behavior. Shared via reduce_per_roi() helper in analysis.py.
- align(spatial_scale=) — stub; raises NotImplementedError with a
  pointer to the v0.6.0 follow-up (per-parcel transforms +
  reassembly is its own slice).

Adjacency interop layer:

- New SpatialScale frozen dataclass (atlas, roi_labels, source_mask,
  kind: 'roi' | 'searchlight') in nltools/data/adjacency/spatial.py.
- Adjacency.spatial_scale: SpatialScale | None — optional attribute,
  validated stack-only with len(roi_labels) == len(stack). Survives
  copy() and shape-preserving __getitem__ (subset rows + roi_labels);
  dropped when collapsed to a single matrix (no per-parcel structure
  left to back-project).
- Adjacency.to_brain(values, fill=np.nan) -> BrainData — explicit
  back-projection. Errors if spatial_scale is None. Validates
  len(values) == len(stack). Delegates to a new shared helper.
- Adjacency.similarity(other, project=True) — sugar that runs the
  existing per-stack similarity, extracts each result dict's
  'correlation' key, and pipes through to_brain() for the canonical
  RSA chain in one call.
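A minimal sketch of the `SpatialScale` provenance record described above, as a frozen dataclass. Field types are assumptions (nltools carries nibabel images where this uses arrays):

```python
# Frozen provenance record: construction-time validation, immutable after.
from dataclasses import dataclass, FrozenInstanceError
from typing import Literal
import numpy as np

@dataclass(frozen=True)
class SpatialScale:
    atlas: np.ndarray                 # labeled atlas volume
    roi_labels: tuple                 # one label per stacked matrix
    source_mask: np.ndarray
    kind: Literal["roi", "searchlight"]

    def __post_init__(self):
        if self.kind not in ("roi", "searchlight"):
            raise ValueError(f"bad kind: {self.kind!r}")

scale = SpatialScale(np.zeros((2, 2, 2)), (1, 2, 3), np.ones((2, 2, 2)), "roi")
try:
    scale.kind = "searchlight"        # frozen: mutation raises
except FrozenInstanceError:
    pass
```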

Shared painting helper (avoid duplicating with the legacy
nltools.mask.roi_to_brain, which takes an *expanded* mask and
depends on a buggy expand_mask for non-contiguous labels):

- nltools.mask.roi_to_brain_from_atlas(values, atlas, source_mask,
  roi_labels=None, fill=np.nan) — takes a labeled atlas image
  (the form SpatialScale carries). 1-D values -> single image
  BrainData; 2-D (n_images, n_parcels) -> (n_images, n_voxels)
  BrainData. Single source of truth for "paint per-parcel scalars
  from a labeled atlas onto voxel space," used by Adjacency.to_brain
  and reduce_per_roi. Predict's existing voxel-space assembly is
  unchanged in this pass; can migrate to the helper later.
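The painting rule the helper implements can be sketched in pure numpy, flattened to a 1-D voxel axis (`paint_from_atlas` and its arguments are illustrative, not the real signature):

```python
# Paint per-parcel scalars onto voxel space: 1-D values -> one image row,
# 2-D (n_images, n_parcels) -> (n_images, n_voxels); fill elsewhere.
import numpy as np

def paint_from_atlas(values, label_vec, roi_labels, fill=np.nan):
    values = np.atleast_2d(values)                    # (n_images, n_parcels)
    out = np.full((values.shape[0], label_vec.size), fill)
    for j, label in enumerate(roi_labels):
        out[:, label_vec == label] = values[:, j:j + 1]
    return out

label_vec = np.array([1, 1, 5, 5, 9])                 # non-contiguous labels OK
painted = paint_from_atlas([0.2, 0.7, -0.1], label_vec, [1, 5, 9])
assert painted.shape == (1, 5)
```

Iterating explicit `roi_labels` rather than relabeling to a contiguous range is what sidesteps the non-contiguous-label bug noted for the legacy path.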

Canonical RSA chain that motivated the design:

    rdms = brain.distance(metric='correlation',
                          spatial_scale='roi', roi_mask=atlas)
    brain_map = rdms.similarity(model_rdm, project=True,
                                permutation_method=None)

Tests:

- New nltools/tests/data/adjacency/test_spatial_scale.py: 14 tests
  covering SpatialScale construction/validation/immutability,
  Adjacency.spatial_scale storage + preservation through copy and
  __getitem__, drop-on-collapse semantics, and to_brain() painting
  + fill + length validation.
- New nltools/tests/data/braindata/test_braindata_spatial_scale.py:
  15 tests for distance(spatial_scale=) (whole_brain default,
  ROI shape + per-parcel RDM equality + roi_mask requirement +
  end-to-end to_brain round-trip), reductions(spatial_scale='roi')
  for mean/std/median, align stub, similarity(project=True)
  exact-equivalence to manual to_brain().
- predict tests: bulk-rename method= -> spatial_scale=; one error-
  message regex updated.

Migration:

- predict(method='whole_brain'|'roi'|'searchlight') ->
  predict(spatial_scale='whole_brain'|'roi'|'searchlight').
- predict(method=) is removed (no shim, v0.6.0 breaking sweep).
- BrainCollection.predict scaffold also renamed.

BREAKING: predict() no longer accepts method= for spatial scope; use
spatial_scale=. method= is reserved for algorithm choice.
Closes the two stubs left from the spatial_scale axis introduction.

distance(spatial_scale='searchlight', radius_mm=...):

- Iterates compute_searchlight_neighborhoods(bd.mask, radius_mm) and
  computes per-sphere pairwise distances over bd.data[:, neighbors].
- Returns a stacked Adjacency, one RDM per voxel center, carrying
  SpatialScale(kind='searchlight'). The atlas is a synthetic
  1-voxel-per-label image (each masked voxel labeled with its own ID,
  1..n_voxels), so Adjacency.to_brain(values) paints values[i] onto
  the i-th searchlight center — same RSA chain as the ROI case but
  each searchlight contributes a single voxel.
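The per-center RDM loop can be sketched with scipy; the neighborhood lookup below is a toy stand-in for `compute_searchlight_neighborhoods()`:

```python
# One RDM per voxel center: slice the neighborhood's columns, compute a
# pairwise-distance matrix over images.
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(2)
data = rng.normal(size=(8, 6))                    # (n_images, n_voxels)
neighborhoods = [np.array([i, (i + 1) % 6, (i + 2) % 6])
                 for i in range(6)]               # toy 3-voxel "spheres"

rdms = [squareform(pdist(data[:, nb], metric="correlation"))
        for nb in neighborhoods]                  # one (8, 8) RDM per center
assert len(rdms) == 6
```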

align(spatial_scale='roi', roi_mask=...):

- For each atlas parcel, slices bd + target to that parcel's voxels
  and runs the existing align() (procrustes / SRM). Reassembles:
  - 'transformed': BrainData of original (n_images, n_voxels) shape;
    each parcel's transformed values stitched into its voxels (NaN
    outside the atlas).
  - 'transformation_matrix': dict[atlas_label, ndarray]. Per-parcel
    matrices live on different voxel subsets and can't be painted
    into a single image — kept as a dict.
  - 'common_model': dict[atlas_label, ndarray | BrainData].
  - 'disparity', 'scale': per-parcel arrays.
  - 'roi_labels': ndarray of atlas IDs in stack order.
- Procrustes requires target to share the BrainData voxel axis
  (sliced parcel-wise). SRM common model is voxel-agnostic, so
  target stays a numpy array and is passed through unchanged.
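The slice-align-stitch loop for the procrustes case can be sketched with `scipy.spatial.procrustes` (shapes and the atlas encoding are illustrative):

```python
# Per-parcel procrustes: align each parcel's voxel subset, stitch the
# transformed values back into an (n_images, n_voxels) array, NaN outside.
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(3)
n_images, n_voxels = 10, 12
source = rng.normal(size=(n_images, n_voxels))
target = rng.normal(size=(n_images, n_voxels))
label_vec = np.repeat([1, 2, 0], 4)          # 0 = outside the atlas

transformed = np.full_like(source, np.nan)
disparity = {}
for label in (1, 2):
    cols = label_vec == label
    _, mtx2, d = procrustes(target[:, cols], source[:, cols])
    transformed[:, cols] = mtx2              # this parcel's aligned values
    disparity[label] = d
```

The per-parcel matrices never combine, matching the dict-not-image contract above.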

align(spatial_scale='searchlight'):

- Stays a NotImplementedError but with a specific message: searchlight
  neighborhoods overlap, so a voxel belongs to many spheres and the
  'transformed' field has no canonical value to paint at that voxel.
  Users who want per-sphere transforms can iterate
  compute_searchlight_neighborhoods() and call .align() themselves.

Shared helper:

- _resolve_atlas_label_vec(bd, roi_mask) factored out (atlas coercion
  + resample-if-needed + label_vec extraction) — used by
  align_per_roi and available for future per-parcel ops.

Tests (5 new):

- TestDistanceSearchlight: stack length matches voxel count, per-
  center RDM matches manual cdist on the same neighborhood, end-to-
  end to_brain round-trip via similarity(project=True).
- TestAlignROI: dict shape contract (transformed BrainData, dict
  transforms, per-parcel disparity/scale/roi_labels), and per-parcel
  transformed values match an independent manual align on the same
  parcel slice.
- align(searchlight) error regex updated to match the new "overlap"
  wording.

345 regression tests passing.
New nltools.data.atlases module + BrainData.cluster_report() for
summarizing thresholded stat maps with peak coordinates, sub-peaks,
and atlas labels. 11 atlases (Harvard-Oxford, AAL, Schaefer-200,
Juelich, AICHA, Destrieux, Desikan-Killiany, MarsAtlas,
Neuromorphometrics, Talairach BA/Gyrus) lazy-fetched from the
nltools/niftis HF dataset on first use.

- ClusterReport result object: polars peaks/clusters DataFrames,
  thresholded BrainData, .plot() and .to_csv() helpers
- label_coords() for direct MNI-coord → region lookup
- Dual-mode thresholding: inline (stat_threshold=3.0) or
  pre-thresholded (stat_threshold=None) for FDR/FWER pipelines
- Sign-aware connected components for two_sided clusters
- Mass-weighted top-k regions per cluster
- Labeling logic adapted from atlasreader (BSD-3) with attribution
- Tutorial + API docs wired into myst TOC
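The sign-aware clustering bullet reduces to labeling positive and negative suprathreshold voxels separately, so adjacent blobs of opposite sign stay distinct. A toy 2-D sketch with `scipy.ndimage`:

```python
# Sign-aware connected components: a positive blob touching a negative
# blob stays two clusters; labeling |stat| > thr would merge them.
import numpy as np
from scipy import ndimage

stat = np.zeros((5, 5))
stat[1, 1:3] = 4.0          # positive cluster
stat[1, 3] = -4.0           # adjacent negative voxel
thr = 3.0

pos_labels, n_pos = ndimage.label(stat > thr)
neg_labels, n_neg = ndimage.label(stat < -thr)
merged_labels, n_merged = ndimage.label(np.abs(stat) > thr)
assert n_pos == 1 and n_neg == 1          # kept apart despite adjacency
assert n_merged == 1                      # sign-blind labeling merges them
```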
Switch data loading from non-existent nltools.datasets.fetch_haxby to
nilearn's fetcher (matching 05_decoding), use mean-of-TRs per condition
instead of GLM betas, and update calls to current Adjacency/BrainData
API. Drop execute: skip so it runs in the build.
The old tutorial averaged raw Haxby BOLD intensity across TRs as condition
patterns. Haxby is in subject-native space with a large global/anatomical
baseline, so Pearson r between mean-patterns ran ~0.9999 — the 1 − r RDM
was essentially zero everywhere and the similarity plot showed all 1s,
making the hypothesis tests meaningless.

Switch to the language localizer dataset (same as the GLM tutorial:
preprocessed, MNI 3mm, 10 subjects). With only two conditions, work at the
trial level: LSA single-trial GLM → 24 trial betas → 24×24 RDM. The whole-
brain RDM is noisy (taught explicitly), so the tutorial then walks through
ROI RSA via spatial_scale='roi' + k50 atlas, projects per-ROI Spearman
correlations back to brain space, and inspects the top ROI's RDM directly.
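The per-ROI Spearman step in miniature: correlate the off-diagonal upper triangles of a brain RDM and a model RDM. Toy 4-condition RDMs, not the tutorial's data:

```python
# RSA comparison: rank-correlate RDM upper triangles (diagonal excluded).
import numpy as np
from scipy.stats import spearmanr

model_rdm = np.array([[0, 1, 2, 3],
                      [1, 0, 1, 2],
                      [2, 1, 0, 1],
                      [3, 2, 1, 0]], dtype=float)
brain_rdm = model_rdm + 0.1                      # toy: identical rank order
iu = np.triu_indices(4, k=1)                     # off-diagonal upper triangle
rho, p = spearmanr(model_rdm[iu], brain_rdm[iu])
```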

Both .md (canonical) and .py (marimo notebook) refreshed in lockstep.
The BrainCollection rewrite is happening on the `collection-imp` branch, so
the scaffold-era contracts these tests assert against will change. Disable
test collection here via `collect_ignore_glob` in the directory's
conftest so this branch's suite isn't tied to soon-to-change contracts.
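A conftest sketch of the mechanism (the glob pattern is hypothetical; the real one targets the scaffold-era test files):

```python
# conftest.py — pytest skips collecting any file matching these globs
# in this directory, without deleting or editing the tests themselves.
collect_ignore_glob = ["test_braincollection*.py"]
```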

Also fix the stale `test_default_method_is_whole_brain` (now
`test_default_spatial_scale_is_whole_brain`) — the kwarg was renamed to
`spatial_scale` in 39da8f1 but the test wasn't updated. The fix is kept
even though the directory is currently skipped, so the test passes cleanly
when collection-imp re-enables collection.