Collaborator

@mauzey1 mauzey1 commented Jan 5, 2026

Resolves #880

This changes how CMOR performs chunking of the variable data along time and spatial dimensions.

Key changes

  • If the length of the time axis is known when the axis is created, the chunking dimension of the time axis will be set to that length, and the time bounds will be chunked with that same length plus a dimension of size 2 for the bounds. Otherwise, the time and time bounds chunking dimensions will be set to 512 by default.
  • Chunking dimensions for data variables will be determined by whether chunks can be at least 4 MB in size.
    • If multiple time slices of data can fit within 4 MB, then chunking dimensions will have multiple time slices and full spatial dimensions of the data variable.
    • If a single time slice is greater than 4 MB, then chunking dimensions will have one time slice, with the spatial chunking dimensions determined by the netCDF library.
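
As a rough illustration of the rules above, here is a minimal Python sketch. It is not the code from this PR: the function names are hypothetical, "4 MB" is read as 4 MiB, and the rule for how many time slices go into a chunk (the fewest that reach the target size, capped at the time length when known) is an assumption. `None` stands for a spatial chunk size left to the netCDF library.

```python
from math import prod

# Assumption: "4 MB" is interpreted as 4 MiB.
TARGET_CHUNK_BYTES = 4 * 1024 * 1024
# Default from the description, used when the time length is unknown.
DEFAULT_TIME_CHUNK = 512


def time_axis_chunks(time_length=None):
    """Return chunk shapes for the time axis and its bounds (hypothetical helper)."""
    n = time_length if time_length else DEFAULT_TIME_CHUNK
    return (n,), (n, 2)


def data_variable_chunks(spatial_shape, itemsize, time_length=None):
    """Return a chunk shape with time as the first dimension (hypothetical helper).

    A None entry marks a dimension whose chunk size is left to the
    netCDF library rather than chosen here.
    """
    slice_bytes = prod(spatial_shape) * itemsize
    if slice_bytes > TARGET_CHUNK_BYTES:
        # One time slice per chunk; spatial chunking is left to the library.
        return (1,) + (None,) * len(spatial_shape)
    # Assumed rule: fewest time slices that reach the target size (ceiling division).
    n_slices = -(-TARGET_CHUNK_BYTES // slice_bytes)
    if time_length:
        n_slices = min(n_slices, time_length)
    # Multiple time slices with the full spatial dimensions.
    return (n_slices,) + tuple(spatial_shape)
```

For example, under these assumptions a 180 x 360 grid of 4-byte values is about 253 KiB per time slice, so several slices are grouped into each chunk, while an 1800 x 3600 grid exceeds the target and falls back to one slice per chunk.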

These changes do not enforce the "consolidated metadata" repacking requirement, under which the chunking index B-tree must be stored contiguously at the beginning of the file. However, they should reduce the work cmip7repack needs to do by avoiding the need to rechunk the data variable.

@mauzey1 mauzey1 requested a review from durack1 January 5, 2026 20:24
@mauzey1 mauzey1 mentioned this pull request Jan 5, 2026
@mauzey1 mauzey1 marked this pull request as draft January 7, 2026 02:03
@mauzey1 mauzey1 marked this pull request as ready for review January 8, 2026 19:50

Development

Successfully merging this pull request may close these issues.

Repacking

2 participants