Skip to content

Direct netCDF I/O for soca state/geometry (soca_io_mod)#1242

Merged
shlyaeva merged 14 commits into
developfrom
feature/soca-direct-io
May 20, 2026
Merged

Direct netCDF I/O for soca state/geometry (soca_io_mod)#1242
shlyaeva merged 14 commits into
developfrom
feature/soca-direct-io

Conversation

@travissluka
Copy link
Copy Markdown
Contributor

@travissluka travissluka commented May 13, 2026

Description

Resolves #1225

Rips out FMS I/O! Replaces SOCA's register_restart_field / save_restart / restore_state paths with direct nf90_* I/O in a new module src/soca/IO/soca_io_mod.F90. Same on-disk file format. The register-then-commit API mirrors FMS semantics so call sites in soca_geom_mod, soca_fields_mod, and soca_balance_mod are minimally adapted.

Why:

  • FMS I/O v1 is being retired and we have to move off it.
  • FMS I/O v2 was problematic when we tried it — not a drop-in.
  • Independent of that pressure, we wanted out anyway: a follow-up PR builds on this to greatly accelerate ensemble writes (one PE per member writing its own file in parallel, instead of the FMS-mandated collective on the compute communicator).

Issue(s) addressed

n/a

Dependencies

n/a

Impact

soca-only. Same files on disk; same public Fortran interface in soca_geom, soca_fields, soca_balance. FMS dependency for register_restart_field / save_restart / restore_state is removed; mpp_gather / mpp_scatter / mpp_broadcast are still used.

Manual Testing Instructions (optional)

n/a

Checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have run the unit tests before creating the PR

Introduces src/soca/IO/soca_io_mod.F90 with soca_io_file_reader /
soca_io_file_writer types (init/enqueue/commit/close), and routes all
state, geometry, and balance-coefficient I/O through it.

PE 0 calls nf90_get_var / nf90_put_var; mpp_broadcast and mpp_gather
distribute / collect data on the geometry's pelist. Removes all
fms_io_mod usages (register_restart_field, restore_state, save_restart,
read_data, file_exist, field_exist, fms_io_init, fms_io_exit) from
soca, including the global_soca_geom_counter shim. fms_init / fms_end
are retained (subsystem init only, no I/O).

Resolves the GDAS Tsnz_h shape mismatch by building nf90_get_var count
from the file's actual dim sizes, not the caller's buffer rank.

Per-domain state writes (ocn/sfc/ice/wav/bio) and reads now use the
new module; the FMS code paths are removed in the same change rather
than gated.

Refs #1125. Obsoletes PR #1241.
Under FMS, the writer would implicitly add the .nc suffix, so test
YAMLs and downstream readers reference 'ocn.<exp>.<typ>.<date>.nc'.
The direct-netCDF writer in soca_io_mod is faithful to whatever
string it's given, so files were landing without the extension and
no consumer could find them.
Six soca_add_test() calls listed test_socs_parameters_diffusion as a
dependency (should be test_soca_). Because the named test does not
exist, ctest treated the dependency as void and ran the variational
tests concurrently with parameters_diffusion under -j>1, causing
races on the diffusion calibration outputs.
Drops the SOCA_IO_SERIAL / SOCA_IO_PARALLEL mode bifurcation (and the
method= arg + io.method config knob) and rewrites the reader to do
per-PE strided nf90_get_var, mirroring FMS mpp_io's domain-decomposed
read pattern. Each PE opens the file once via NF90_NOWRITE and reads
its own compute-domain tile via nf90_get_var(start, count) -- the same
shape as fms_io's READ_RECORD_ for fileset.NE.MPP_SINGLE.

Adds a module-level read cache mirroring fms_io's get_file_unit +
files_read(i)%var(j) tables:
 - one nf90_open per (PE, filename) for the whole run
 - cached (varid, file_ndims, middle_dim sizes) per (file, var)
 - soca_io_close_all() flushes the cache; wired into soca_geom_end

Caller updates: drop method= from reader/writer init() and the
soca_io_method_from_config helper from soca_fields_mod / soca_geom_mod
/ soca_balance_mod.

1-deg 20-mem LETKF read phase (LocalEnsembleDA before solver ctor):
  baseline (FMS):                  3.91 s
  before this commit (no cache):  13.92 s
  with this commit:                3.97 s
Outputs are bit-identical.
…cache)

Adds a second reader_commit implementation -- PE 0 nf90_get_var the
global field + mpp_broadcast -- alongside the existing per-PE strided
path. Both paths share the file-handle + var-metadata cache. Select via
SOCA_IO_READ_MODE=broadcast|strided (default broadcast); the env var is
latched on first reader_commit.

Why broadcast as default: in DA cycling the page cache is always cold,
since each cycle reads files that the previous cycle / model run just
wrote. On cold cache the broadcast path (1 sequential read on PE 0,
kernel readahead happy) beats strided (8 concurrent strided readers
thrash the prefetcher and pay 8x the open-syscall cost).

1-deg 20-mem LETKF, rancor, taskset -c 0-7, cold cache (sample 1):
  broadcast: 13.5 s read
  strided:   30.5 s read
Warm cache reverses the result (strided 3.9 s vs broadcast 13.3 s) but
cycling never sees warm cache. On a parallel filesystem (Lustre/GPFS)
or multi-node setup the strided path may win again -- toggle the env
var to test.
Drops SOCA_IO_READ_MODE in favor of a yaml block on the geometry
config:
    geometry:
      io:
        read mode: broadcast | strided
If absent the module-level default (broadcast) is kept.

soca_geom_init calls soca_io_read_mode_from_config(f_conf) once at
startup; the choice latches for every subsequent reader_commit.
Unrecognized values abort so typos surface immediately.
Annotates testinput/letkf.yml's geometry block with the two read-mode
options so a contributor reading the canonical letkf example sees both
choices without having to hunt for the soca_io_mod docstring.
The write path emits FMS-style auto-numbered dim names (xaxis_N /
yaxis_N / zaxis_N / Time); the file-header docstring incorrectly
claimed xh / yh / Time. Correct the comment -- no code change.
Writer:
- enqueue holds pointers to caller buffers instead of allocating and
  copying (mirrors FMS register_restart_field contract). Compute-slice
  extraction moves from enqueue to commit, so peak per-writer memory
  drops from sum(var_bytes) to one (nx_c x ny_c) tile.
- commit uses the 3D mpp_gather overload for 3D vars: one collective
  per var instead of nlevels, and no per-level gbuf3d(:,:,k) = gbuf2d
  memcpy on root.
- Reuse gbuf3d / tile3 across 3D vars when nlevels matches.

Reader:
- commit_reader_strided hoists tile2 out of the per-var loop and
  reuses tile3 / tile4 when trailing dims match.
- Dead tile3/tile4 zero-inits removed (read_var_strided fully fills
  the buffer).

read_var_strided: fixed-size stack arrays for st / ct instead of
per-call heap allocation.

put_axis_coord_data: reuse idxbuf across axes.

Cleanup:
- Remove dead 'use mpi' and 'use fckit_configuration_module'.
- Remove dead cartesian_axis parameter on writer_enqueue_1d (it was
  stored but never written to the netcdf output).
- Drop unused nprocs out-param of mpi_pelist.
- soca_io_close_all: use ncc on nf90_close instead of silently
  discarding the status.
- commit_reader_scatter: matching dead-init cleanup; comment updated
  to flag it as pending parallel-ensemble I/O exercise.
- Trim verbose / redundant comments.
Copilot AI review requested due to automatic review settings May 13, 2026 02:51
@travissluka travissluka self-assigned this May 13, 2026
@travissluka
Copy link
Copy Markdown
Contributor Author

Note, @shlyaeva, this uses the strided method that you previously tested, and that is the default, so this PR should now be a drop-in replacement with no changes needed to yamls. The other reader method is disabled (but the code is still there so that it can be used with the ensemble I/O follow-up)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces SOCA’s previous FMS restart I/O pathway (register_restart_field / save_restart / restore_state) with direct NetCDF Fortran (nf90_*) I/O implemented in a new soca_io_mod module, and updates geometry/fields/balance call sites to use a register-then-commit reader/writer API.

Changes:

  • Added src/soca/IO/soca_io_mod.F90 providing soca_io_reader/soca_io_writer plus basic file/variable existence helpers and a cached read-handle table.
  • Updated soca_geom_mod, soca_fields_mod, and soca_balance_mod to use the new reader/writer API in place of FMS I/O.
  • Extended LETKF YAML test input with an io.read mode knob and adjusted test dependencies naming in test/CMakeLists.txt.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
test/testinput/letkf.yml Adds YAML configuration for geometry I/O read strategy.
test/CMakeLists.txt Fixes/aligns diffusion parameter test dependency name.
src/soca/LinearVariableChange/Balance/soca_balance_mod.F90 Switches sea-ice Jacobian read from read_data to soca_io_reader.
src/soca/IO/soca_io_mod.F90 Introduces direct NetCDF I/O module with writer/reader and cached read handles.
src/soca/IO/CMakeLists.txt Adds the new IO module to SOCA build sources.
src/soca/Geometry/soca_geom_mod.F90 Replaces gridspec read/write via FMS restart API with soca_io_reader/writer; closes cached handles on geom end.
src/soca/Fields/soca_fields_mod.F90 Replaces restart I/O and existence checks with soca_io_* equivalents; updates filename generation to include .nc.
src/soca/CMakeLists.txt Adds IO subdirectory to the build.
Comments suppressed due to low confidence (3)

src/soca/IO/soca_io_mod.F90:196

  • The writer stores pointers to src dummy arguments (self%vars(... )%data* => src) and relies on them remaining valid until commit(). Per the Fortran standard this is only safe if the actual arguments have the TARGET (or POINTER) attribute; otherwise the stored pointers can become undefined after enqueue_* returns. Either (a) change the API to copy data into internal allocatables, or (b) require/enforce TARGET on all actual arrays passed to enqueue (including allocatable components like geom%lon, varwrapper%data, etc.), and update call sites accordingly.
subroutine writer_enqueue_1d(self, name, src, long_name, units)
  class(soca_io_writer),         intent(inout) :: self
  character(len=*),              intent(in)    :: name
  real(kind=kind_real), target,  intent(in)    :: src(:)
  character(len=*), optional,    intent(in)    :: long_name, units

  call grow_if_needed(self)
  self%nvars = self%nvars + 1
  self%vars(self%nvars)%name    = name
  self%vars(self%nvars)%ndims   = 1
  self%vars(self%nvars)%nlevels = 1
  self%vars(self%nvars)%data1d => src
  self%vars(self%nvars)%long_name = name
  self%vars(self%nvars)%units     = 'none'
  if (present(long_name)) self%vars(self%nvars)%long_name = long_name
  if (present(units))     self%vars(self%nvars)%units     = units
end subroutine writer_enqueue_1d

src/soca/IO/soca_io_mod.F90:401

  • Same issue for 3D gathers: gbuf3d is only allocated on root, but it is passed to mpp_gather on all PEs. Passing an unallocated allocatable as an actual argument is not safe even if non-root never reads it. Allocate a dummy gbuf3d(1,1,1) (or otherwise ensure it is allocated) on non-root PEs before calling mpp_gather.
      nlev = self%vars(v)%nlevels
      if (is_root) then
        if (allocated(gbuf3d)) then
          if (size(gbuf3d, 3) /= nlev) deallocate(gbuf3d)
        end if
        if (.not. allocated(gbuf3d)) allocate(gbuf3d(self%nx_g, self%ny_g, nlev))
      end if
      if (allocated(tile3)) then
        if (size(tile3, 3) /= nlev) deallocate(tile3)
      end if
      if (.not. allocated(tile3)) allocate(tile3(nx_c, ny_c, nlev))
      tile3 = self%vars(v)%data3d(i_off : i_off + nx_c - 1, &
                                  j_off : j_off + ny_c - 1, :)
      call mpp_gather(self%isc, self%iec, self%jsc, self%jec, nlev, pelist, &
                      tile3, gbuf3d, is_root)
      if (is_root) then

src/soca/Geometry/soca_geom_mod.F90:257

  • The new greader%enqueue(...) calls pass allocatable components like self%lon, self%mask2d, etc. that are not declared with TARGET. Since soca_io_reader stores pointers to the enqueued buffers until commit(), these actual arguments should be TARGET (or the reader should be changed to copy into internal storage) to avoid non-standard/undefined pointer lifetimes.
    call greader%init(self%Domain%mpp_domain, str)
    call greader%enqueue("lonh",    self%lonh)
    call greader%enqueue("lath",    self%lath)
    call greader%enqueue("lonq",    self%lonq)
    call greader%enqueue("latq",    self%latq)
    call greader%enqueue("lon",     self%lon)
    call greader%enqueue("lat",     self%lat)
    call greader%enqueue("lonu",    self%lonu)
    call greader%enqueue("latu",    self%latu)
    call greader%enqueue("lonv",    self%lonv)
    call greader%enqueue("latv",    self%latv)
    call greader%enqueue("sin_rot", self%sin_rot)
    call greader%enqueue("cos_rot", self%cos_rot)
    call greader%enqueue("dx",      self%dx)
    call greader%enqueue("dy",      self%dy)
    call greader%enqueue("area",    self%cell_area)
    call greader%enqueue("mask2d",  self%mask2d)
    call greader%enqueue("mask2du", self%mask2du)
    call greader%enqueue("mask2dv", self%mask2dv)
    call greader%commit()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/soca/Geometry/soca_geom_mod.F90 Outdated
Comment thread src/soca/IO/soca_io_mod.F90 Outdated
Comment thread src/soca/IO/soca_io_mod.F90 Outdated
Comment thread src/soca/IO/soca_io_mod.F90
Comment thread src/soca/LinearVariableChange/Balance/soca_balance_mod.F90 Outdated
Comment thread src/soca/Fields/soca_fields_mod.F90
Drop dangling references to the removed soca_io_read_mode_from_config /
'io.read mode' YAML key in soca_geom_mod and the letkf test input. The
selector was already gone from soca_io_mod; without these the PR fails
to compile.
- soca_io_mod: allocate gbuf2d / gbuf3d as 1x1 dummies on non-root so the
  actuals passed to mpp_gather are always allocated (assumed-shape dummy
  requires it). Pattern matches commit_reader_scatter.

- soca_io_mod: refresh stale 'Data is copied in' comment to reflect the
  pointer-based enqueue semantics (caller buffer kept alive/unmutated
  through commit; actual must satisfy TARGET association rules).

- soca_geom_mod: TARGET on the 'self' dummy in soca_geom_init,
  soca_geom_init_fieldset, and soca_geom_write so the allocatable
  components (self%lonh, self%lat, ..., self%mask2d*) are valid pointer
  targets for the soca_io reader/writer enqueue. TARGET on the local
  fieldData / fieldDataVars too.

- soca_fields_mod: TARGET on h_common and on the local vars(:) wrapper
  array so vars(n)%data sections used in enqueue are valid pointer
  targets.

- soca_balance_mod: TARGET on local kct for the same reason.
@travissluka
Copy link
Copy Markdown
Contributor Author

Thanks for the review. All six Copilot findings addressed:

  • soca_io_read_mode_from_config undefined — fixed in 4f33324 (dropped the matching import + call + YAML key that referenced the removed routine).
  • gbuf2d/gbuf3d only allocated on root — fixed in 71f14e2 (allocate 1×1 dummy on non-root so the actuals passed to mpp_gather are always allocated; pattern matches commit_reader_scatter).
  • Stale "Data is copied in" comment — refreshed in 71f14e2 to describe the pointer-based semantics and the TARGET-association contract.
  • Reader stores pointers to dst (F03 conformance) — addressed by adding TARGET to callers' actual arguments (see below), keeping the FMS-mirroring pointer design. Same for the writer.
  • kct missing TARGET — added in 71f14e2 on the local declaration.
  • h_common missing TARGET — added in 71f14e2, plus TARGET on the local vars wrapper array (so vars(n)%data sections are valid pointer targets), and TARGET on the self dummy of soca_geom_init / soca_geom_init_fieldset / soca_geom_write (so self%lonh and friends are valid targets).

All 7 writer/reader-heavy soca ctests pass locally; cluster-faithful timing unchanged from the prior commit on this branch.

@shlyaeva
Copy link
Copy Markdown
Collaborator

@travissluka thank you! I'll test this today in the workflow and report back.

Copy link
Copy Markdown
Collaborator

@shlyaeva shlyaeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @travissluka! This approval is based solely on testing with 0.25 degree soca in GFSv17 "hybrid" configuration (the one that also runs LETKF) for 1-2 cycles. The timings of I/O are slightly better than on develop 🎉 . I have not reviewed any of this code, and don't plan to review it.

(side note unrelated to this PR: the overall timing of 3DVar is significantly slower than with the code from a few weeks ago, I suspect something may have changed in how we calibrate vertical diffusion. we'll investigate that separately).

@travissluka
Copy link
Copy Markdown
Contributor Author

Thank you @travissluka! This approval is based solely on testing with 0.25 degree soca in GFSv17 "hybrid" configuration (the one that also runs LETKF) for 1-2 cycles. The timings of I/O are slightly better than on develop 🎉 . I have not reviewed any of this code, and don't plan to review it.

I think the slight improvement is because FMS was copying fields before writing them, whereas now we just pass pointers around. Memory usage should also be slightly less now.

Copy link
Copy Markdown
Collaborator

@shlyaeva shlyaeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing me to looking at the memory, I just rebuilt with soca develop with using the same oops/etc branches for apples to apples comparison (the previous one again wasn't!), and actually there are two issues (one crucial).

  1. [major, needs fixing before the merge]. The issue with vertical diffusion that I mentioned above is due to this branch.
    On develop, consistent with other builds:
  0: group 1 of 1
  0: Generating vertical scales
  0:   Reading from file
  0:   mask: NONE
  0:   scale type: Gaussian
  0:   vtScales:  min: 5  mean: 7.17912  max: 15
  0: Diffusion: vertical iterations: 450
  0: Calculating vertical normalization...
  0: Info     : Writing file: ./staticb/vt_ocean.nc

on this branch:

  0: ==================================================================================
  0:  saber::Diffusion calibration
  0: ----------------------------------------------------------------------------------
  0:
  0: group 1 of 1
  0: Generating vertical scales
  0:   Reading from file
  0:   mask: NONE
  0:   scale type: Gaussian
  0:   vtScales:  min: -2.30585  mean: 7.85725  max: 32.6274
  0: Diffusion: vertical iterations: 2128
  0: Calculating vertical normalization...
  0: Info     : Writing file: ./staticb/vt_ocean.nc
  0: ==================================================================================

I don't know yet why this happens, I'll dig into what exactly is happening in our workflow here, compare some files and analyze.

  1. Memory is actually worse here than on develop, especially for LETKF.
    Var:
    develop:
  0: OOPS_STATS IncrementalAssimilation start            - Runtime:     4.81 sec,  Local Memory:   470.56 MB
  0: OOPS_STATS Variational end                          - Runtime:   115.71 sec,  Local Memory:     4.19 GB
  0: OOPS_STATS Run end                                  - Runtime:    117.60 sec,  Memory: total:   418.78 GB, per task: min =   825.51 MB, max =  4189.71 MB

this branch:

  0: OOPS_STATS IncrementalAssimilation start            - Runtime:     5.85 sec,  Local Memory:   517.79 MB
  0: OOPS_STATS Variational end                          - Runtime:   156.98 sec,  Local Memory:     3.81 GB
  0: OOPS_STATS Run end                                  - Runtime:    159.86 sec,  Memory: total:   526.89 GB, per task: min =     1.05 GB, max =     3.81 GB

You'll notice the difference in runtime -- this is due to vertical diffusion behaving differently.

LETKF:
develop:

  0: OOPS_STATS LocalEnsembleDA before solver ctor       - Runtime:    30.10 sec,  Local Memory:     1.78 GB
  0: OOPS_STATS LocalEnsembleDA before solver            - Runtime:     97.67 sec,  Memory: total:   742.32 GB, per task: min =     2.83 GB, max =     5.08 GB
  0: OOPS_STATS Run end                                  - Runtime:    631.32 sec,  Memory: total:   898.97 GB, per task: min =     2.98 GB, max =     8.45 GB

this branch:

  0: OOPS_STATS LocalEnsembleDA before solver ctor       - Runtime:    11.51 sec,  Local Memory:     4.31 GB
  0: OOPS_STATS LocalEnsembleDA before solver            - Runtime:     77.66 sec,  Memory: total:  1420.82 GB, per task: min =     5.66 GB, max =     7.71 GB
  0: OOPS_STATS Run end                                  - Runtime:    546.03 sec,  Memory: total:  1597.63 GB, per task: min =     6.00 GB, max =     9.41 GB

@travissluka can you look into whether we can bring the memory down? I'll try to figure out why the vertical diffusion estimation is misbehaving.

@travissluka
Copy link
Copy Markdown
Contributor Author

travissluka commented May 14, 2026

@shlyaeva 1) I'm able to reproduce the vertical correlation bug locally, so i can look into it on my side (it might have something to do with the levels in the yaml not matching up with the number of levels in the file, and the old code just silently ignored that mismatch... or not.. i dunno 2) I found the memory leak, the I/O code was forgetting to close file handles until the application exit.

Once I get number 1 fixed i'll push changes for a retest

Edit: I'm not actually able to reproduce the vertical correlation bug. Given the value range, that does oddly seem like it's reading in temperature values, not correlation values 🤔

…e checks

Three related fixes in the direct-netCDF reader path:

1. read_var_strided previously hardcoded "last Fortran dim is trailing
   time" and set ct(file_ndims)=1 unconditionally. For a file with
   spatial-only layout (Temp(z,y,x) with no leading Time), this read
   only level 1 of z into a multi-level destination, leaving the rest
   uninitialized -- showing up as garbage scale values in the vertical
   diffusion calibration. Now discriminate file_ndims == dst_rank (no
   time, fill every dim from the file) vs file_ndims > dst_rank
   (trailing time + middle squeeze, e.g. CICE Tsnz_h's nksnow=1). A
   total-element-count check catches silent partial-fills and dim-size
   mismatches (e.g. file z=75 vs destination z=25).

2. Drop the read_cache / cached_open / soca_io_close_all machinery.
   Each reader_commit nf90_opens, reads all enqueued vars (with inline
   inq_varid + inquire_variable + inquire_dimension; microseconds), and
   nf90_closes. Holding NetCDF4 handles open across commits was bloating
   LETKF per-task memory by ~MB-to-GB scale (HDF5 metadata + chunk
   caches per open file). Restores per-task max to match develop.

3. Add check_buf_1d / check_buf_2d helpers called from every
   reader/writer enqueue. Catches unallocated and wrong-sized caller
   buffers at the enqueue site instead of silently storing the pointer
   for a later get_var/put_var to mishandle.
@travissluka
Copy link
Copy Markdown
Contributor Author

travissluka commented May 15, 2026

Both items addressed in 2d7ab3c.

Vertical diffusion: your vt_scales file has Temp(z, y, x) with no leading Time dim. The reader hardcoded "last Fortran dim is trailing time" and sliced it to 1, so only z-level 1 read correctly; levels 2-N stayed uninitialized → your [-2.31 .. 32.6] garbage. Now discriminates file_ndims == dst_rank (no time) vs file_ndims > dst_rank (trailing time + CICE-style middle squeeze), with a total-element-count check to catch level-size mismatches too.

Memory: dropped the read-side file/var cache that was keeping NetCDF4 handles (and their HDF5 chunk/metadata caches) open across commits. Per-task max now matches develop on the local emulator.

72/72 ctests pass.

@travissluka travissluka requested a review from shlyaeva May 15, 2026 18:25
Copy link
Copy Markdown
Collaborator

@shlyaeva shlyaeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've retested cleanly (two separate experiments, the only difference is soca branch), and the latest fix fixed both the issue with reading vertical diffusion parameters, and with memory. Thank you @travissluka!

Var:
this branch:

Runtime:    132.91 sec,  Memory: total:   405.00 GB, per task: min =   800.67 MB, max =  3408.76 MB

develop (ran after this branch):

Runtime:    111.65 sec,  Memory: total:   418.95 GB, per task: min =   824.59 MB, max =  4297.21 MB

LETKF:
this branch:

Runtime:    541.38 sec,  Memory: total:   880.34 GB, per task: min =     2.91 GB, max =     6.54 GB

develop:

Runtime:    642.86 sec,  Memory: total:   899.06 GB, per task: min =     2.97 GB, max =     8.61 GB

Runtimes for the other tasks using soca I/O (bmat tasks, parallel recentering) have reasonably similar runtime and memory use.

Copy link
Copy Markdown
Contributor

@DavidNew-NOAA DavidNew-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objections on my end

@shlyaeva shlyaeva requested a review from Dooruk May 19, 2026 20:06
@eap
Copy link
Copy Markdown
Contributor

eap commented May 19, 2026

@travissluka
I tried building this with my spack-stack 2.1 environment and got the following build error. This could still be a problem specific to how I configured the build but I'd appreciate if you could check it out. Feel free to use the jedi-tools EC2 scripts to launch a spack-stack 2.1 VM.

[ 95%] Building Fortran object soca/src/soca/CMakeFiles/soca.dir/IO/soca_io_mod.F90.o
/home/ubuntu/gcc_jedi_bundle_test/jedi-bundle/soca/src/soca/IO/soca_io_mod.F90:671:52:

  671 |                        ishift=ishift, jshift=jshift)
      |                                                    1
Error: There is no specific subroutine for the generic ‘mpp_scatter’ at (1)
/home/ubuntu/gcc_jedi_bundle_test/jedi-bundle/soca/src/soca/IO/soca_io_mod.F90:689:52:

  689 |                        ishift=ishift, jshift=jshift)
      |                                                    1
Error: There is no specific subroutine for the generic ‘mpp_scatter’ at (1)
/home/ubuntu/gcc_jedi_bundle_test/jedi-bundle/soca/src/soca/IO/soca_io_mod.F90:711:56:

  711 |                            ishift=ishift, jshift=jshift)
      |                                                        1
Error: There is no specific subroutine for the generic ‘mpp_scatter’ at (1)
/home/ubuntu/gcc_jedi_bundle_test/jedi-bundle/soca/src/soca/IO/soca_io_mod.F90:715:56:

  715 |                            ishift=ishift, jshift=jshift)
      |                                                        1
Error: There is no specific subroutine for the generic ‘mpp_scatter’ at (1

@travissluka
Copy link
Copy Markdown
Contributor Author

@travissluka I tried building this with my spack-stack 2.1 environment and got the following build error. This could still be a problem specific to how I configured the build but I'd appreciate if you could check it out. Feel free to use the jedi-tools EC2 scripts to launch a spack-stack 2.1 VM.

It turns out those ishift/jshift keywords were removed in the newer version of FMS that's in spack stack 2.1. I'll refactor to remove that dependency.

FMS 2025.02 removed the ishift/jshift optional args from mpp_scatter,
breaking compilation on spack-stack 2.1. Pre-apply the shift in the
indices we pass; behavior unchanged on FMS 2024.02 (spack-stack 1.9.2).
@travissluka
Copy link
Copy Markdown
Contributor Author

Pushed in a547504. No answer changes expected. Note: this code path is dormant on this PR — wired up in the follow-on parallel-ensemble I/O PR — so it unblocks spack-stack 2.1 compile but isn't exercised functionally yet.

Copy link
Copy Markdown
Collaborator

@Dooruk Dooruk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran two identical 0.25-deg multi-node 3DFGAT tests with a January 15th SOCA version and this branch. Residual norms are identical. I do see some improvement which is great, especially with the runtime in the new version:

These are with Spack-stack 1.9

This branch:

OOPS_STATS Run end      - Runtime:    255.88 sec,  Memory: total:  1140.07 GB, per task: min =     1.51 GB, max =     4.14 GB
Run: Finishing oops::Variational<SOCA, UFO and IODA observations> with status = 0
OOPS Ending   2026-05-20 14:41:34 (UTC+0000)

Jan15th:

OOPS_STATS Run end      - Runtime:    358.17 sec,  Memory: total:  1292.89 Gb, per task: min =     1.54 Gb, max =     4.51 Gb
Run: Finishing oops::Variational<SOCA, UFO and IODA observations> with status = 0
OOPS Ending   2026-05-20 14:39:44 (UTC+0000)

@DavidNew-NOAA
Copy link
Copy Markdown
Contributor

@travissluka I was getting the following error compiling your branch with spack stack 2.1

/home/ubuntu/jedi/jedi-bundle/soca/src/soca/Geometry/soca_geom_mod.F90:34:5:

   34 | use time_interp_external_mod, only : time_interp_external_init
      |     1
Fatal Error: Cannot open module file ‘time_interp_external_mod.mod’ for reading at (1): No such file or directory
compilation terminated.

time_interp_external_mod was removed I believe in FMS/2025.1. Changing time_interp_external_mod to the new time_interp_external2_mod fixes the compilation error.

@travissluka
Copy link
Copy Markdown
Contributor Author

@travissluka I was getting the following error compiling your branch with spack stack 2.1

/home/ubuntu/jedi/jedi-bundle/soca/src/soca/Geometry/soca_geom_mod.F90:34:5:

   34 | use time_interp_external_mod, only : time_interp_external_init
      |     1
Fatal Error: Cannot open module file ‘time_interp_external_mod.mod’ for reading at (1): No such file or directory
compilation terminated.

time_interp_external_mod was removed I believe in FMS/2025.1. Changing time_interp_external_mod to the new time_interp_external2_mod fixes the compilation error.

@DavidNew-NOAA Well, that's obnoxious. Does time_interp_external2_mod exist in the earlier FMS? (e.g. if we switch that now, does it break backward compatibility with spack-stack < 2.1 ?) Since NASA has now approved this PR, I think we (@shlyaeva , @guillaumevernieres ) should go ahead and merge this, and the final spack stack 2.1 fixes can be applied afterward

@travissluka travissluka added the ready for merge PR is reviewed and is ready for merge label May 20, 2026
@DavidNew-NOAA
Copy link
Copy Markdown
Contributor

DavidNew-NOAA commented May 20, 2026

@travissluka I concur.

@shlyaeva
Copy link
Copy Markdown
Collaborator

Sounds good to me, I'm merging since both I and Doruk approved. If there are remaining issues, let's do follow-up PRs!

@shlyaeva shlyaeva merged commit 4b0c6a8 into develop May 20, 2026
3 checks passed
@shlyaeva shlyaeva deleted the feature/soca-direct-io branch May 20, 2026 21:07
@Dooruk
Copy link
Copy Markdown
Collaborator

Dooruk commented May 20, 2026

Have other centers switched to spack 2 or 2.1?

FWIW, I naively googled is "time_interp_external2_mod" in fms/2024.02 and found the following:

https://egitlab.gfdl.noaa.gov/NOAA-GFDL/FMS/-/blob/2024.02/time_interp/time_interp_external2.F90?ref_type=tags

Welp, nevermind 😄

travissluka added a commit that referenced this pull request May 23, 2026
Rebased onto develop after the direct-netCDF soca_io_mod PR (#1242) landed.
End-to-end ensemble-parallel I/O on top of the stateless direct-netCDF reader.
Opt-in via the new oops StateSet/IncrementSet bulk-readEnsemble/writeEnsemble
dispatch path; soca provides the model-level implementations.

State/Increment:
  - State::{readEnsemble,writeEnsemble} and Increment::{readEnsemble,
    writeEnsemble} (Fields + interface Fortran bindings to match)
  - writeEnsemble: gather each member to a strided writer PE
    (soca_io_ensemble_root_pe assignment), then phase-2 per-member netCDF
    writes happen concurrently across writer PEs
  - readEnsemble: per-member loop today, honoring single-state read mode
    (strided vs scatter); parallel-across-members reads via
    soca_io_readers_commit_ensemble are wired for a follow-up

soca_io_mod:
  - writer staging split into define/gather/write/close phases plus
    soca_io_writers_commit_ensemble; reader staging mirrored with
    read/distribute/close and soca_io_readers_commit_ensemble. Reader
    opens+closes ncid inside stage_read (no global file-handle cache,
    matching develop's stateless model post-2d7ab3cf)
  - new public knobs: soca_io_ensemble_write_parallel,
    soca_io_ensemble_read_scatter, soca_io_single_state_read_scatter,
    soca_io_async_mpi, soca_io_ensemble_root_pe
  - mpi_pelist_and_comm helper for the writer/reader PE subsets

Geometry:
  - soca_io_config_from_yaml called from soca_geom_init resolves the
    parallel/sequential, scatter/strided, async-mpi knobs from geometry.io
    once; values persist module-level for the run

Companion to JCSDA-internal/oops setID fix in StateSet::buildFromConfigs
(needed so PseudoModel and HTLM dispatch see correct per-member IDs when
StateSet takes the HasReadEnsemble path).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready for merge PR is reviewed and is ready for merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update soca to fms_io v2

6 participants