Refactored soca -> cice#1246
Conversation
Introduces src/soca/IO/soca_io_mod.F90 with soca_io_file_reader / soca_io_file_writer types (init/enqueue/commit/close), and routes all state, geometry, and balance-coefficient I/O through it. PE 0 calls nf90_get_var / nf90_put_var; mpp_broadcast and mpp_gather distribute / collect data on the geometry's pelist. Removes all fms_io_mod usages (register_restart_field, restore_state, save_restart, read_data, file_exist, field_exist, fms_io_init, fms_io_exit) from soca, including the global_soca_geom_counter shim. fms_init / fms_end are retained (subsystem init only, no I/O). Resolves the GDAS Tsnz_h shape mismatch by building nf90_get_var count from the file's actual dim sizes, not the caller's buffer rank. Per-domain state writes (ocn/sfc/ice/wav/bio) and reads now use the new module; the FMS code paths are removed in the same change rather than gated. Refs #1125. Obsoletes PR #1241.
Under FMS, the writer would implicitly add the .nc suffix, so test YAMLs and downstream readers reference 'ocn.<exp>.<typ>.<date>.nc'. The direct-netCDF writer in soca_io_mod is faithful to whatever string it's given, so files were landing without the extension and no consumer could find them.
Six soca_add_test() calls listed test_socs_parameters_diffusion as a dependency (should be test_soca_). Because the named test does not exist, ctest treated the dependency as void and ran the variational tests concurrently with parameters_diffusion under -j>1, causing races on the diffusion calibration outputs.
Drops the SOCA_IO_SERIAL / SOCA_IO_PARALLEL mode bifurcation (and the method= arg + io.method config knob) and rewrites the reader to do per-PE strided nf90_get_var, mirroring FMS mpp_io's domain-decomposed read pattern. Each PE opens the file once via NF90_NOWRITE and reads its own compute-domain tile via nf90_get_var(start, count) -- the same shape as fms_io's READ_RECORD_ for fileset.NE.MPP_SINGLE. Adds a module-level read cache mirroring fms_io's get_file_unit + files_read(i)%var(j) tables: - one nf90_open per (PE, filename) for the whole run - cached (varid, file_ndims, middle_dim sizes) per (file, var) - soca_io_close_all() flushes the cache; wired into soca_geom_end Caller updates: drop method= from reader/writer init() and the soca_io_method_from_config helper from soca_fields_mod / soca_geom_mod / soca_balance_mod. 1-deg 20-mem LETKF read phase (LocalEnsembleDA before solver ctor): baseline (FMS): 3.91 s before this commit (no cache): 13.92 s with this commit: 3.97 s Outputs are bit-identical.
…cache) Adds a second reader_commit implementation -- PE 0 nf90_get_var the global field + mpp_broadcast -- alongside the existing per-PE strided path. Both paths share the file-handle + var-metadata cache. Select via SOCA_IO_READ_MODE=broadcast|strided (default broadcast); the env var is latched on first reader_commit. Why broadcast as default: in DA cycling the page cache is always cold, since each cycle reads files that the previous cycle / model run just wrote. On cold cache the broadcast path (1 sequential read on PE 0, kernel readahead happy) beats strided (8 concurrent strided readers thrash the prefetcher and pay 8x the open-syscall cost). 1-deg 20-mem LETKF, rancor, taskset -c 0-7, cold cache (sample 1): broadcast: 13.5 s read strided: 30.5 s read Warm cache reverses the result (strided 3.9 s vs broadcast 13.3 s) but cycling never sees warm cache. On a parallel filesystem (Lustre/GPFS) or multi-node setup the strided path may win again -- toggle the env var to test.
Drops SOCA_IO_READ_MODE in favor of a yaml block on the geometry
config:
geometry:
io:
read mode: broadcast | strided
If absent the module-level default (broadcast) is kept.
soca_geom_init calls soca_io_read_mode_from_config(f_conf) once at
startup; the choice latches for every subsequent reader_commit.
Unrecognized values abort so typos surface immediately.
Annotates testinput/letkf.yml's geometry block with the two read-mode options so a contributor reading the canonical letkf example sees both choices without having to hunt for the soca_io_mod docstring.
The write path emits FMS-style auto-numbered dim names (xaxis_N / yaxis_N / zaxis_N / Time); the file-header docstring incorrectly claimed xh / yh / Time. Correct the comment -- no code change.
Writer: - enqueue holds pointers to caller buffers instead of allocating and copying (mirrors FMS register_restart_field contract). Compute-slice extraction moves from enqueue to commit, so peak per-writer memory drops from sum(var_bytes) to one (nx_c x ny_c) tile. - commit uses the 3D mpp_gather overload for 3D vars: one collective per var instead of nlevels, and no per-level gbuf3d(:,:,k) = gbuf2d memcpy on root. - Reuse gbuf3d / tile3 across 3D vars when nlevels matches. Reader: - commit_reader_strided hoists tile2 out of the per-var loop and reuses tile3 / tile4 when trailing dims match. - Dead tile3/tile4 zero-inits removed (read_var_strided fully fills the buffer). read_var_strided: fixed-size stack arrays for st / ct instead of per-call heap allocation. put_axis_coord_data: reuse idxbuf across axes. Cleanup: - Remove dead 'use mpi' and 'use fckit_configuration_module'. - Remove dead cartesian_axis parameter on writer_enqueue_1d (it was stored but never written to the netcdf output). - Drop unused nprocs out-param of mpi_pelist. - soca_io_close_all: use ncc on nf90_close instead of silently discarding the status. - commit_reader_scatter: matching dead-init cleanup; comment updated to flag it as pending parallel-ensemble I/O exercise. - Trim verbose / redundant comments.
Drop dangling references to the removed soca_io_read_mode_from_config / 'io.read mode' YAML key in soca_geom_mod and the letkf test input. The selector was already gone from soca_io_mod; without these the PR fails to compile.
- soca_io_mod: allocate gbuf2d / gbuf3d as 1x1 dummies on non-root so the actuals passed to mpp_gather are always allocated (assumed-shape dummy requires it). Pattern matches commit_reader_scatter. - soca_io_mod: refresh stale 'Data is copied in' comment to reflect the pointer-based enqueue semantics (caller buffer kept alive/unmutated through commit; actual must satisfy TARGET association rules). - soca_geom_mod: TARGET on the 'self' dummy in soca_geom_init, soca_geom_init_fieldset, and soca_geom_write so the allocatable components (self%lonh, self%lat, ..., self%mask2d*) are valid pointer targets for the soca_io reader/writer enqueue. TARGET on the local fieldData / fieldDataVars too. - soca_fields_mod: TARGET on h_common and on the local vars(:) wrapper array so vars(n)%data sections used in enqueue are valid pointer targets. - soca_balance_mod: TARGET on local kct for the same reason.
…e checks Three related fixes in the direct-netCDF reader path: 1. read_var_strided previously hardcoded "last Fortran dim is trailing time" and set ct(file_ndims)=1 unconditionally. For a file with spatial-only layout (Temp(z,y,x) with no leading Time), this read only level 1 of z into a multi-level destination, leaving the rest uninitialized -- showing up as garbage scale values in the vertical diffusion calibration. Now discriminate file_ndims == dst_rank (no time, fill every dim from the file) vs file_ndims > dst_rank (trailing time + middle squeeze, e.g. CICE Tsnz_h's nksnow=1). A total-element-count check catches silent partial-fills and dim-size mismatches (e.g. file z=75 vs destination z=25). 2. Drop the read_cache / cached_open / soca_io_close_all machinery. Each reader_commit nf90_opens, reads all enqueued vars (with inline inq_varid + inquire_variable + inquire_dimension; microseconds), and nf90_closes. Holding NetCDF4 handles open across commits was bloating LETKF per-task memory by ~MB-to-GB scale (HDF5 metadata + chunk caches per open file). Restores per-task max to match develop. 3. Add check_buf_1d / check_buf_2d helpers called from every reader/writer enqueue. Catches unallocated and wrong-sized caller buffers at the enqueue site instead of silently storing the pointer for a later get_var/put_var to mishandle.
# Conflicts: # src/soca/Fields/soca_fields_mod.F90
Small, no-behavior-change cleanups before the bigger refactors: - Q5: delete unused 5-arg adjustThicknessCategories back-compat overload. Update the four TestIcePhysics calls to the 9-arg form; rewrite the _InfeasibleTarget test since legacy semantics no longer apply. - Q9: remove rho ice/snow/ocean from FreeboardParameters. They were always CICE's defaults; making them configurable would let users write a restart inconsistent with CICE's physics. Use icephysics::Constants directly. Strip the now-unused rho lines from 3 test yamls. - Q10: rewrite seed new ice docstring to match the 2026-05-20 change (qice from Tfrz - Tice_seed_offset, only Tsfcn from donor). - Q12: move applyThermoStage and CatRecord to the private section. - Q13: replace the 12-line per-cat view loop with 6 ppiCatViews calls. - Q16: delete snowEnthalpyToTsfc (only used by the removed Tsfcn-from-qsno path), the unused 'sum' helper in IcePhysics.cc, and the (void)sum warning suppression. Replace round-trip unit test with a known-value test. - Q7/Q8: docstring fixes for 'min thickness gap' (numerical robustness knob) and 'max snow thickness' (explicitly per-ice-area, m). All 10 postprocice/soca2cice/icephysics/ensanpproc tests still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The rebin pass is what makes per-cat thicknesses sit inside their bins; running without it leaves the restart in a state CICE will quickly rebin on its own, and is only useful as a legacy diagnostic. Production users get correct behavior automatically; the legacy no-rebin path stays available via 'itd.rebin: false' in yaml. soca2cice_new.yml and soca2cice_new_bgfallback.yml now run with rebin on (no yaml change needed; they don't set itd: explicitly). Testrefs are empty so no refresh needed. All 10 postprocice tests pass, including the Fortran-vs-C++ regression with its 2-cell tolerance. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Six soca2cice_new* yamls collapsed to four:
- soca2cice_new (old non-rebin baseline) deleted; redundant now that
rebin is the default. soca2cice_new_rebin renamed to take its slot
with the itd: block dropped (all defaults match).
- soca2cice_new_thermo deleted; its scope (thermo: {update snow thermo,
reset ponds}) is a strict subset of soca2cice_new_seed.
Kept: soca2cice_new (new production baseline, rebin on), _freeboard
(only freeboard coverage), _seed (full Stage C donor-halo seed),
_bgfallback (analysis variables resolver).
Also added soca2cice_new_freeboard.yml/_seed.yml + their .test files
to the explicit soca_test_input/soca_test_ref lists so they survive a
cmake reconfigure (previously these were only registered by
soca_add_test, which doesn't add them to the configure-time symlink
sweep).
All 8 remaining postprocice tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- 'min aice' was an input clamp on analysis aice, defaulted to 0.0 (a no-op), and never set by any caller. Its role is fully covered by 'min aice output' (which clamps the *output* and is set to 1e-4 by default to drop near-zero analysis aice cells). The input clamp now just enforces nonneg; remove the parameter. - 'min vice' renamed to 'min cat ice volume' to disambiguate from 'min new ice thickness' (per-cell new-ice fallback). Docstring rewritten to explain it's the per-cat volume floor that drives the mass-conserving sweep after Stage A/B. Default unchanged at 1e-5. The legacy Fortran Soca2Cice path keeps its own 'min aice' / 'min vice' keys; those are separate parameters. All 8 postprocice tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The only consumer of ocean T/S inside postProcess was the SST-on-ice2noice adjustment: when the analysis removes ice from a previously-iced cell, the surface ocean was optionally warmed toward Tfrz so the post-DA SST was consistent with no ice. The salinity field was copied in but never read. Both are removed: - SstUpdateParameters block and its 'sst update:' yaml key. - copyOceanFields helper (static); the ocean T/S copy at the top of postProcess; the new_tocn writable view. - The Case-1 ICE2NOICE branch's SST warming code. Rationale: surface-ocean adjustment on ice removal is a DA-increment concern, not a CICE-restart concern. Folding it back into the MOM6 IAU increment (gdasapp/utils/soca/gdas_incr_handler.h, removed in a parallel commit) was working around the wrong layer doing the work. Future revisions, if needed, should adjust SST upstream where the ocean increment is generated. readRestart's State now carries only sea-ice fields; PostProcessIce neither reads nor writes ocean variables (ocean T/S on the analysis is unused). All 8 postprocice tests pass, including the Fortran regression test with its 1e-10 tolerance / 2-cell budget. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The generic soca_fields_write_rst no longer scans the config for a 'cice template' key. Instead: - new Fortran subroutine soca_fields_write_cice that handles the config -> soca_fields_write_cice_rst dispatch; - new C entry soca_state_write_cice_f90 / soca::State::writeCice; - PostProcessIce::writeRestart now calls pproc.writeCice(cfg) instead of pproc.write(cfg) with the magic 'cice template' key. write_rst goes back to being a pure domain-by-domain restart writer that knows nothing about CICE update mode. Callers that want CICE update mode go through writeCice explicitly. All 8 postprocice tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The three call sites (soca's Postproc.h, AnalysisPostproc.h, and
gdasapp's gdas_incr_handler.h) all did the same triple:
State restart = ppIce.readRestart(geom, analysis.validTime());
State pproc(geom, restart);
ppIce.postProcess(pproc, restart, analysis);
ppIce.writeRestart(pproc, analysis.validTime());
Replaced with a single public method:
State PostProcessIce::postprocess(const State & analysis) const;
which opens the CICE restart, applies Stage A/B/C in update mode, writes
the postprocessed restart, and returns an aggregate-ice State on
analysis's geometry carrying {sea_ice_area_fraction, sea_ice_thickness,
sea_ice_snow_thickness} matching what was written.
readRestart, writeRestart, ciceRestartVariables, and the case-dispatch
loop (now runPostprocess) are private. The aggregate-ice return value
is what the gdasapp pipeline / scripted callers can consume without
having to know the per-cat CICE schema.
Postproc.h and AnalysisPostproc.h migrated. The gdasapp side migrates in
a parallel commit. All 8 postprocice tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
soca_coding_norms (include-what-you-use linter) flagged this on line 57. The header has been using std::make_unique since the file was introduced but pulled <memory> transitively; the linter wants it explicit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Self-review pass on the PR-as-a-single-change:
- Remove unused includes from PostProcessIce.cc (<limits>, <numeric>,
atlas/mesh/Mesh.h, atlas/mesh/Connectivity.h,
atlas/mesh/actions/BuildCellCentres.h).
- Drop project-internal labels from comments and docstrings:
- 'Stage A / Stage B / Stage C+' inside the per-cell loop → use what
the code actually does ('Per-cell pass', 'ITD rebin', 'Snow
distribution', 'Freeboard enforcement').
- 'Phases A+B+C' in the donor-halo gather header → 'Sparse halo
exchange'.
- 'user decision 2026-05-11', '+2.6 K warm bias on the production
grid', 'With the shuffle path removed', 'rebin shuffle', 'dd'
shorthand, external Python script filenames ('mc6util.adjust_*',
'insert_iconc_ithkn_cice6_restart.py').
- Stage references in PostProcessIce.h ('Parameters for the Stage C
thermo / pond pass', 'after Stages A/B', etc).
- Fix the log message at the end of runPostprocess: was printing
(the background, unchanged) instead of (the
postprocessed state). Renamed all 3 info-log lines to use the
consistent 'PostProcessIce:' prefix.
- Group all parameter pickups at one place rather than 'additional
knobs picked up later in the function'; move seedK with the rest.
- Rewrite stale parameter docstrings ('Parameters for adding soca
increment to CICE restart files' → 'Top-level parameters'; 'Stage C
thermo' → 'Per-layer thermo / pond reset', etc.) and drop the
'matches the Python reference scripts' / 'matches dd' qualifiers
in favor of the actual physical rationale.
- Update Tice_seed_offset / Tsno_clip_max comments in IcePhysics.h to
state what they are, not what they 'match' in an external script.
- Add src/soca/PostProcess/README.md documenting the workflow, per-cell
pass, thermo pass, new-ice seeding, full yaml configuration schema
with defaults, and the implementation notes (per-cat field naming,
KDTree halo, IcePhysics helpers). Style matches the existing
Soca2Cice README.
All 8 postprocice tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
B-list (questions on the comprehensive PR diff): - B4: validate `itd.category bounds` length (ncat+1) at PostProcessIce construction, not only when rebin runs. Default is the CICE5 ncat=5 layout; ncat != 5 without bounds override now fails fast regardless of rebin/freeboard. - B5: drop the redundant `itd:` block from ensanpproc.yml. With rebin defaulted on and the only knobs in that block being the CICE5 defaults, the block was pure noise. - B6: rewrite the AnalysisPostproc.h comment around seekAndReplace to state where %mem% actually fires (cice restart: input/output paths). - B1: move tools/postproc_compare/ out of the soca submodule. Per the README it's a "study tool, not a CI test", and once Soca2Cice is deprecated this becomes a one-shot deprecation artifact. The directory lives at jedi-bundle/tools/ now (untracked, alongside the user's other local helper scripts). D-list (cleanup found in the self-review): - D1: drop the double `pproc = restart;` in runPostprocess. `postprocess` already did `State pproc(geom_, restart);` (the copy-constructor). - D2: stop adding aggregate ice diagnostic atlas Fields onto pproc.fieldSet(). The aggregates were quiet violations of the State invariant (extras bypassing soca::Fields's tracked-fields list) and only existed to be read back by the postprocess() wrapper. Instead, runPostprocess now constructs the aggregate-ice result State on the analysis geometry and writes the aggregates directly into its fieldset; it returns the State. pproc keeps the per-cat schema the CICE-restart writer expects. - D3: drop the empty `print(std::ostream &)` override and the util::Printable inheritance. PostProcessIce is never streamed anywhere; the inheritance was dead weight. - D4: bg_aice atlas Field replaced with std::vector<double> for consistency with a_hice_vec / a_hsno_vec. Same memory cost, no pseudo-field on the working FieldSet. All 8 postprocice tests + coding norms pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The aggregate diagnostics are no longer added onto pproc and read back; runPostprocess builds the result State directly. Reword the workflow description accordingly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PostProcessIce uses the CICE restart input both as the update-mode template
and (configurably) as the output path. When users configure
'cice restart: { input: foo.nc, output: foo.nc }' for in-place update,
the byte-copy is wasted work: the file already contains its own template.
writer_commit_update now checks for equal template/filename and skips the
copy when they match, opening the existing file directly NF90_WRITE.
Non-equal paths still go through the byte-copy as before, preserving the
safe "keep an unmodified backup" semantics for users who want them.
Path comparison is a literal string equality; cosmetic differences like
'./foo.nc' vs 'foo.nc' would still trigger the (harmless) self-copy. Good
enough for the common gdasapp case where the two yaml fields hold the
exact same string.
Verified manually: running soca_postproc.x with equal input/output paths
mutates the bg restart in place (different md5 after the run); all 9
existing postprocice tests still pass (they configure distinct paths).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PostProcessIce: production-grid validation across 7 datesSide-by-side comparison of the new C++/atlas Dates (one per season + multi-year coverage): For each cycle we hold the same DA analysis (from GFSv17 with artificially increased ice thickness and snow depth to test insertion of those variables) and the same CICE background fixed, and compare the resulting CICE restart written by each of the three pipelines. Aggregate cell-level RMSE vs analysisMask: cells where
* Per-category ICE2ICE perturbation vs backgroundMask: per-cat slots where both
PlotsThree production-grid comparisons for
Clearest contrast:
Same pattern in the SH: All 7 dates exhibit the same qualitative behaviour - the new path is consistently close to the analysis on ice thickness and snow (significantly better than legacy). Bottom lineNo degradation compared to Dmitry's scripts, significant improvement for ice thickness and snow depth compared to the legacy soca2cice. |
|
I'm going to run a few cycles with the ensembles using this postprocessing and report back here. In the meantime, this is mostly ready for review, I don't anticipate major changes. If it looks painful to review, just know that I've been working on this on and off for half a year, and I'm with you 😆 . |
CICE6's linear_itd aborts when per-cat thickness `hi = vicen/aicen` falls
outside `[hicat[k], hicat[k+1]]`. Diagnostic scan of the existing
production-grid restarts (datatoanalyze/, 7 dates) found ~5% of ice cells
in every restart had at least one populated cat with out-of-bin hi --
~5,000-16,000 cells per date. CICE6 was crashing on these.
Two bugs at play:
1. **Default `category bounds` was CICE5 layout** with bounded top
(`{0, 0.6445, 1.391, 2.470, 4.567, 9.334}`) whereas CICE6 GFSv17
uses `{0, 0.64, 1.39, 2.47, 4.57, 1000}` (open top). Silent
mismatch produced restarts CICE refused to read. Make
`category bounds` a RequiredParameter -- users must spell out the
layout matching their CICE version.
2. **The standalone min-vice cleanup block** in PostProcessIce.cc
(now removed) redistributed mass from dropped cats to surviving
cats proportional to surviving aicen, without respect to bins.
This shifted survivors' per-cat hi outside their bins, producing
the CICE6 abort cases. The dd Python reference (mod_cice6_utils.py
adjust_thkncats_aice) doesn't have a separate min-vice cleanup --
its scipy.optimize.minimize call uses the bin edges as variable
bounds, so survivors are guaranteed in-bin. We rely on the rebin's
existing `ainMin = 1e-8` clip + Step 7 renormalisation (which
preserves per-cat hi), and add a final per-cat bin clip as
defence in depth:
const double hClipped = std::min(std::max(h, hLo), hHi);
if (hClipped != h) vicen[k] = aicen[k] * hClipped;
Aicen is left alone (Σaicen preserved); a small amount of mass is
lost to the clip, much smaller than the test failures the bug
produced.
Drops the now-unused `min cat ice volume` yaml parameter.
Test fallout: the 4 soca2cice_new* testrefs shift slightly (min-vice
cleanup is gone). The freeboard test shifts hardest (hice 0.4153 ->
0.4044, because the cleanup was undoing some of freeboard's
ice-volume growth). Test yamls updated to set `category bounds`
explicitly. 9/9 postprocice tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Log::info() writes only from task 0 by oops convention, so all the
counter / max-magnitude diagnostics in runPostprocess were silently
reporting only rank 0's piece of the grid -- misleading on multi-task
runs.
allReduceInPlace each counter (sum) and each magnitude (max) on
geom_.getComm() before the log lines, so what gets printed is the
global total / global max. Covers:
- rebin_visited, rebin_aicen_mutations (sum)
- rebin_max_delta_aicen_all (max)
- rebin_failures, freeboard_failures (sum)
- bin_clip_slots (sum)
- bin_clip_max_abs_dh (max)
- seedNewIce fallbacks (sum, separately because
seedNewIce runs after the main loop)
Also adds the bin-clip diagnostic itself (was uncounted): per-cat slots
clipped + max |dh| in metres. Only printed when at least one slot was
clipped.
9/9 postprocice tests still pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Log::warning fans out to every MPI task, so the per-rank warnings were printing N identical copies of the global-summed counts. Switch the ITD-rebin-failure, freeboard-failure and noice->ice Tfrz-fallback messages to Log::info (rank-0-only). These are diagnostic counters about non-fatal events the code already handles, so info is the right level. Mention the bin clip in the rebin-failure message so the log explains how out-of-bin cells get repaired. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Replace the 'min-vice cleanup' step in the per-cell pass with the new 'per-cat bin clip' step (the cleanup is gone; the clip is the guarantee that the restart is CICE-readable). - Drop the 'min cat ice volume' configuration entry (parameter no longer exists). - Mark 'itd: category bounds' as required in both the schema block's inline comment and the closing 'required fields' bullet. Switch the example value to the CICE6 GFSv17 layout ([0, 0.64, 1.39, 2.47, 4.57, 1000]) and document the CICE5 alternative; explain that the previous silent default was a footgun. - Tidy two surrounding references to 'min-vice cleanup' that pointed at the dropped step. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>



Description
Sea-ice analysis postprocessing in C++/atlas (
PostProcessIce): replaces the FortranSoca2Cicevariable change with a C++/atlas postprocessor that projects an aggregate SOCA ice analysis onto a per-category CICE restart. The new algorithm is largely based on @DmitryDukhovskoy's Python scripts to insert ice concentration, thickness and snow depth (https://github.com/DmitryDukhovskoy/RTOFS_utilities/blob/gaea_pub/prepare_cice6/insert_iconc_ithkn_cice6_restart.py and https://github.com/DmitryDukhovskoy/RTOFS_utilities/blob/gaea_pub/prepare_cice6/insert_hsnow_cice6_restart.py). Validated on 7 production GDAS cycles spanning different seasons.What's new
soca::PostProcessIcein src/soca/PostProcess/,with a single public entry point:
Reads the CICE background restart, applies the per-cell pass, writes the postprocessed restart in update mode, returns an aggregate-ice State
{aice, hice, hsno}matching what was written. Used by the standalonesoca_postproc.xapp, byAnalysisPostproc.h's ensemble loop, and by gdasapp's increment handler.Per-cell pass (see README for details): case dispatch (LAND / ICE2NOICE / NOICE2ICE / ICE2ICE), ITD rebin (on by default), aicen-weighted snow distribution, optional freeboard enforcement, mass-conserving min-vice cleanup. Pure column-physics helpers (BL99 enthalpy, BZ99 salinity profile, thickness-category solver, freeboard) live in
IcePhysics.h/.ccwith their own unit tests inTestIcePhysics.cc.New-ice thermo seeding: cells that transitioned from
bg_aicen=0tonew_aicen>0get Tsfcn from the area-weighted mean of a KDTree donor with any ice (global lat/lon tree, donor data gathered once via a sparse halo exchange); sub-surface qice/sice synthesized from CICE physics at the ocean freezing point.soca_postproc.xstandalone application(src/mains/Postproc.h): takes
background+increment, formsanalysis = bg + incr, runsPostProcessIce. Also accepts an explicitanalysisblock.Dedicated CICE restart writer on the soca::State side:
soca::State::writeCice(cfg)calls a new Fortran entrysoca_fields::write_cicethat wraps the update-mode CICE writer (byte-copy input restart → output, overwrite only modelled variables, ~40 unmodelled CICE vars pass through).fields_metadata.yml: new<LEVEL>placeholder for templated per-layer entries (CICE per-layer restart varsqice00N,sice00N,qsno00N); combines with<CATEGORY>to expandncat × ice_leventries automatically. Per-cat iceio nameupdated to restart naming (aice<N>,vice<N>,vsno<N>- dropping the_hsuffix which is the CICE history convention; restarts don't carry it).AnalysisPostproc.h: ensemble loop migrated to callPostProcessIce::postprocessper member.Soca2Cicevariable change no longer used here.Tests
Four end-to-end tests on the 72×35 grid (one per major code path) plusa pure unit test on the column physics:
test_soca_soca2cice_new- baseline (rebin on)test_soca_soca2cice_new_freeboard- freeboard enforcementtest_soca_soca2cice_new_seed- new-ice seeding + thermotest_soca_soca2cice_new_bgfallback- analysis variables resolver(only aice analysed, hice/hsno fall back to background)
test_soca_icephysics- pure C++ unit test of the column-physics helperstest_soca_postprocice_vs_soca2cice- Fortran-vs-C++ regression(
Soca2Ciceoutput compared bit-wise toPostProcessIceoutput;budget = 2 differing cells at 1e-10 tolerance)
test_soca_ensanpproc- ensemble postproc end-to-endAll pass. The legacy Fortran
Soca2Cicepath is still available throughsoca_convertstate.xfor the regression test; it can be deprecated in a follow-up PR.Configuration
A minimal yaml stanza:
Full schema with defaults in src/soca/PostProcess/README.md.
Checklist