70 changes: 70 additions & 0 deletions REFACTORING_PR_BREAKDOWN.md
@@ -0,0 +1,70 @@
# Refactoring PR Breakdown

This document outlines how to break down the large refactoring into smaller, manageable PRs.

## Suggested PR Breakdown:

### PR 1: Move `restore.jl` to Checkpointer
- Move `restore!` functions to `Checkpointer.jl`
- Update includes in `climaland_bucket.jl` and `climaatmos.jl`
- Update calls to `Checkpointer.restore!`
- **Why first**: Self-contained, minimal dependencies

### PR 2: Create Input module and move `cli_options.jl`
- Create `Input.jl` module
- Move `argparse_settings()` and `parse_commandline()` from `cli_options.jl`
- Update references to use `Input.argparse_settings` and `Input.parse_commandline`
- **Note**: `cli_options.jl` can stay temporarily for backward compatibility, or be deleted in this PR if nothing else includes it

### PR 3: Move `arg_parsing.jl` functions to Input
- Move `get_coupler_config_dict()` and `get_coupler_args()` to `Input.jl`
- Update `setup_run.jl` to use `Input.get_coupler_config_dict` and `Input.get_coupler_args`
- Update other files that use these functions
- Delete `arg_parsing.jl` after migration

### PR 4: Create Postprocessor module structure
- Create `Postprocessor.jl` with basic structure
- Move `postprocess()` and `postprocess_sim()` from `setup_run.jl`
- Move `simulated_years_per_day`, `walltime_per_coupling_step`, `save_sypd_walltime_to_disk`
- Update `SimCoordinator` to call `Postprocessor.simulated_years_per_day`, etc.
- **Why now**: Establishes the module without the large files

### PR 5: Move diagnostics functions to Postprocessor
- Move `coupler_diagnostics.jl` functions to `Postprocessor.jl`
- Update `SimCoordinator` to use `Postprocessor.CD.orchestrate_diagnostics`
- **Why now**: Self-contained, relatively small

### PR 6: Move plotting functions to Postprocessor
- Create `Postprocessor/` directory
- Move `diagnostics_plots.jl` and `debug_plots.jl` to `Postprocessor/`
- Update includes in `Postprocessor.jl`
- Update `postprocess_sim` to call these functions

### PR 7: Move leaderboard functions to Postprocessor
- Move `leaderboard/` directory to `Postprocessor/leaderboard/`
- Update includes in `Postprocessor.jl`
- Update `postprocess_sim` to call leaderboard functions

### PR 8: Move benchmarks to Postprocessor
- Move `benchmarks.jl` to `Postprocessor/`
- Update includes in `Postprocessor.jl`

### PR 9: Move Postprocessor.jl into Postprocessor/ folder
- Move `Postprocessor.jl` to `Postprocessor/Postprocessor.jl`
- Update include path in `ClimaCoupler.jl`
- Cleanup

### PR 10: Cleanup and final touches
- Delete old files (`cli_options.jl`, etc.)
- Update any remaining references
- Move `get_field` methods to appropriate component files
- Final cleanup

## Tips for Managing Interdependencies:

1. **Use feature flags or temporary compatibility shims** if needed
2. **Keep old files temporarily** and add deprecation warnings
3. **Test each PR independently** - each should be functional on its own
4. **Update imports gradually** - you can keep both old and new imports working temporarily

The key is to move in dependency order: start with leaf modules (Checkpointer), then modules that depend on them (Input), then modules that depend on those (Postprocessor).
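
The shim and deprecation-warning tips above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Input` module here is a stand-in with a placeholder function body, and the wrapper at the old call site is a hypothetical way to keep existing callers working while a move like PR 3 lands. `Base.depwarn` is the standard Julia mechanism for deprecation messages.

```julia
# Stand-in for the new module that will own the moved function.
# The body is a placeholder (assumption); the real function parses a config file.
module Input

get_coupler_config_dict(config_file::AbstractString) =
    Dict("config_file" => config_file)

end # module Input

# Temporary compatibility shim kept at the old location (hypothetical).
# It forwards to the new function and records a deprecation warning, which
# Julia only prints when started with `--depwarn=yes`.
function get_coupler_config_dict(config_file::AbstractString)
    Base.depwarn(
        "`get_coupler_config_dict` has moved; call `Input.get_coupler_config_dict` instead.",
        :get_coupler_config_dict,
    )
    return Input.get_coupler_config_dict(config_file)
end

# Old and new call sites return the same result during the transition.
old_result = get_coupler_config_dict("amip_default.yml")
new_result = Input.get_coupler_config_dict("amip_default.yml")
```

Once every caller uses the qualified name, the shim can be deleted in the final cleanup PR without touching the module itself.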
23 changes: 21 additions & 2 deletions experiments/ClimaEarth/components/atmosphere/climaatmos.jl
@@ -20,7 +20,6 @@ if pkgversion(CA) < v"0.28.6"
CC.Adapt.@adapt_structure CA.RRTMGPInterface.RRTMGPModel
end

include("../shared/restore.jl")

###
### Functions required by ClimaCoupler.jl for an AtmosModelSimulation
@@ -140,7 +139,7 @@ end

function Checkpointer.restore_cache!(sim::ClimaAtmosSimulation, new_cache)
comms_ctx = ClimaComms.context(sim.integrator.u.c)
restore!(
Checkpointer.restore!(
Checkpointer.get_model_cache(sim),
new_cache,
comms_ctx;
@@ -735,3 +734,23 @@ function climaatmos_restart_path(output_dir_root, t)
end
error("Restart file for time $t not found")
end

###
### Additional accessor functions for debugging ClimaAtmosSimulation
###

# Helper function for specific humidity
specific_humidity(::CA.DryModel, integrator) = [eltype(integrator.u)(0)]
specific_humidity(::Union{CA.EquilMoistModel, CA.NonEquilMoistModel}, integrator) =
integrator.u.c.ρq_tot

# Additional debug fields for ClimaAtmosSimulation
Interfacer.get_field(sim::ClimaAtmosSimulation, ::Val{:ρq_tot}) =
specific_humidity(sim.integrator.p.atmos.moisture_model, sim.integrator)
Interfacer.get_field(sim::ClimaAtmosSimulation, ::Val{:ρe_tot}) = sim.integrator.u.c.ρe_tot

# Plot field names for ClimaAtmosSimulation
# TODO is this the right name? where is this used?
function plot_field_names(sim::ClimaAtmosSimulation)
return (:w, :ρq_tot, :ρe_tot, :liquid_precipitation, :snow_precipitation)
end
3 changes: 1 addition & 2 deletions experiments/ClimaEarth/components/land/climaland_bucket.jl
@@ -13,7 +13,6 @@ import ClimaCoupler: Checkpointer, FluxCalculator, Interfacer, FieldExchanger
using NCDatasets
include("climaland_helpers.jl")

include("../shared/restore.jl")

###
### Functions required by ClimaCoupler.jl for a SurfaceModelSimulation
@@ -411,7 +410,7 @@ end
function Checkpointer.restore_cache!(sim::BucketSimulation, new_cache)
old_cache = Checkpointer.get_model_cache(sim)
comms_ctx = ClimaComms.context(sim.model)
restore!(
Checkpointer.restore!(
old_cache,
new_cache,
comms_ctx,
Original file line number Diff line number Diff line change
@@ -533,7 +533,7 @@ end
function Checkpointer.restore_cache!(sim::ClimaLandSimulation, new_cache)
old_cache = Checkpointer.get_model_cache(sim)
comms_ctx = ClimaComms.context(sim.model.soil)
restore!(
Checkpointer.restore!(
old_cache,
new_cache,
comms_ctx,
Binary file removed experiments/ClimaEarth/input/bucket_ic_august.nc
Binary file not shown.
133 changes: 4 additions & 129 deletions experiments/ClimaEarth/setup_run.jl
@@ -37,7 +37,9 @@ import ClimaCoupler:
Checkpointer,
FieldExchanger,
FluxCalculator,
Input,
Interfacer,
Postprocessor,
TimeManager,
Utilities
import ClimaCoupler.Interfacer:
@@ -88,10 +90,6 @@ dictionary and the simulation-specific configuration dictionary, which allows th
We can additionally pass the configuration dictionary to the component model initializers, which will then override the default settings of the component models.
=#

include("cli_options.jl")
include("user_io/arg_parsing.jl")
include("user_io/postprocessing.jl")
include("user_io/coupler_diagnostics.jl")

"""
CoupledSimulation(config_file)
@@ -106,7 +104,7 @@ needed to run a coupled simulation.
function CoupledSimulation(
config_file = joinpath(pkgdir(ClimaCoupler), "config/ci_configs/amip_default.yml"),
)
config_dict = get_coupler_config_dict(config_file)
config_dict = Input.get_coupler_config_dict(config_file)
return CoupledSimulation(config_dict)
end

@@ -146,7 +144,7 @@ function CoupledSimulation(config_dict::AbstractDict)
parameter_files,
era5_initial_condition_dir,
ice_model,
) = get_coupler_args(config_dict)
) = Input.get_coupler_args(config_dict)

# Get default shared parameters from ClimaParams.jl, overriding with any provided parameter files
override_file = CP.merge_toml_files(parameter_files; override = true)
@@ -596,89 +594,6 @@ function CoupledSimulation(config_dict::AbstractDict)
return cs
end

"""
run!(cs::CoupledSimulation)

Evolve the given simulation, producing plots and other diagnostic information.

Keyword arguments
==================

`precompile`: If `true`, run the coupled simulations for two steps, so that most functions
are precompiled and subsequent timing will be more accurate.
"""
function run!(
cs::CoupledSimulation;
precompile = (cs.tspan[end] > 2 * cs.Δt_cpl + cs.tspan[begin]),
)

## Precompilation of Coupling Loop
# Here we run the entire coupled simulation for two timesteps to precompile several
# functions for more accurate timing of the overall simulation.
precompile && (step!(cs); step!(cs))

## Run garbage collection before solving for more accurate memory comparison to ClimaAtmos
GC.gc()

#=
## Solving and Timing the Full Simulation

This is where the full coupling loop, `solve_coupler!` is called for the full timespan of the simulation.
We use the `ClimaComms.@elapsed` macro to time the simulation on both CPU and GPU, and use this
value to calculate the simulated years per day (SYPD) of the simulation.
=#
@info "Starting coupling loop"
walltime = ClimaComms.@elapsed ClimaComms.device(cs) begin
s = CA.@timed_str begin
while cs.t[] < cs.tspan[end]
step!(cs)
end
end
end
@info "Simulation took $(walltime) seconds"

sypd = simulated_years_per_day(cs, walltime)
walltime_per_step = walltime_per_coupling_step(cs, walltime)
@info "SYPD: $sypd"
@info "Walltime per coupling step: $(walltime_per_step)"
save_sypd_walltime_to_disk(cs, walltime)

# Close all diagnostics file writers
isnothing(cs.diags_handler) ||
foreach(diag -> close(diag.output_writer), cs.diags_handler.scheduled_diagnostics)
foreach(Interfacer.close_output_writers, cs.model_sims)

return nothing
end

"""
postprocess(cs; conservation_softfail = false, rmse_check = false)

Process the results after a simulation has completed, including generating
plots, checking conservation, and other diagnostics.
All postprocessing is performed using the root process only, if applicable.

When `conservation_softfail` is true, throw an error if conservation is not
respected.

When `rmse_check` is true, compute the RMSE against observations and test
that it is below a certain threshold.

The postprocessing includes:
- Energy and water conservation checks (if running SlabPlanet with checks enabled)
- Animations (if not running in MPI)
- AMIP plots of the final state of the model
- Error against observations
- Optional additional atmosphere diagnostics plots
- Plots of useful coupler and component model fields for debugging
"""
function postprocess(cs; conservation_softfail = false, rmse_check = false)
if ClimaComms.iamroot(ClimaComms.context(cs)) && !isnothing(cs.diags_handler)
postprocessing_vars = (; conservation_softfail, rmse_check)
postprocess_sim(cs, postprocessing_vars)
end
return nothing
end

"""
setup_and_run(config_dict)
@@ -701,43 +616,3 @@ function setup_and_run(config_dict)
run!(cs)
return cs
end

"""
step!(cs::CoupledSimulation)

Take one coupling step forward in time.

This function runs the component models sequentially, and exchanges combined fields and
calculates fluxes using the selected turbulent fluxes option. Note, one coupling step might
require multiple steps in some of the component models.
"""
function step!(cs::CoupledSimulation)
# Update the current time
cs.t[] += cs.Δt_cpl

## compute global energy and water conservation checks
## (only for slabplanet if tracking conservation is enabled)
ConservationChecker.check_conservation!(cs)

## step component model simulations sequentially for one coupling timestep (Δt_cpl)
FieldExchanger.step_model_sims!(cs)

## update the surface fractions for surface models
FieldExchanger.update_surface_fractions!(cs)

## exchange all non-turbulent flux fields between models, including radiative and precipitation fluxes
FieldExchanger.exchange!(cs)

## calculate turbulent fluxes in the coupler and update the model simulations with them
FluxCalculator.turbulent_fluxes!(cs)

## compute any ocean-sea ice fluxes
FluxCalculator.ocean_seaice_fluxes!(cs)

## Maybe call the callbacks
TimeManager.callbacks!(cs)

# Compute coupler diagnostics
CD.orchestrate_diagnostics(cs)
return nothing
end