Tempo version 3 #353

Open
AndersJensen-NOAA wants to merge 31 commits into ufs-community:ufs/dev from AndersJensen-NOAA:tempo_v3

Conversation

@AndersJensen-NOAA
Collaborator

@AndersJensen-NOAA AndersJensen-NOAA commented Feb 4, 2026

Description of Changes:

Tempo version 3. See the release notes and documentation in the TEMPO repository: https://github.com/NCAR/TEMPO

Tests Conducted:

Tempo Regression tests

Dependencies:

NOAA-EMC/ufsatm#1063
ufs-community/ufs-weather-model#3078

Documentation:

https://ncar.github.io/TEMPO/

Issue (optional):

Contributors (optional):

@AndersJensen-NOAA AndersJensen-NOAA linked an issue Feb 17, 2026 that may be closed by this pull request
@grantfirl
Collaborator

@AndersJensen-NOAA Could you please fill out the pull request template so that this PR can be reviewed?

Comment thread .gitmodules
@AndersJensen-NOAA
Collaborator Author

@grantfirl I made a few changes and updated to TEMPO v3.0.6. On Ursa, when I run my regression test, it sometimes fails with an OOM error, but that seems to depend on the node. If I resubmit the job_card, it passes.

@grantfirl
Collaborator

@grantfirl I made a few changes and updated to tempo v3.0.6. On ursa, when I run my regression test, sometimes it fails with oom, but it seems to depend on the node. If I resubmit the job_card it will pass.

Are you going to merge AndersJensen-NOAA#6?

@grantfirl
Collaborator

@grantfirl I made a few changes and updated to tempo v3.0.6. On ursa, when I run my regression test, sometimes it fails with oom, but it seems to depend on the node. If I resubmit the job_card it will pass.

The changes that you added were already fixed in AndersJensen-NOAA#6. Plus, it updates to the latest ufs/dev, which needs to get done anyway (and there were many manual merge conflicts). It also fixes the physical constant problem identified in this PR review.

@grantfirl
Collaborator

Regarding the OOM failures, EPIC code managers have requested that this get fixed because, even if the failures are intermittent, they still interfere with code management practices. I think you said in some GitHub comment that TEMPOv3 would fix the OOM failures, but I'm seeing more, not fewer, of these with v3 on Ursa.

@AndersJensen-NOAA
Collaborator Author

@grantfirl I merged your changes in. I might have messed up the part where you deleted the old TEMPO scheme, so I will go back and fix that.

@AndersJensen-NOAA
Collaborator Author

also @grantfirl some files were deleted in your PR so do I need a new ccpp config?

@grantfirl
Collaborator

also @grantfirl some files were deleted in your PR so do I need a new ccpp config?

I think that the only thing that was deleted was the original TEMPO submodule, so unless the ccpp_prebuild_config file was referencing something in the old TEMPO submodule instead of TEMPO_v3, I don't think that should be the case. Plus, I didn't run into any issues during the ccpp_prebuild phase of building, so I think that we should be good.

@AndersJensen-NOAA
Collaborator Author

@grantfirl: I have everything updated now from ccpp, ufsatm and UFS weather model. Can you take a look and confirm and see if your test works?

@AndersJensen-NOAA
Collaborator Author

also @grantfirl some files were deleted in your PR so do I need a new ccpp config?

I think that the only thing that was deleted was the original TEMPO submodule, so unless the ccpp_prebuild_config file was referencing something in the old TEMPO submodule instead of TEMPO_v3, I don't think that should be the case. Plus, I didn't run into any issues during the ccpp_prebuild phase of building, so I think that we should be good.

Ignore this. I hadn't updated ufsatm, which contains the updated ccpp prebuild needed with your ccpp updates.

Comment thread .gitmodules Outdated
branch = main
[submodule "physics/MP/TEMPO/tempo_v3"]
path = physics/MP/TEMPO/tempo_v3
url = https://github.com/grantfirl/TEMPO.git
Collaborator

This should be switched to:
url = https://github.com/NCAR/TEMPO
branch = tempo_3.1.0

since the commit hash of TEMPO that you're pointing to is in that branch, right?
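
For readers following this thread, the suggested entry can be checked mechanically. The sketch below is only an illustration, not part of the PR: `parse_gitmodules` is a hypothetical helper, and the `SUGGESTED` text simply combines the submodule name and path shown in the diff with the `url` and `branch` proposed above.

```python
def parse_gitmodules(text):
    """Return {submodule_name: {key: value}} from .gitmodules-style text.

    Hypothetical helper for illustration; it handles only the simple
    `[submodule "name"]` / `key = value` layout seen in this thread.
    """
    modules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[submodule") and line.endswith("]"):
            name = line[len("[submodule"):-1].strip().strip('"')
            current = modules.setdefault(name, {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return modules


# Assumed final form of the entry, built from the values in the review comment.
SUGGESTED = """\
[submodule "physics/MP/TEMPO/tempo_v3"]
    path = physics/MP/TEMPO/tempo_v3
    url = https://github.com/NCAR/TEMPO
    branch = tempo_3.1.0
"""

entry = parse_gitmodules(SUGGESTED)["physics/MP/TEMPO/tempo_v3"]
print(entry["url"], entry["branch"])
```

Running the same function on the actual `.gitmodules` file contents would show whether the entry still points at a fork rather than at NCAR/TEMPO.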

Collaborator Author

Yes, but ultimately 3.1.0 will be merged into main.

Collaborator

OK, but for purposes of this PR chain, will 3.1.0 be merged into main before the chain is merged? Would it make sense to have a PR in TEMPO that merges the tempo_3.1.0 branch into main that is then part of this PR chain?

@grantfirl
Collaborator

grantfirl commented May 14, 2026

@grantfirl: I have everything updated now from ccpp, ufsatm and UFS weather model. Can you take a look and confirm and see if your test works?

I tried to run the existing TEMPO tests and 2 out of 3 ran to completion with the other (control_p8_ugwpv1_tempo_aerosol_intel) experiencing an OOM error. The run_dir of this test is:

/scratch3/BMC/gmtb/Grant.Firl/stmp2/Grant.Firl/FV3_RT/rt_863677

Edit: I checked out the branches manually. Otherwise, there would have been a problem with the ccpp-physics repo pointing to the wrong branch of TEMPO, as discussed upthread.

@grantfirl
Collaborator

So, the OOM errors do seem to still be intermittent, but it would certainly be good to have a test that solidly completes every time that is closer to how TEMPO is intended to be used.

@AndersJensen-NOAA
Collaborator Author

So, the OOM errors do seem to still be intermittent, but it would certainly be good to have a test that solidly completes every time that is closer to how TEMPO is intended to be used.

@grantfirl TEMPO has larger lookup tables than Thompson, so a bit more memory seems to be needed. On Ursa, I'm having good luck with runs by adding this to the job_card: #SBATCH --mem=300G. Each Ursa node has about 384G of memory (though not all of it is usable), but when the job is submitted with the current default regression-test settings asking for 150 tasks, the job only asks for 288G of memory. I'm seeing memory usage vary from 270G to 296G, which explains the intermittent failures. I was able to run 4 times in a row without issue when asking for 300G of memory. I'm exploring ways to read in only parts of the larger lookup tables depending on the options used. For example, TEMPO without the hail-aware option only needs access to part of the largest lookup table, so we could save memory by reading in only the data that is needed. I'll put that on the to-do list.
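
The arithmetic behind that explanation can be laid out in a few lines. This is a rough sketch using only the numbers quoted in the comment above; the function names are hypothetical, and the per-task figure is simply the 288G request divided by 150 tasks, not a documented scheduler setting.

```python
def per_task_gb(request_gb, tasks):
    """Memory requested per task, in GB, if the request is split evenly."""
    return request_gb / tasks


def request_covers(request_gb, observed_peaks_gb):
    """True only if every observed peak fits within the requested memory."""
    return all(peak <= request_gb for peak in observed_peaks_gb)


DEFAULT_REQUEST_GB = 288        # what the default job_card asks for
RAISED_REQUEST_GB = 300         # value used with #SBATCH --mem=300G
TASKS = 150                     # default regression-test task count
OBSERVED_PEAKS_GB = (270, 296)  # low and high ends of the reported range

print(per_task_gb(DEFAULT_REQUEST_GB, TASKS))
print(request_covers(DEFAULT_REQUEST_GB, OBSERVED_PEAKS_GB))
print(request_covers(RAISED_REQUEST_GB, OBSERVED_PEAKS_GB))
```

With these numbers the default request falls short of the high end of the observed range, while the 300G request covers it with only about 4G to spare, which is consistent with failures that depend on the node and the run.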

@AndersJensen-NOAA
Collaborator Author

@grantfirl I just pushed changes to the tempo tests that should fix the OOM issue.

@AndersJensen-NOAA
Collaborator Author

@grantfirl @dustinswales
I ran regression tests here: /scratch4/BMC/wrfruc/jensen/ufs_tempo_dev_test/tests
I think a few may have failed, but I don't really know how to diagnose these tests, so can I get some help? Thanks!

@grantfirl
Collaborator

/scratch4/BMC/wrfruc/jensen/ufs_tempo_dev_test/tests

@AndersJensen-NOAA I can take a look. Are you sure that they completed? I don't see the /scratch4/BMC/wrfruc/jensen/ufs_tempo_dev_test/tests/logs/RegressionTests_ursa.log file, only a backup from yesterday.

@AndersJensen-NOAA
Collaborator Author

/scratch4/BMC/wrfruc/jensen/ufs_tempo_dev_test/tests

@AndersJensen-NOAA I can take a look. Are you sure that they completed? I don't see the /scratch4/BMC/wrfruc/jensen/ufs_tempo_dev_test/tests/logs/RegressionTests_ursa.log file only a backup from yesterday.

@grantfirl If they did not complete, then I don't know why they didn't. How do I debug that?

@grantfirl
Collaborator

/scratch4/BMC/wrfruc/jensen/ufs_tempo_dev_test/tests

@AndersJensen-NOAA I can take a look. Are you sure that they completed? I don't see the /scratch4/BMC/wrfruc/jensen/ufs_tempo_dev_test/tests/logs/RegressionTests_ursa.log file only a backup from yesterday.

@grantfirl If they did not complete, then I don't know why they didn't. How do I debug that?

When you ran rt.sh, did you save a log of the output?

I'm seeing compilation failures in TEMPO. I see a bunch of fail_compile_* listed in your test directory. If you go to the run_dir and find the associated directory for the failing test, e.g. compile_atm_dyn32_phy32_debug_gnu, look at the err and out files to find the compilation failures.
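
The lookup described above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the compile directory name is taken to be the marker name without its `fail_` prefix, and the log files are assumed to be named exactly `err` and `out`.

```python
from pathlib import Path


def failed_compiles(tests_dir):
    """Names of compile jobs that left a fail_compile_* marker in tests_dir."""
    markers = Path(tests_dir).glob("fail_compile_*")
    # Assumption: dropping the "fail_" prefix gives the run_dir subdirectory name.
    return sorted(marker.name[len("fail_"):] for marker in markers)


def logs_for(run_dir, compile_name, log_names=("err", "out")):
    """Existing log files for one compile job (file names are an assumption)."""
    base = Path(run_dir) / compile_name
    return [base / name for name in log_names if (base / name).is_file()]
```

Calling `failed_compiles` on the tests directory and then `logs_for` on the run_dir for each returned name gives the list of files to open when looking for the compiler errors.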

@grantfirl
Collaborator

It looks like the TEMPO tests (control_p8_ugwpv1_tempo_aerosol_intel, control_p8_ugwpv1_tempo_intel, and regional_wofs_tempo_intel) ran to completion. They are reported as failed only because the results changed, which is expected.

control_wam_debug_gnu failed due to a time-out, which happens occasionally and usually isn't our fault.

cpld_debug_sfs_intel, cpld_debug_sfs_intel, cpld_debug_sfs_intelllvm failed due to OOM.

It looks like the compilation failures are related to real type errors: all of them occur in tests that vary the default real types. So, I would just double-check any recent changes in TEMPO related to real types.

@AndersJensen-NOAA
Collaborator Author

It looks like the TEMPO tests (control_p8_ugwpv1_tempo_aerosol_intel and control_p8_ugwpv1_tempo_intel, regional_wofs_tempo_intel) completed successfully. They failed due to the result change, which is expected.

control_wam_debug_gnu failed due to a time-out, which happens occasionally and usually isn't our fault.

cpld_debug_sfs_intel, cpld_debug_sfs_intel, cpld_debug_sfs_intelllvm failed due to OOM.

It looks like the compilation failures are related to real type errors. It looks like all of the compilation failures are with tests that are varying the default real types. So, I would just doublecheck any recent changes in TEMPO related to real types.

@grantfirl Thanks! I'll compile the cpld_debug tests and see if I can find the issue.

@AndersJensen-NOAA
Collaborator Author

@grantfirl @dustinswales

I need help getting these two regression tests to pass:
fail_compile_atm_mpas_dyn32_gnu
fail_compile_atm_mpas_dyn32_debug_gnu

I modified GFS_rrtmpg_pre for the TEMPO hookup, and now the two mpas tests fail. It appears that those tests are radiation-only physics tests specific to MPAS. Since we aren't using MPAS in the UFS yet, and since these aren't full physics tests, I think those tests should be turned off.

If not, can one of you fix the TEMPO hookup on the CCPP side?

Or better yet, address #361

Thanks.

@grantfirl
Collaborator

@grantfirl @dustinswales

I need help getting these two regression tests to pass: fail_compile_atm_mpas_dyn32_gnu fail_compile_atm_mpas_dyn32_debug_gnu

I modified GFS_rrtmpg_pre for the TEMPO hookup, and now the two mpas tests fail. It appears that those tests are radiation only physics tests specific to MPAS. Since we aren't using MPAS yet in the UFS and since these aren't actually full physics tests, I think those tests should actually be turned off.

If not, can one of you fix the TEMPO hookup on the CCPP side?

Or better yet, address #361

Thanks.

We use those tests to keep the MPAS-in-UFS functionality working, so they need to stay.

It looks like your RTs were run with the wrong (old) version of TEMPO checked out. I addressed the comment in #353 (comment). Please try to pull down the latest commit of this PR branch with the .gitmodules fix and run git submodule update --init --recursive from the ccpp/physics directory. Then, make sure that https://github.com/NCAR/TEMPO/tree/b1ee10c4e53f5cb9b68ac1c4e770cbe126d1e24e is actually checked out in the TEMPO submodule.

Once that's done, please run those 2 failing mpas-based tests again. I'm guessing that it'll fix them.
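
A quick way to confirm the checkout is to compare the submodule's HEAD with the hash linked above. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it relies on `git -C <path> rev-parse HEAD` being run against the submodule path shown in `.gitmodules` (physics/MP/TEMPO/tempo_v3).

```python
import subprocess

# Commit hash taken from the NCAR/TEMPO link in the comment above.
EXPECTED_COMMIT = "b1ee10c4e53f5cb9b68ac1c4e770cbe126d1e24e"


def same_commit(head, expected=EXPECTED_COMMIT, min_len=7):
    """Compare two hashes, allowing one of them to be abbreviated."""
    head, expected = head.strip().lower(), expected.strip().lower()
    if len(head) < min_len or len(expected) < min_len:
        return False  # too short to be a meaningful abbreviation
    return head.startswith(expected) or expected.startswith(head)


def submodule_head(submodule_path):
    """Return the commit currently checked out in the given git directory."""
    result = subprocess.run(
        ["git", "-C", str(submodule_path), "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()
```

From the ccpp/physics directory, `same_commit(submodule_head("physics/MP/TEMPO/tempo_v3"))` would be expected to return `True` after a successful `git submodule update --init --recursive`.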

@AndersJensen-NOAA
Collaborator Author

@grantfirl @dustinswales
I need help getting these two regression tests to pass: fail_compile_atm_mpas_dyn32_gnu fail_compile_atm_mpas_dyn32_debug_gnu
I modified GFS_rrtmpg_pre for the TEMPO hookup, and now the two mpas tests fail. It appears that those tests are radiation only physics tests specific to MPAS. Since we aren't using MPAS yet in the UFS and since these aren't actually full physics tests, I think those tests should actually be turned off.
If not, can one of you fix the TEMPO hookup on the CCPP side?
Or better yet, address #361
Thanks.

We use those tests to keep the MPAS-in-UFS functionality working, so they need to stay.

It looks like your RTs were run with the wrong (old) version of TEMPO checked out. I addressed the comment in #353 (comment). Please try to pull down the latest commit of this PR branch with the .gitmodules fix and run git submodule update --init --recursive from the ccpp/physics directory. Then, make sure that https://github.com/NCAR/TEMPO/tree/b1ee10c4e53f5cb9b68ac1c4e770cbe126d1e24e is actually checked out in the TEMPO submodule.

Once that's done, please run those 2 failing mpas-based tests again. I'm guessing that it'll fix them.

Still failed:
/scratch4/BMC/wrfruc/Anders.Jensen/RT_RUNDIRS/Anders.Jensen/FV3_RT/rt_3613567/compile_atm_mpas_dyn32_debug_gnu/

@AndersJensen-NOAA
Collaborator Author

Any other ideas, @grantfirl?

Development

Successfully merging this pull request may close these issues.

Optional arguments

7 participants