Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sixth reconciliation PR from production/RRFS.v1 #365

Open
wants to merge 3 commits into
base: dev/emc
Choose a base branch
from

Conversation

grantfirl
Copy link

Description

This PR is identical to NOAA-EMC#89 from @MatthewPyle-NOAA

Provides the atmos_cubed_sphere changes needed to take advantage of FMS parallel IO changes. Changes courtesy of Dan Kokron of GDIT.

Fixes # (issue)

How Has This Been Tested?

UFS WM regression tests on Hera using full rt.conf file.

Checklist:

Please check all whether they apply or not

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

Copy link
Contributor

@bensonr bensonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@grantfirl
Copy link
Author

@bensonr @XiaqiongZhou-NOAA You may want to re-review. I made one small change to turn ENABLE_PARALLELRESTART to OFF by default in CMakeLists.txt since turning it on caused UFS WM RT failures on some platforms. Turning this feature off by default was deemed a good temporary solution to get the functionality merged into develop for platforms that need it and use it, although they'll need to make sure to turn it on in their workflows somehow and not rely on the default behavior to do it for them. See NOAA-EMC/fv3atm#896 for discussion.

@laurenchilutti
Copy link
Contributor

@XiaqiongZhou-NOAA @bensonr could you please give this one more review.
Thanks!

@bensonr
Copy link
Contributor

bensonr commented Jan 24, 2025

@grantfirl - as I look at this, I can't help but think this is over-engineered and most likely mostly unnecessary. So I think a call to discuss is needed.

@grantfirl
Copy link
Author

@bensonr Sounds good. I know very little about the functionality being added here, but I believe that it was implemented in context of the RRFS application, so I'm guessing that the meeting should contain at least @MatthewPyle-NOAA and @dkokron. My only involvement up to this point was to bring all development from the production/RRFS.v1 branch into develop, so I'm not sure that I need to be involved other than to know what to do code management-wise (i.e. whether to continue to include this in the development trying to get into the develop branch of fv3atm or not).

@bensonr
Copy link
Contributor

bensonr commented Jan 24, 2025

@grantfirl - that all sounds good to me and have them put something on my calendar (it's up-to-date)

@jkbk2004
Copy link

jkbk2004 commented Feb 3, 2025

@bensonr @XiaqiongZhou-NOAA @laurenchilutti any chance to get this pr moving with the review?

@laurenchilutti
Copy link
Contributor

@bensonr @grantfirl Has a meeting been set up for this?

@grantfirl
Copy link
Author

@bensonr @grantfirl Has a meeting been set up for this?

I've asked @dkokron and @MatthewPyle-NOAA to coordinate this since I'm not really involved in this work other than to move development from the RRFS branches to the develop branches.

@bensonr
Copy link
Contributor

bensonr commented Feb 5, 2025

@dkokron reached out to me, I explained that I think this can all be done in a cleaner way. To date, no call has been scheduled.

@dkokron
Copy link

dkokron commented Feb 5, 2025 via email

@grantfirl
Copy link
Author

grantfirl commented Feb 5, 2025

Grant, What is the sticking point for this merge? I don't see why the merge cannot move forward with the proposed workaround of disabling the parallelIO feature unless the user is on WCOSS2 Dan

On Wed, Feb 5, 2025 at 9:46 AM Rusty Benson @.> wrote: @dkokron https://github.com/dkokron reached out to me, I explained that I think this can all be done in a cleaner way. To date, no call has been scheduled. — Reply to this email directly, view it on GitHub <#365 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACODV2FJT54AEJDWOPKEHRD2OIW5NAVCNFSM6AAAAABTCZY5GWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMMZXGI4TMMZRGA . You are receiving this because you were mentioned.Message ID: @.>

This PR can't be merged if it isn't approved. It's up to atmos_cubed_sphere code managers whether they want to merge this as-is and return afterwards to address any issues. @bensonr @XiaqiongZhou-NOAA @laurenchilutti have control over what gets merged into the dev/emc branch.

@bensonr
Copy link
Contributor

bensonr commented Feb 7, 2025

I'll approve this for now, but my recommendation is to start changing the culture of "just get it done" programming into one that accentuates extensibility, readability, and maintainability by creating the needed domain options during initialization in routine domain_deomp with proper inline documentation:

  • domain with the default io_layout
  • domain_for_read_all with io_layout=layout
  • domain_for_write_one with io_layout=(1,1)

This will eliminate most, if not all, of the special code in external_ic and fv_io required to create specialized domains. You could then go a step further and base your read and write actions on namelists and if-tests instead of compile time macros.

@lharris4
Copy link
Contributor

lharris4 commented Feb 7, 2025

I'll approve this for now, but my recommendation is to start changing the culture of "just get it done" programming into one that accentuates extensibility, readability, and maintainability by creating the needed domain options during initialization in routine domain_deomp with proper inline documentation:

I strongly agree with @bensonr 's recommendation. I have all-too-often been bitten by one-off codes and usually find that a generic code with proper branching and error handling makes everyone's efforts far easier, and often has benefits far beyond the task at hand.

@grantfirl
Copy link
Author

Thanks @lharris4 @bensonr @XiaqiongZhou-NOAA @laurenchilutti. I agree with all sentiments and would also prefer not to see temporary solutions merged. Apologies if undue pressure to approve was applied. The deadline applying the pressure is for the SRW App release, and, as far as I know, it would be acceptable to omit these changes if necessary to grant more time to work on this particular issue and "get it right". It is easy enough to decouple the rest of the changes in other repos associated with this PR and move forward with those if necessary. It's probably "easier" to merge as-is and address concerns afterwards, but, I'm not sure who has a mandate to do that if the original author of the changes is no longer funded to work on it.

All of this to say, you all should exercise control as you see fit. If you're still OK letting this through, that's fine, but, as you know, it is also completely acceptable to reject this as-is until it meets your standards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants