[INSTALL]: PIO 2.6.2 in 1.6.0 on Gaea-C5 #1395

Open
DeniseWorthen opened this issue Dec 5, 2024 · 10 comments

@DeniseWorthen

Package name

parallelio

Package version/tag

2.6.2

Build options

current

Installation timeframe

Please install in 1.6 on Gaea C5 for testing (a bug fix was made after the current 2.5.10).

Other information

No response

@climbfuji added the NOAA-EMC and OAR-EPIC (NOAA Oceanic and Atmospheric Research and Earth Prediction Innovation Center) labels Dec 5, 2024
@RatkoVasic-NOAA
Collaborator

@DeniseWorthen
It is installed on Gaea C5. You can use:
/ncrc/proj/epic/spack-stack/spack-stack-1.6.0/envs/pio-2.6.2-addon/install/modulefiles/Core
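
For anyone following along, here is a minimal sketch of pointing a shell session at that add-on environment. The helper name `add_modulepath` is hypothetical; in an Lmod session `module use <dir>` is the usual way to do this, and the explicit check is only there to surface a permissions or filesystem problem early.

```shell
# Hypothetical helper: prepend a modulefiles directory to MODULEPATH,
# but only if the directory exists and is readable from this node.
add_modulepath() {
  if [ -d "$1" ] && [ -r "$1" ]; then
    MODULEPATH="$1${MODULEPATH:+:$MODULEPATH}"
    export MODULEPATH
    return 0
  fi
  echo "modulefiles directory not readable: $1" >&2
  return 1
}

# Usage on Gaea C5 (path taken from the comment above):
# add_modulepath /ncrc/proj/epic/spack-stack/spack-stack-1.6.0/envs/pio-2.6.2-addon/install/modulefiles/Core
# module avail parallelio
```

If the helper returns non-zero, the directory is not visible from the node where it ran, which is worth ruling out before looking at the build itself.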

@DeniseWorthen
Author

Thanks @RatkoVasic-NOAA. Let me try it.

@DeniseWorthen
Author

@RatkoVasic-NOAA Is there anything else I need to do? I'm getting an error:

/gpfs/f5/nggps_emc/scratch/Denise.Worthen/RT_RUNDIRS/Denise.Worthen/FV3_RT/rt_3691697/cpld_bmark_p8_intel/./fv3.exe: error while loading shared libraries: libmpifort_intel.so.12: cannot open shared object file: No such file or directory

These are the changes in my modulefiles:

diff --git a/modulefiles/ufs_common.lua b/modulefiles/ufs_common.lua
index 062fa384..7280bda3 100644
--- a/modulefiles/ufs_common.lua
+++ b/modulefiles/ufs_common.lua
@@ -9,7 +9,7 @@ local ufs_modules = {
   {["hdf5"]            = "1.14.0"},
   {["netcdf-c"]        = "4.9.2"},
   {["netcdf-fortran"]  = "4.6.1"},
-  {["parallelio"]      = "2.5.10"},
+  {["parallelio"]      = "2.6.2"},
   {["esmf"]            = "8.6.0"},
   {["fms"]             = "2024.01"},
   {["bacio"]           = "2.4.1"},
diff --git a/modulefiles/ufs_gaea.intel.lua b/modulefiles/ufs_gaea.intel.lua
index 834c8fc4..888493ed 100644
--- a/modulefiles/ufs_gaea.intel.lua
+++ b/modulefiles/ufs_gaea.intel.lua
@@ -5,7 +5,7 @@ help([[

 whatis([===[Loads libraries needed for building the UFS Weather Model on Gaea ]===])

-prepend_path("MODULEPATH", "/ncrc/proj/epic/spack-stack/spack-stack-1.6.0/envs/fms-2024.01/install/modulefiles/Core")
+prepend_path("MODULEPATH", "/ncrc/proj/epic/spack-stack/spack-stack-1.6.0/envs/pio-2.6.2-addon/install/modulefiles/Core")

 stack_intel_ver=os.getenv("stack_intel_ver") or "2023.2.0"
 load(pathJoin("stack-intel", stack_intel_ver))

@RatkoVasic-NOAA
Collaborator

@DeniseWorthen I can try to replicate. Which branch are you using? Current develop? The test is cpld_bmark_p8_intel, right?

@RatkoVasic-NOAA
Collaborator

@DeniseWorthen can you open up permissions on /gpfs/f5/nggps_emc/scratch/Denise.Worthen/?

chmod 755 /gpfs/f5/nggps_emc/scratch/Denise.Worthen/

@DeniseWorthen
Author

@RatkoVasic-NOAA Yes, I used the top of develop. I ran a subset of tests. Some ran and passed; others failed with the error.

These passed:

rt_cpld_control_p8_mixedmode_intel.log:72:Test cpld_control_p8_mixedmode_intel PASS
rt_cpld_control_sfs_intel.log:18:Test cpld_control_sfs_intel PASS
rt_cpld_debug_gfsv17_intel.log:59:Test cpld_debug_gfsv17_intel PASS
rt_cpld_debug_noaero_p8_intel.log:59:Test cpld_debug_noaero_p8_intel PASS
rt_cpld_debug_p8_intel.log:60:Test cpld_debug_p8_intel PASS

I think all the rest failed with the error. See, for example, cpld_control_p8_intel.
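
A quick way to separate the two groups, sketched under the assumption that a passing test leaves a `Test <name> PASS` line in its `rt_*.log` file (as in the grep output above) and a failing one does not. The function name `rt_failed_logs` and the log directory argument are made up for illustration.

```shell
# Hypothetical helper: list rt_*.log files in a directory that contain
# no "Test <name> PASS" line, i.e. candidates for the failing group.
rt_failed_logs() {
  # grep -L prints the names of files that have no matching line.
  grep -L -E '^Test .* PASS' "$1"/rt_*.log
}

# Usage (the directory holding the regression test logs is an assumption):
# rt_failed_logs /path/to/rt/logs
```

The exit status of `grep -L` differs between grep versions, so the sketch is meant to be read for its output only.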

@DeniseWorthen
Author

I just noticed this in the err file for cpld_control_p8:

environment: line 17: /opt/cray/pe/lmod/lmod/libexec/lmod: Input/output error
/bin/bash: /opt/cray/pe/lmod/lmod/init/bash: Input/output error
++ date +%s
+ echo -n ' 1733501049,'
+ set +x
environment: line 17: /opt/cray/pe/lmod/lmod/libexec/lmod: Input/output error
environment: line 17: /opt/cray/pe/lmod/lmod/libexec/lmod: Input/output error
environment: line 17: /opt/cray/pe/lmod/lmod/libexec/lmod: Input/output error
environment: line 17: /opt/cray/pe/lmod/lmod/libexec/lmod: Input/output error
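
One way to test whether those Lmod files can actually be read from a given node is sketched below. The helper name `unreadable_files` is hypothetical, and nothing here says why the read fails; it only reports which paths fail.

```shell
# Hypothetical helper: report which of the given files cannot be read
# from the current node. Reading one byte forces real I/O, which a
# plain existence test may not.
unreadable_files() {
  for f in "$@"; do
    if ! head -c 1 "$f" >/dev/null 2>&1; then
      echo "$f"
    fi
  done
}

# Usage with the two paths from the error output above:
# unreadable_files /opt/cray/pe/lmod/lmod/libexec/lmod /opt/cray/pe/lmod/lmod/init/bash
```

Running it inside the batch job, rather than on a login node, is the point: the error output above comes from the job environment.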

@RatkoVasic-NOAA
Collaborator

Yes, something is weird. The libmpifort error claiming the library does not exist is strange: fv3.exe is linked against an existing library, /opt/cray/pe/mpich/8.1.28/ofi/intel/2022.1/lib/libmpifort_intel.so.12.0.0.
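
To see what the dynamic loader resolves at run time, as opposed to what the executable was linked against, something like the following could be run inside the job. It is a sketch only: `unresolved_libs` is a hypothetical name, and it assumes the usual `ldd` output format where a missing library is printed as `name => not found`.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names of
# libraries the dynamic loader could not resolve.
unresolved_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage from inside a batch job, after the same modules are loaded:
# ldd ./fv3.exe | unresolved_libs
```

An empty result on a login node but not on a compute node would suggest the difference lies in the node environment rather than in the build.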

@RatkoVasic-NOAA
Collaborator

I see, it failed in:

      CALL ESMF_Initialize(configFileName="ufs.configure"               & !<-- top level configuration
                          ,defaultCalKind =ESMF_CALKIND_GREGORIAN       & !<-- Set up the default calendar.
                          ,VM             =VM                           & !<-- The ESMF Virtual Machine
                          ,rc             =RC)

Is that the threading error again?
1082: libmpi_intel.so.1 00007F2344CF9304 PMPI_Init_thread Unknown Unknown

@DeniseWorthen
Author

I switched the ESMF-managed threading off in cpld_control_p8, and I still get the same error about the shared object and the lmod Input/output error on "line 17" (no clue where or what that is about).

The ESMF_Initialize call is just reading the configuration. I think it just happens to die at that line, right when it is trying to start up.
