Refactoring tracers butzin second try #576

Merged
merged 17 commits into from
Apr 3, 2024

Conversation

JanStreffing
Collaborator

@JanStreffing JanStreffing commented Mar 14, 2024

Supersedes #572. Built via patch rather than rebase, which reduces the MR from +8,685 −150 to +700 −7 lines.

@JanStreffing JanStreffing added the help wanted Extra attention is needed label Mar 14, 2024
@JanStreffing JanStreffing added this to the FESOM 2.6 milestone Mar 14, 2024
@JanStreffing JanStreffing mentioned this pull request Mar 14, 2024
@mbutzin
Collaborator

mbutzin commented Mar 19, 2024

After my latest commits we now have a version that at least compiles successfully. Whether the results make sense is another question.

@JanStreffing
Collaborator Author

JanStreffing commented Mar 19, 2024

I guess you are compiling with Intel. The GNU-based test case still fails:


[ 99%] Building Fortran object src/CMakeFiles/fesom.dir/oce_setup_step.F90.o
cd /__w/fesom2/fesom2/build/src && /usr/bin/mpifort -DMETISRANDOMSEED=35243 -DMETIS_VERSION=5 -DPARMS -DPART_WEIGHTED -D__async_icebergs -I/usr/include -I/__w/fesom2/fesom2/lib/parms/src/../include -I/__w/fesom2/fesom2/src/async_threads_cpp -I/__w/fesom2/fesom2/build/src/async_threads_cpp  -g   -O2 -g -ffloat-store -finit-local-zero -finline-functions -fimplicit-none -fdefault-real-8 -ffree-line-length-none -c /__w/fesom2/fesom2/src/oce_setup_step.F90 -o CMakeFiles/fesom.dir/oce_setup_step.F90.o
/__w/fesom2/fesom2/src/oce_setup_step.F90:398:6:

  398 |       idlist((n_ic3d+1):(n_ic3d+3)) = (/14/)
      |      1
Error: Different shape for array assignment at (1) on dimension 1 (3 and 1)
/__w/fesom2/fesom2/src/oce_setup_step.F90:399:6:

  399 |       filelist((n_ic3d+1):(n_ic3d+3)) = (/'R14C.nc'/)
      |      1
Error: Different shape for array assignment at (1) on dimension 1 (3 and 1)
/__w/fesom2/fesom2/src/oce_setup_step.F90:400:6:

  400 |       varlist((n_ic3d+1):(n_ic3d+3))  = (/'R14C'/)
      |      1
Error: Different shape for array assignment at (1) on dimension 1 (3 and 1)
make[2]: *** [src/CMakeFiles/fesom.dir/build.make:1425: src/CMakeFiles/fesom.dir/oce_setup_step.F90.o] Error 1
make[2]: Leaving directory '/__w/fesom2/fesom2/build'
make[1]: *** [CMakeFiles/Makefile2:142: src/CMakeFiles/fesom.dir/all] Error 2
make[1]: Leaving directory '/__w/fesom2/fesom2/build'
make: *** [Makefile:133: all] Error 2
Error: Process completed with exit code 2.

@mbutzin
Collaborator

mbutzin commented Mar 19, 2024

Correct, intel@albedo.

@mbutzin
Collaborator

mbutzin commented Mar 19, 2024

(n_ic3d+1):(n_ic3d+3)
I think this should read
(n_ic3d+1):(n_ic3d+1)
and I have pushed a corresponding commit. What does GNU say?
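
For reference, a minimal standalone sketch of the shape rule that gfortran enforces here. The array names mirror the lines quoted in the build log, but the program name, array sizes, character length, and the value of n_ic3d are assumptions made only for this illustration; this is not the FESOM source.

```fortran
program shape_conformance_demo
  implicit none
  ! Hypothetical stand-ins for the arrays in oce_setup_step.F90.
  ! Sizes, character length, and n_ic3d are assumptions for this demo.
  integer :: n_ic3d
  integer :: idlist(5)
  character(len=10) :: filelist(5)

  n_ic3d   = 2
  idlist   = 0
  filelist = ''

  ! Rejected by gfortran ("Different shape for array assignment"):
  ! the section on the left has 3 elements, the constructor on the right has 1.
  ! idlist((n_ic3d+1):(n_ic3d+3)) = (/14/)

  ! Conforming: a 1-element section receives a 1-element array constructor.
  idlist((n_ic3d+1):(n_ic3d+1))   = (/14/)
  filelist((n_ic3d+1):(n_ic3d+1)) = (/'R14C.nc'/)

  print *, idlist
  print *, trim(filelist(n_ic3d+1))
end program shape_conformance_demo
```

In an array assignment both sides must have the same shape, unless the right-hand side is a scalar, in which case the scalar is assigned to every element of the section.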

@JanStreffing
Collaborator Author

Good progress! The test case now compiles, but there is still an issue at runtime. By the way, you can check the tests below this post; I have been copying and pasting the errors from there.

  --> FESOM STARTS TIME LOOP                                 
 Updating SSS restoring data for month            1
  --check opt_visc--> Mean Ratio Resol/Rrossby =    5.0000000000000000     

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0xf20170b5d21 in ???
#1  0xf20170b4ef5 in ???
#2  0xf2016d0d08f in ???
#0  0x148098b00d21 in ???
#1  0x148098affef5 in ???
#2  0x14809875808f in ???
#3  0x5abd48913847 in diff_tracers_ale_
	at /__w/fesom2/fesom2/src/associate_mesh_ass.h:33
#4  0x5abd48912800 in solve_tracers_ale_
	at /__w/fesom2/fesom2/src/oce_ale_tracer.F90:251
#3  0x5d6016874847 in diff_tracers_ale_
	at /__w/fesom2/fesom2/src/associate_mesh_ass.h:33
#4  0x5d6016873800 in solve_tracers_ale_
	at /__w/fesom2/fesom2/src/oce_ale_tracer.F90:251
#5  0x5abd488f0448 in oce_timestep_ale_
	at /__w/fesom2/fesom2/src/oce_ale.F90:3248
#5  0x5d6016851448 in oce_timestep_ale_
	at /__w/fesom2/fesom2/src/oce_ale.F90:3248
#6  0x5abd487dff18 in __fesom_module_MOD_fesom_runloop
	at /__w/fesom2/fesom2/src/fesom_module.F90:511
#7  0x5abd4878fe2f in MAIN__
	at /__w/fesom2/fesom2/src/fesom_main.F90:15
#8  0x5abd4878fe2f in main
	at /__w/fesom2/fesom2/src/fesom_main.F90:10
#6  0x5d6016740f18 in __fesom_module_MOD_fesom_runloop
	at /__w/fesom2/fesom2/src/fesom_module.F90:511
#7  0x5d60166f0e2f in MAIN__
	at /__w/fesom2/fesom2/src/fesom_main.F90:15
#8  0x5d60166f0e2f in main
	at /__w/fesom2/fesom2/src/fesom_main.F90:10
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node d14426014abc exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Is this a good enough hint to figure out what happened? See oce_ale_tracer.F90:251 in the backtrace.

@JanStreffing
Collaborator Author

Great, the checks have passed.

@JanStreffing
Collaborator Author

@patrickscholz Ready for review

@patrickscholz patrickscholz merged commit 35bcfe7 into refactoring Apr 3, 2024
4 checks passed
@JanStreffing JanStreffing deleted the refactoring_tracers_butzin_second_try branch August 15, 2024 12:17