Two-stage compile has 0 file reported at the second stage #371
The number of files reported in 'stage 2 compiled ** files' is incorrect; I have a fix. Compilation in two phases does work for me, though (the mechanism works, it just prints the wrong number). Looking at the log file, I can't see a problem: the file is compiled twice (first only for interfaces, then fully) and is used correctly in the linking command. My feeling is that this crash is unrelated. Don't we have an issue with the CI where, if two jobs run at the same time, they overwrite each other's data?
There seem to be two additional problems with two-stage compilation (besides reporting the wrong number):
The first one needs some debugging (there seems to be error handling??); for the second one, my current idea is to redirect the module files of stage 2 to a different (temporary) directory, so that the mod files from stage 1 are not overwritten.
The error messages do indeed say that it can't read a module file ('Error in reading the compiled module file'), which would fit the above suspicion. It also looks as if the exception for this error is somehow lost. The return code from compilation is confirmed to be I've added the following logging:
And grepping for the filename, I see:
That's it. I see neither the "No error" nor the "CompileFortran compiling ... ERROR" message at all??? Debug prints do indeed confirm that tool.run raises the exception.
I tried to add error capturing to the
OK, I found the error; it is indeed a bug in fab. Why my debug logging messages did not show up ... no idea. My best guess is that Python logging apparently uses syslog, which has a limit of 2b messages (and I added quite a bit of logging: command line parameters, compiler output). Maybe it was just coincidence that the messages I added exceeded the buffer length (and then got chopped off).
Unfortunately, my solution for ifort doesn't work 😢 Parallel compilation for stage 2 with ifort still crashes now and again because it reads an incomplete .mod file. I added a scratch directory as the module output path in stage 2, and an explicit include path to the original stage 1 directory, e.g.:
So it adds Problem seems to be that ifort always searches in the Simple reproducer, where the directory
Removing the
I'll try to ask Intel. For now, the best solution is: in phase 2, let each compile process write to its own directory. That way, no compilation process will ever read anything else from the module path. This means a lot of directories, each with one file (using the source code filename as the unique directory name).
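A minimal sketch of the per-file directory idea, not fab's actual implementation. The function name `stage2_module_flags` and the directory layout are hypothetical, and it assumes the compiler accepts ifort-style `-module <dir>` (module output path) and `-I <dir>` (include path) options:

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def stage2_module_flags(source: Path, stage1_mod_dir: Path,
                        scratch_root: Path) -> list[str]:
    """Return flags giving `source` its own private module output directory.

    Hypothetical helper: stage 2 writes its .mod files into a directory that
    no other compile process uses, and reads the complete stage 1 .mod files
    through an explicit include path.
    """
    # One scratch sub-directory per source file, named after the file itself.
    # Assumption: source file names are unique within the build.
    private_dir = scratch_root / source.name
    private_dir.mkdir(parents=True, exist_ok=True)

    # Assumed ifort-style options: '-module' sets where .mod files are
    # written, '-I' adds the stage 1 directory to the module search path.
    return ["-module", str(private_dir), "-I", str(stage1_mod_dir)]


if __name__ == "__main__":
    scratch = Path(tempfile.mkdtemp(prefix="stage2_mods_"))
    print(stage2_module_flags(Path("src/foo_mod.f90"),
                              Path("build_output/stage1_mods"), scratch))
```

Because each process only ever writes into its own directory, a concurrently running compile can no longer pick up a half-written .mod file from the shared output path. The cost is one directory per source file, as noted above.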
When two-stage compilation is enabled, the build for LFRic gungho currently fails with the error
/scratch/hc46/hc46_gitlab/lfric_fab/gungho_model-mpif90-ifort/build_output/_prebuild/physics_mappings_alg_mod.cb6a16b9a.o
not found at link stage. The following messages are logged during the compile_fortran step:
Starting two-stage compile: mod files, multiple passes
...
Finalising two-stage compile: object files, single pass
...
stage 2 compiled 0 files
There are actually a number of compile commands executed between 'Finalising two-stage compile: object files, single pass' and 'stage 2 compiled 0 files', so I am wondering whether there is a bug in updating the build tree.
The build log is here: https://git.nci.org.au/bom/ngm/lfric/lfric_atm-fab/-/jobs/80826