Try to debug DGO issue with Int64 #292
amontoison
commented
Jun 7, 2024
@nimgould I was able to connect to the virtual machine and reproduce the error with You just need to uncomment the following lines here and after that you can click on the yellow build of |
OK, I am on ... but now what? How do I test an individual package? Am I supposed to use a meson command? I don't know what it/they are ... I'll need to keep editing files, recompiling them and then running the dgo test. Sorry, I need help to proceed. Oh, and I see that the shell has no emacs, so I'll be pretty helpless.
Sorry, just read the README, now I see how to do this. Still no usable editor, though. And the issue on macOS is to do with ssids, not dgo. Indeed, none of the failures are now for dgo, sheesh, this action system is so maddening! I've now re-commented the ssh workflow out.
I am trying to see what is going wrong with nvfortran. I tried this locally:
CC=nvc CXX=nvc++ FC=nvfortran meson setup builddir/pc64.lnx.nvf_64 -Dc_std=none -Dcpp_std=none -Dgalahad_int64=true ...
which is OK until
[519/1348] Compiling Fortran object li...on-generated_single_cutest_dummy.f90.o
On examining the generated libgalahad_single_64.so.p/single_cutest_dummy.f90 file, I see on line 644 and onwards that it has inserted the cutest_routines.h header file verbatim, i.e.,
/*
/*
Poor old Fortran can make no sense of this, and it doesn't happen with other compilers (they leave the cpp header files alone). Any ideas?
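One quick way to confirm this kind of leak is to scan the generated Fortran file for lines that open a C block comment. The following is only a minimal Python sketch: the helper name `find_c_comment_lines` is hypothetical (it is not part of GALAHAD or meson), and it assumes that any line starting with `/*` in a `.f90` file is pasted C header text.

```python
from pathlib import Path


def find_c_comment_lines(path):
    """Return (line_number, text) pairs for lines that open a C block comment.

    Hypothetical diagnostic helper: a line beginning with "/*" is not valid
    Fortran, so any hit suggests a C header was pasted in verbatim.
    """
    hits = []
    text = Path(path).read_text(errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith("/*"):
            hits.append((number, line.rstrip()))
    return hits
```

Calling it on the generated `single_cutest_dummy.f90` from the build directory would list the line numbers where header text begins, which can then be compared with the line 644 reported above.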
I commented out the NVIDIA tests, as the compiler clearly has issues and isn't ready for proper deployment; it was unable to resolve generic interfaces in many places (and all the other compilers had no issues).
I think the best solution is to add multiple |
@nimgould I wonder if the issue is not just with the |
That is possible, I suppose, but then why doesn't the compiler object that the variable is the wrong type for the write function? Moreover, this would be true for all write statements (in both HSL and GALAHAD), and we don't see warnings from any other runs. I will output the variables before the write to check.
Ah ha, bug splatted. It was simply that in the C interface, I had commented out the copy of the hash control components from C to Fortran, so they took random values! Of the two remaining failures, both are timeouts. The Windows one looks like it needs a bit more time, not sure about the Mac one, though. I cannot reproduce it here, as the same Mac test seems to work.
OK, doubling the timeout cured the Windows issue. Unfortunately, now one of the Ubuntu Intel ones is failing (odd that it didn't before, and all that has changed is the timeout!) when testing the Julia. I can see why that might be, and can put in a precaution. The other timeout failure, on the Mac, produces no output from the test (for sbls), so I can't say what is happening.
"Precaution" works, but now another timeout for the Windows 64-bit. Will this cycle of inconsistent runtimes ever cease ... I'll double the timeout and try again ...
I give up ... the more I increase the timeout period, the more runs time out.
Is there something wrong with these Windows virtual machines? Timeout for nlst_single after 120 seconds, while for the Mac and Ubuntu the run is 0.4 seconds.
And now, not changing a thing, the times dropped to 1 second, and the tests passed. So, only the Mac issue to sort out.
If we have a timeout, it means that we have an infinite recursion during the test. |
No, this is all deterministic. Times vary considerably during both compilation and runs.
Sometimes it times out, other times it doesn't, with a factor of 10 between the different times.
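To put numbers on this kind of run-to-run variation, one option is to repeat the same test command several times and compare wall-clock times. The following is only a minimal Python sketch under stated assumptions: the helper names `time_runs` and `spread` are hypothetical, the 120-second default simply mirrors the timeout mentioned above, and the command to run is whatever test executable is under suspicion.

```python
import subprocess
import time


def time_runs(command, repeats=5, timeout=120.0):
    """Run `command` `repeats` times and return the wall-clock seconds of each run.

    A run that exceeds `timeout` seconds is recorded as None.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            subprocess.run(
                command,
                check=False,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            timings.append(time.perf_counter() - start)
        except subprocess.TimeoutExpired:
            timings.append(None)
    return timings


def spread(timings):
    """Return slowest/fastest ratio over the runs that finished, or None if none did."""
    finished = [t for t in timings if t is not None]
    if not finished:
        return None
    fastest = min(finished)
    if fastest == 0.0:
        return float("inf")
    return max(finished) / fastest
```

A ratio near 1 with an occasional None would point to a hang in the test itself, whereas a large ratio with every run finishing would point to the machine rather than the code.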