CI with Julia v1.11 #3836
I installed Julia 1.11 on the Caltech cluster, but we haven't made a module yet (it should be coming today or so) |
There's only one manifest? |
I assume he's referring to this new feature of Julia 1.11: https://julialang.org/blog/2024/10/julia-1.11-highlights/#manifest_versioning |
The PR adds another Manifest specifically for v1.11 and keeps the older Manifest that works for v1.10. Or we can just have one Manifest (the one for v1.11) and drop the one for v1.10 |
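To make the two-Manifest idea concrete, here is a minimal sketch of the selection rule described in the linked blog post: Julia 1.11 and later prefer a `Manifest-v{major}.{minor}.toml` file when one exists, while older versions only see `Manifest.toml`. The helper `preferred_manifest` is hypothetical (it is not part of Pkg), and it ignores the `JuliaManifest` file name variants.

```julia
# Hypothetical helper, not part of Pkg: return the name of the manifest file
# that a given Julia version would be expected to pick up in `dir`.
function preferred_manifest(dir::AbstractString, version::VersionNumber=VERSION)
    versioned = "Manifest-v$(version.major).$(version.minor).toml"

    # Assumption: version-specific manifests are only recognized from Julia 1.11 on.
    supports_versioned = (version.major, version.minor) >= (1, 11)

    if supports_versioned && isfile(joinpath(dir, versioned))
        return versioned
    end

    # Fall back to the plain manifest, or `nothing` if there is none.
    return isfile(joinpath(dir, "Manifest.toml")) ? "Manifest.toml" : nothing
end
```

With both files present, calling the helper with `v"1.11.2"` and with `v"1.10.5"` should return different names, which is what would let CI on v1.11 and the distributed CI on v1.10 share one repository.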
The Manifest was deleted in #3783 |
oh great! I missed that! |
Deleting it seemed to help increase the likelihood that CI passed. Although, it did not fully solve the problem (and note a few other changes were also made on #3783). |
Noting that internal_tide.jl gives NaN with Julia v1.11 while all is OK with Julia v1.10; something with immersed boundaries....? I'm looking into it. |
I think it's a plotting issue. We are filling up the immersed boundaries with NaN and, apparently, we cannot plot NaNs anymore? The error says: ERROR: LoadError: On worker 2:
| Looking up a non-finite or NaN value in a colormap is undefined. |
I ran the script and the actual simulation NaN-ed. |
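One way to tell a plotting-only NaN (from the immersed-boundary mask) apart from a simulation that has actually blown up is to look for NaNs only in the cells that are not masked. The sketch below uses plain arrays; `simulation_has_nan`, `data`, and `masked` are hypothetical names and do not come from Oceananigans.

```julia
# Hypothetical check on plain arrays:
#   data   - a snapshot of some field
#   masked - `true` where a cell lies inside the immersed boundary
# Returns `true` only if a NaN appears in a cell that is NOT masked.
function simulation_has_nan(data::AbstractArray, masked::AbstractArray{Bool})
    size(data) == size(masked) ||
        throw(DimensionMismatch("data and masked must have the same size"))

    # Logical indexing keeps only the unmasked (fluid) cells.
    return any(isnan, data[.!masked])
end
```

If this returns `false` while the plot still fails, the NaNs are confined to the masked region and the problem is on the plotting side. If it returns `true`, the simulation itself produced NaNs.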
That means Oceananigans isn't compatible with Julia 1.11. Do any other tests catch the issue? We can use this opportunity to add more tests. |
I’m trying to make an MWE |
Good point! 👍🏼 Good to keep that in mind! But the differences we see in the internal_tide.jl example shouldn't be due to the random number generator. |
One possibility is that syntax changed for something in a subtle way, so the code still runs but some function is being called incorrectly. Not sure what that could be though |
Here is a list of changes: the change to We should check if there are changes only on CPU, or on both CPU and GPU. @ali-ramadhan I'm assuming your test was on CPU. |
Good find. Is there a way to redefine/import |
I don't think we use |
I can't reproduce this |
@glwagner Does the I can try to reproduce on a different machine to confirm. Debugger.jl probably won't get fixed soon so if I find some time this week I can step through with Julia 1.10 and just do some good old print debugging between the two versions. |
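For the version-to-version comparison, a small helper that reports the first time step at which two runs disagree can narrow down where to add print statements. This is only a sketch on plain arrays: `first_divergence` is a hypothetical name, and `run_a` and `run_b` are assumed to be collections of snapshots saved at the same time steps under Julia 1.10 and 1.11.

```julia
# Hypothetical helper: return the index of the first snapshot at which two
# runs differ by more than `atol`, or `nothing` if they agree throughout.
function first_divergence(run_a, run_b; atol=0.0)
    for (n, (a, b)) in enumerate(zip(run_a, run_b))
        diff = maximum(abs, a .- b)

        # `maximum` propagates NaN, so a NaN in either run also counts as divergence.
        if isnan(diff) || diff > atol
            return n
        end
    end
    return nothing
end
```

With `atol=0.0` this flags any difference that is not identically zero. A small positive tolerance would ignore round-off level differences.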
In the test I did, the difference was identically 0. However, I just noticed that I was not using the latest On 0.95.5, it looks like internal_tide.mp4
I have a Mac M1 Max and am using 1.11.2:
julia> versioninfo()
Julia Version 1.11.2
Commit 5e9a32e7af2 (2024-12-01 20:02 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: macOS (arm64-apple-darwin24.0.0)
CPU: 10 × Apple M1 Max
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
JULIA_NUM_PRECOMPILE_TASKS = 6
LD_LIBRARY_PATH = /Users/gregorywagner/Software/hdf5-1.14.2/lib:
JULIA_EDITOR = vim
I'll redo these tests on the current |
Is there a small-ish reproducer to look at? |
Also on my side, on my mac laptop, the internal_tide.mp4 |
That sounds good! Let's see if the docs built now! |
We need to make one and something relatively small should be possible. It looks like it's not an issue on Mac, so we have to try to reproduce the docs failure on the machine we use for CI. I can try in a few days. |
That's fixed by #4093. |
Ok, looks like the error has progressed. Here's the CPU enzyme error now: oceananigans_build_20621_cpu-enzyme-extension-tests.log the top is
|
I suggest we put a compat entry for Julia 1.10. Otherwise people use Julia v1.11 and run into problems! |
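As an illustration of what such a compat entry would do, the sketch below parses a tilde specifier and checks which Julia versions it admits. The specifier `"~1.10"` is an assumption about what the entry might look like, not something taken from the PR, and `Pkg.Types.semver_spec` is a Pkg utility rather than a stable public API.

```julia
import Pkg

# Assumed compat entry, as it might appear in Project.toml:
#   [compat]
#   julia = "~1.10"
# A tilde specifier "~1.10" admits versions in [1.10.0, 1.11.0).
spec = Pkg.Types.semver_spec("~1.10")

# `true` if the given Julia version satisfies the assumed compat entry.
julia_allowed(v::VersionNumber) = v in spec
```

Under this assumption, `julia_allowed(v"1.10.7")` holds while `julia_allowed(v"1.11.2")` does not, so the package would not resolve on v1.11 instead of failing later at run time.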
This PR switches the CI to use Julia v1.11.
It also adds a Manifest with a v1.11 ending so that there is still compatibility with previous versions.
Note that the distributed CI still does not have Julia v1.11 (right @Sbozzolo?), so Julia v1.10 is used there. This is possible because there are two Manifests.