Replies: 11 comments 25 replies
-
@iuryt which Linux architecture is your cluster running, out of curiosity?
-
I also just happened across this tool: https://github.com/johnnychen94/jill.py which might be useful to some people. A few notes about determining which binary to download: issuing the
whereas on a Power 9 system I obtain
(more on that later). For ARM, a StackOverflow post might help (and here's a similar post for x86). There are also options for glibc and musl libc; they say that most users should use glibc (more about musl libc here). I don't know what GPG means, so hopefully someone can chime in there. Finally, a screenshot for reference:
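The architecture check alluded to above is presumably `uname -m` (an assumption; the original command and its output were not captured). A minimal sketch of mapping its output to the corresponding Julia download label (the helper name and label strings are illustrative, modeled on the names used on the julialang.org downloads page):

```shell
# Hypothetical helper: map `uname -m` output to a Julia binary
# architecture label. `x86_64` is typical of Intel/AMD clusters,
# `ppc64le` of Power 9 systems such as Satori.
julia_arch_label() {
  case "$1" in
    x86_64)  echo "linux-x86_64" ;;
    aarch64) echo "linux-aarch64" ;;
    ppc64le) echo "linux-ppc64le" ;;
    *)       echo "unknown"; return 1 ;;
  esac
}

# Example: julia_arch_label "$(uname -m)"
julia_arch_label x86_64    # prints "linux-x86_64"
```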
-
I'm not sure... is the issue installing Julia, or installing Oceananigans? How would Docker help?
-
@glwagner GPG is just a signature for the file. It provides a check that the download isn't malicious software masquerading as an official release. @iuryt I think a challenge you have is that the system you are running on has RHEL 6.1, which is quite old. Docker/Singularity could help a bit, but the GPU piece is awkward: even with Docker/Singularity, the GPU piece needs up-to-date underlying OS drivers. Perhaps you could ping your sysadmin folks and find out if they have plans to upgrade? From https://access.redhat.com/support/policy/updates/errata it looks like RHEL 6 last received full support about six years ago.
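To make the signature point concrete: the actual GPG flow verifies a detached signature with `gpg --verify <file>.asc <file>`, which needs the publisher's key and a network download to demonstrate. A simpler integrity check in the same spirit, shown here with a locally created stand-in file (for a real Julia tarball one would compare against the checksum published alongside the release):

```shell
# Sketch of a download-integrity check using sha256sum.
# The file here is a local stand-in, not a real Julia tarball.
echo "pretend this is julia.tar.gz" > julia-demo.tar.gz

# Record the checksum (a release page would publish this for you) ...
sha256sum julia-demo.tar.gz > julia-demo.tar.gz.sha256

# ... and verify the file against it after "downloading".
sha256sum -c julia-demo.tar.gz.sha256 && echo "checksum OK"
```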
-
I could set up Julia 1.6.5 on the local UMassD cluster, but while importing Oceananigans I received this message:
What is weird is that I am currently using
-
Another tip is to use interactive Slurm sessions to be able to use Julia interactively (thus solving issues with compile time, especially if the source code changes and
For this I use the command
which I put an alias for in my
then typing
at the terminal requests an interactive session on one node with 4 GPUs (4 can be increased to the number of GPUs available per node on the given cluster). I can then use tmux to open 4 panes, each with its own environment variable
To change the time requested for the interactive job, change
Once the interactive job has been allocated, additional terminals may be opened on the node by typing
(at least, this works on the clusters I work with --- if a different
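The poster's actual alias was not captured; a hypothetical sketch of what such a request might look like, built as a function so the command string is easy to inspect (`--nodes`, `--gres`, and `--time` are standard Slurm flags; the helper name and defaults are made up):

```shell
# Hypothetical helper: build the salloc command for an interactive
# session on one node with N GPUs and a given walltime.
interactive_gpu_cmd() {
  local ngpus="${1:-4}" walltime="${2:-02:00:00}"
  echo "salloc --nodes=1 --gres=gpu:${ngpus} --time=${walltime}"
}

interactive_gpu_cmd 4 01:00:00
# One might then run:   eval "$(interactive_gpu_cmd 4 01:00:00)"
# and, in each tmux pane, pin one GPU per Julia session, e.g.:
#   export CUDA_VISIBLE_DEVICES=0   # 1, 2, 3 in the other panes
```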
-
I am facing similar challenges to @iuryt in trying to run on Satori or Stampede2 (CPU only). I have seen discussions about using both HPC systems here and there, but they all date back a while. Curious to hear if anyone has been successful at running Oceananigans on either recently!
-
@raphaelouillon and @iuryt this worked (for me):
I seem to have
-
Note: Julia does not ship ppc64le binaries (Satori) and possibly has limited KNL support (Stampede2, I think). For Satori, compiling from source (https://github.com/JuliaLang/julia/releases/download/v1.7.2/julia-1.7.2.tar.gz) seemed to work?
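For reference, the from-source route is roughly: download and unpack the tarball above, then run `make -j"$(nproc)"` in the source root (this can take a long time on Power 9). The build can optionally be customized through a `Make.user` file; a minimal sketch, where the install prefix is an assumption:

```make
# Make.user — optional build configuration, placed in the julia-1.7.2
# source directory before running `make`.

# Install location (an assumption; adjust to taste):
prefix=$(HOME)/opt/julia-1.7.2

# Use prebuilt dependency binaries where available:
USE_BINARYBUILDER=1
```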
-
@iuryt @glwagner @christophernhill, building Julia 1.7.2 from source did it for me (for some reason I had issues building 1.6.6). Thanks all!
-
@glwagner @christophernhill apologies if this is readily addressed in the documentation, but I couldn't find the info: when running on several GPUs, how does the memory usage per GPU change? Does it decrease more or less linearly with the number of GPUs? I am looking at running simulations with 10⁸ grid cells or more, and memory seems to be the limiting factor on GPU (at least on the V100 with 32 GB of memory). This also got me wondering whether anyone has tried running on the M1 Ultra architecture with 128 GB of unified memory (I saw that there was a recent discussion on this). Am I missing something, or would that give the M1 Ultra more memory than an Nvidia A100? I also imagine that it would be much slower, but if memory is the bottleneck, it is still potentially interesting to try.
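A crude back-of-envelope estimate for the scaling question, under loudly labeled assumptions: each grid cell stores some number of Float64 (8-byte) values across all fields (velocities, tracers, tendencies, etc. — the count of ~20 below is a guess, not an Oceananigans number), and a multi-GPU decomposition splits cells roughly evenly, so per-GPU memory decreases near-linearly, up to halo-exchange overhead:

```shell
# Rough per-GPU memory estimate in whole GiB (integer arithmetic).
# Assumptions: 8 bytes per value, nfields values per cell, an even
# split of cells across GPUs, and no halo/temporary overhead.
mem_per_gpu_gib() {
  local cells=$1 nfields=$2 ngpus=$3
  echo $(( cells * nfields * 8 / ngpus / 1024 / 1024 / 1024 ))
}

mem_per_gpu_gib 100000000 20 1   # prints 14 (whole GiB on a single GPU)
mem_per_gpu_gib 100000000 20 4   # prints 3  (whole GiB per GPU across four)
```

Under these assumptions a 10⁸-cell run is not far from a single 32 GB V100's capacity, and splitting across GPUs buys headroom roughly linearly; actual usage will be higher because of halos and temporaries.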
-
Hi,
I know that it might seem too general, but I still think it's worth having a discussion about the "best" ways to run this beautiful model on HPCs.
I have been struggling to do so for some of the following reasons:
If we want to make it easier for people to use, is Docker an interesting option for Oceananigans?
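For what a Docker route could look like, here is a minimal sketch, assuming the official `julia` Docker image and a CPU-only setup (this is not an official Oceananigans image; GPU use would additionally require the NVIDIA container toolkit and, as noted elsewhere in this thread, sufficiently recent host drivers):

```dockerfile
# Minimal sketch: CPU-only Julia with Oceananigans pre-installed,
# based on the official julia image.
FROM julia:1.7
RUN julia -e 'using Pkg; Pkg.add("Oceananigans"); Pkg.precompile()'
CMD ["julia"]
```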