Memory consumption #262
I think there might be some issue with the parallel scheduler that leaves multiple workers alive on the node. I'll try to have a look.
I've made a small patch in the new version that might resolve that issue, if you want to give it a try.
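For anyone hitting the same symptom, a minimal way to check for leftover workers from the Julia side, using only the standard Distributed API (this is a diagnostic sketch, not the patch mentioned above):

```julia
using Distributed

# Show which worker processes are currently attached; leftover workers
# from a previous run would appear here alongside the fresh ones.
@show workers()

# Detach all workers so their memory is released; [1] is the master
# process and must never be removed.
rmprocs(setdiff(workers(), [1]))
```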
I will, thank you very much!
I've checked, and now I don't see any excessive memory consumption (leak). Thank you, the fix works!
@mloubout need to reopen :) I now have 11 identical workers with 128 GB RAM each. Do you think there is something that could be optimized (freed) during JUDI computations? The data I use for FWI has the same regular geometry for every shot, so there should not be any shots with larger source-receiver offsets. By the way, I use the following options:
JUDI: v3.4.5
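As a side note for readers with a similarly regular, limited-aperture geometry: a few fields of JUDI's Options struct are aimed at reducing peak memory per shot. The values below are purely illustrative; check `?Options` on your JUDI version to confirm the fields:

```julia
using JUDI

opt = Options(
    limit_m = true,          # crop the model to each shot's source/receiver aperture
    buffer_size = 500f0,     # extra aperture (in meters) kept around the geometry
    subsampling_factor = 4,  # subsample the stored forward wavefield in time
)
```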
That's quite a high dt_comp; this will only produce NaNs.
This is just a warning; it would crash if it was actually allocating too much.
Yes, the … Finally it crashed :)
It doesn't matter: if you set …, JUDI will ignore …
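For context, a short sketch of the two ways to handle the computational time step, assuming the standard dt_comp field of Options (the explicit value below is purely illustrative):

```julia
using JUDI

# Leaving dt_comp unset lets the propagator pick a CFL-stable time step:
opt = Options()

# If you set it explicitly, keep it at or below the stable step;
# a too-large value produces NaNs (per the discussion above):
opt = Options(dt_comp = 0.4f0)  # ms
```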
Good to know, thank you!
Hi,
Recently I've done some computation on a single Ubuntu node with 64 GB RAM and it finished successfully.
Then I tried to do the same computation on a small cluster (5 CentOS 7 nodes) with 128 GB RAM each, and at some point I noticed a warning like
not enough memory, starting swapping
with the RAM about 110 GB full. After some time I always get an error that the connection was lost or something, so I can't perform even a single iteration of FWI. That means a single Ubuntu node with 64 GB RAM was enough, without swapping, while on the small CentOS 7 cluster 128 GB is not enough.
Any ideas about the possible reasons?
Julia's cluster manager is SSH based.
Julia 1.9.3
JUDI v3.3.10
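One knob worth trying on Julia 1.9: pass --heap-size-hint to each worker so the GC collects before a node starts swapping. A sketch using the standard Distributed SSH cluster manager; the hostnames and the 100G hint are placeholders to adapt to your cluster:

```julia
using Distributed

# One worker per node via the SSH cluster manager. --heap-size-hint
# (available since Julia 1.9) asks each worker's GC to collect before
# the heap grows past the given size, which can keep a node out of swap.
addprocs(
    [("node1", 1), ("node2", 1), ("node3", 1), ("node4", 1), ("node5", 1)];
    exeflags = "--heap-size-hint=100G",
)
```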