Quda work ndg force #612
Awesome, this is working great. Here's a comparison on Juwels Booster on 4 nodes, 64c128 at the physical point with consistent random numbers:
no force offloading (for reference)
light force offloading only
+ ND force offloading (first trajectory includes tuning)
The speed-up will be even greater on a machine like Leonardo or LUMI-G. I'll let Andrey know that he can run some first tests for the finite-temperature runs. I'll put you in CC @Marcogarofalo
super!
14500 -> 12500 -> 9000!
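To put those three numbers in relative terms, here is a minimal sketch. It assumes they are timings in a common unit and that they map, in order, onto the three configurations compared earlier (no offloading, light force only, light + ND force); the thread does not state either explicitly, and the `speedups` helper is purely illustrative.

```python
# Numbers quoted in the thread, in the order they were given.
# Assumption: same (unstated) time unit, one value per configuration.
timings = {
    "no force offloading": 14500,
    "light force offloading only": 12500,
    "+ ND force offloading": 9000,
}


def speedups(timings):
    """Return the speed-up of each entry relative to the first entry."""
    baseline = next(iter(timings.values()))
    return {label: baseline / t for label, t in timings.items()}


for label, factor in speedups(timings).items():
    print(f"{label}: {factor:.2f}x")
```

Under these assumptions, offloading only the light force gives roughly a 1.16x speed-up, and adding the ND force brings it to roughly 1.61x over the reference.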
@Marcogarofalo There's an issue with the timing on the QUDA side. It seems like the time spent in
What I mean is the following:
This doesn't affect anything on our side but it does mess with the QUDA profile.
Here is a comparison of the data before and after the last commit. The speedup cannot be seen in such a small test:
debug level 1, rel precision + no strict checks, b415eb6
debug level 1, rel precision + no strict checks, e29573f
debug level 4, rel precision + no strict checks, b415eb6
debug level 4, rel precision + no strict checks, e29573f
No description provided.