some qibojit benchmarks on NVIDIA Grace-Hopper (WIP) #165
Comments
Hi @migueldiascosta, thank you so much for these benchmarks; this architecture looks interesting. Do you also have numbers/plots for the A100? (cc @andrea-pasquale)
Added A100 data (run at NSCC) to https://gist.github.com/migueldiascosta/0a0dbe061982bc4cc2bc7171785a4b86
Thanks a lot, these are quite interesting performance results for GH200.
@migueldiascosta @scarrazza I know this is not directly related here, but I think it could be interesting to run benchmarks on the Clifford simulator that @BrunoLiegiBastonLiegi is integrating with
I think this is a good idea; that way we could have some numbers for the A6000, A100 and GH200.
Will look into that. By the way, are those A6000 benchmarks with
I don't know what those names mean, which I guess means it's with neither
See e.g. qiboteam/qibojit-benchmarks#45. The current GH200 data in the gist was obtained with (because I had seen the latter spend most of its time on the single-CPU-thread conversion of the final state vector to a numpy array, which was not what I was interested in benchmarking...)
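To make the point above concrete, here is a minimal sketch of how one might time the simulation and the device-to-host conversion separately, so the numpy-conversion cost does not get folded into the simulated-circuit timing. This is not the benchmark scripts' actual code: `simulate` is a hypothetical stand-in that returns a numpy array, whereas a real GPU backend would return a device array and the conversion line would be e.g. `cupy.asnumpy(state)`.

```python
import time
import numpy as np

def simulate(nqubits):
    # Hypothetical stand-in for a GPU state-vector simulation; a real
    # backend would return a device array (e.g. a cupy.ndarray).
    return np.zeros(2**nqubits, dtype=np.complex128)

t0 = time.perf_counter()
state = simulate(20)
sim_time = time.perf_counter() - t0

# Time the conversion to a numpy array separately; with a cupy-based
# backend this line would be `host_state = cupy.asnumpy(state)`.
t0 = time.perf_counter()
host_state = np.asarray(state)
transfer_time = time.perf_counter() - t0

print(f"simulation: {sim_time:.4f} s, conversion: {transfer_time:.4f} s")
```

Reporting the two numbers separately makes it clear when a slow single-threaded transfer, rather than the simulation itself, dominates the total.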
I believe the numbers quoted there have been obtained with
Indeed, all the numbers in the qibojit paper were obtained with it. Therefore @migueldiascosta is right, if
Indeed, but maybe there are other differences between
I also noticed that, and I am not sure how to explain it. One thing that could have changed, other than the scripts, is the library versions: it has been two years since publication, so qibo, qibojit and probably their dependencies may have changed in that time (unless you are using the older versions). Given that we still have access to most of the hardware we ran the benchmarks on, we could rerun them on our side using the same versions and script you used. That would give a much more accurate comparison.
Yes, there could also be differences there, but now I'm thinking the ~1 s constant time in your data is simply the import time, which is added to the "total_simulation_time" in
Actually, that's mentioned in the paper: "Furthermore, a constant of about one second is required to import the library, which can be relevant (comparable to or larger than the execution time) for simulation of small circuits. This is unlikely to impede practical usage as it is only a small constant overhead that is independent of the total simulation load."
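One way to check whether the ~1 s constant is import time is to measure the import in isolation and subtract it from the reported totals. A minimal sketch (numpy stands in here for a heavy library such as qibo; timing a real `import qibo` works the same way, but only on the first import in a process, since subsequent imports hit `sys.modules`):

```python
import time

# Measure how long the first import of a library takes; this constant
# would otherwise be folded into a "total_simulation_time" figure.
t0 = time.perf_counter()
import numpy as np  # stand-in for a heavy import like `import qibo`
import_overhead = time.perf_counter() - t0

print(f"import overhead: {import_overhead:.3f} s")
```

For small circuits this constant can rival the circuit execution time itself, which would explain a roughly fixed ~1 s offset that vanishes in relative terms as the qubit count grows.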
Data and some plots at https://gist.github.com/migueldiascosta/0a0dbe061982bc4cc2bc7171785a4b86, as requested by @scarrazza.