Speed is much slower in 3.7.5 with icpx than in 3.6.5 with icpc for both PBE and EXX calculations
#5103
Comments
@xdzhu What are your ABACUS installation dependencies?
I compared the time cost of these two versions. The slowdown seems to arise from |
Both use Intel oneAPI 2023.1.0 and GCC 13.1.0. 3.6.5 with LibRI_0.1.0_loop3 |
I have noticed that in When I change the CXX and MPI_CXX to |
@xdzhu What is your hardware setup?
@QuantumMisaka The compute node has Intel(R) Xeon(R) Gold 6248 CPUs @ 2.50GHz (2*20C), 40 cores in total, and I run ABACUS with the following command:
could you:
|
If the result is OK and the only issue is performance, then according to this official guide, one possible reason, listed in the "Performance" section, is that "-O3" is no longer sufficient to enable advanced loop optimization and vectorization; "-xhost" might be necessary. Do we have any benchmark for this compiler flag? @caic99
The test case in the zip file takes a very long time... Do you have a smaller example that shows the same issue? @xdzhu
@jinzx10 I've tested it on a previous version of ABACUS, and it does not help much (-1% time), since the heavy lifting is done by the math libraries (here MKL and ELPA).
I have another concern about the compilers. "mpicxx" might be a wrapper of g++; the wrapper for icpx might be "mpiicpx". On my local PC (WSL2 Ubuntu 22.04) where intel compilers are installed via apt, mpicxx is clearly a wrapper of g++ as shown below:
while mpiicpx is clearly different from mpicxx:
I think it might be worth trying mpiicpx instead of mpicxx.
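For anyone who wants to check this on their own machine, here is a minimal Python sketch (not part of ABACUS) that asks an MPI compiler wrapper which underlying compiler it calls. It assumes the wrapper accepts `-show` (MPICH and Intel MPI) or `--showme` (Open MPI). The helper names `wrapper_backend` and `backend_compiler` are made up for illustration.

```python
import os
import shlex
import shutil
import subprocess


def wrapper_backend(wrapper):
    """Return the command line an MPI compiler wrapper would run, or None."""
    if shutil.which(wrapper) is None:
        return None  # wrapper is not on PATH
    # Assumption: MPICH / Intel MPI wrappers understand -show,
    # Open MPI wrappers understand --showme.
    for flag in ("-show", "--showme"):
        try:
            result = subprocess.run(
                [wrapper, flag], capture_output=True, text=True, timeout=30
            )
        except (OSError, subprocess.TimeoutExpired):
            continue
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout.strip()
    return None


def backend_compiler(show_output):
    """Extract the compiler name (first token) from the wrapper's output."""
    tokens = shlex.split(show_output or "")
    return os.path.basename(tokens[0]) if tokens else None


if __name__ == "__main__":
    for name in ("mpicxx", "mpiicpx"):
        line = wrapper_backend(name)
        print(f"{name}: {backend_compiler(line) or 'not found or not queryable'}")
```

If `mpicxx` reports `g++` while `mpiicpx` reports `icpx`, then setting MPI_CXX to `mpicxx` probably means the MPI code is not being compiled with the Intel compiler at all.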
Details
Recently, I performed an SOC + EXX calculation. You can check the INPUT and output files in
hse-3.6vs3.7-lowerspeed.zip
With version 3.6.5, the speed is OK: every PBE step costs 13 s and EXX costs 178 s, although the PBE steps between successive EXX steps are slower.
With version 3.7.5, the speed is very slow: every PBE step costs 43 s and EXX costs 270 s, i.e. about 3.3 times and 1.5 times the cost in the 3.6.5 version above, respectively.
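To put numbers on the slowdown, here is a small Python sketch that computes the ratio of the per-step times quoted above. The timings are the ones reported in this issue; the dictionary layout and the `slowdown` helper are just for illustration.

```python
# Per-step wall times in seconds, as reported in this issue.
timings = {
    "PBE": {"3.6.5": 13.0, "3.7.5": 43.0},
    "EXX": {"3.6.5": 178.0, "3.7.5": 270.0},
}


def slowdown(old_seconds, new_seconds):
    """Return how many times slower the new version is than the old one."""
    return new_seconds / old_seconds


for step, t in timings.items():
    ratio = slowdown(t["3.6.5"], t["3.7.5"])
    print(f"{step}: {ratio:.2f}x slower in 3.7.5")
# PBE: 3.31x slower in 3.7.5
# EXX: 1.52x slower in 3.7.5
```

The PBE steps are hit much harder than the EXX steps, which suggests the regression is not specific to the EXX part.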
Task list for Issue attackers (only for developers)