I'm encountering validation failures in BabelStream's OpenACC version on the main branch that depend on the number of iterations. Specifically, validation fails whenever the number of iterations is less than 723:
$ acc-stream -n 722
BabelStream
Version: 4.0
Implementation: OpenACC
Running kernels 722 times
Precision: double
Array size: 268.4 MB (=0.3 GB)
Total size: 805.3 MB (=0.8 GB)
Validation failed on c[]. Average error 2.3104e-14
Function    MBytes/sec  Min (sec)   Max         Average
Copy        797592.848  0.00067     0.00069     0.00068
Mul         792595.514  0.00068     0.00068     0.00068
Add         831047.225  0.00097     0.00098     0.00097
Triad       831176.744  0.00097     0.00098     0.00097
Dot         719506.962  0.00075     0.00077     0.00075
compared to
$ acc-stream -n 723
BabelStream
Version: 4.0
Implementation: OpenACC
Running kernels 723 times
Precision: double
Array size: 268.4 MB (=0.3 GB)
Total size: 805.3 MB (=0.8 GB)
Function    MBytes/sec  Min (sec)   Max         Average
Copy        796974.794  0.00067     0.00069     0.00067
Mul         791823.981  0.00068     0.00068     0.00068
Add         830542.399  0.00097     0.00098     0.00097
Triad       830553.534  0.00097     0.00098     0.00097
Dot         719081.005  0.00075     0.00077     0.00075
The reported average error grows as the number of iterations decreases. This exact behavior appears in all of the following test environments:
OLCF Summit system, compiled with NVHPC 21.3 to target NVIDIA V100 GPUs
OLCF Summit system, compiled with GCC 12.1.0 to target NVIDIA V100 GPUs
NERSC Perlmutter system, compiled with NVHPC 22.7 to target NVIDIA A100 GPUs
NERSC Perlmutter system, compiled with GCC 11.2.0 to target NVIDIA A100 GPUs
Personal laptop, compiled with NVHPC 23.5 to target an NVIDIA GeForce RTX 3060 Mobile GPU
Personal laptop, compiled with GCC 12.1.0 to target an NVIDIA GeForce RTX 3060 Mobile GPU
Tom suggested some possible causes: synchronisations being skipped somewhere (probably around the memory transfers), some bad type punning, or something funny happening with the pointer captures (they're pulled out to local variables because all OpenACC compilers failed to work otherwise).
One more thought: the wording of the wait clause is pretty weird in OpenACC 2.6, so I wonder if this line is missing the wait clause as we copy back to the host.
Does adding the clause fix anything?
Note: if it does, that would be strange, because all the other kernels already have the wait clause, so I would expect every kernel to have finished before the copy back starts...