Getting errors running tensor-cores example #23
Comments
Which version of nvcc are you using, and did you solve it?
Hey everyone, if you want to run this sample on Turing you will have to make sure that you are using the
I'm just leaving this here for future reference, hoping I'll save somebody a lot of head-scratching.
Device: RTX3090
Hi, I have the same issue. The results seem correct after reducing MATRIX_M, MATRIX_N, and MATRIX_K from 16384 to 1024. However, I did not get a speedup with 1024x1024 matrices: I guess we would just use cuBLAS or refer to the faster implementation here.
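To put the two problem sizes mentioned above side by side, here is a minimal sketch that computes the storage for one matrix and the operation count of one multiply. The helper names `matrix_bytes` and `gemm_flops` are hypothetical, and the element sizes (4 bytes for float, 2 bytes for half) are assumptions about how the sample stores its data. It does not explain the error on its own; it only shows how far apart the two configurations are.

```cpp
#include <cstdint>

// Bytes needed to store one rows x cols matrix with elements of elem_size bytes.
// (Hypothetical helper, not part of the sample.)
constexpr std::uint64_t matrix_bytes(std::uint64_t rows, std::uint64_t cols,
                                     std::uint64_t elem_size) {
    return rows * cols * elem_size;
}

// Floating-point operations in C = A * B with A (m x k) and B (k x n):
// one multiply and one add per inner-product term.
constexpr std::uint64_t gemm_flops(std::uint64_t m, std::uint64_t n,
                                   std::uint64_t k) {
    return 2 * m * n * k;
}

// With MATRIX_M = MATRIX_N = MATRIX_K = 16384, a single float matrix takes 1 GiB.
static_assert(matrix_bytes(16384, 16384, 4) == (1ULL << 30), "1 GiB per float matrix");
// With 1024, a single float matrix takes 4 MiB.
static_assert(matrix_bytes(1024, 1024, 4) == (4ULL << 20), "4 MiB per float matrix");
```

Going from 16384 to 1024 cuts the work per multiply by a factor of 4096 (16 cubed), so at the smaller size fixed costs such as kernel launch may make up a larger share of the measured time, which could be one reason the timings look similar.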
Running the example from the posts/tensor-cores folder as discussed at https://devblogs.nvidia.com/programming-tensor-cores-cuda-9/, it appears the numbers are not as close as expected. I am getting the following output
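When judging whether two result buffers are "close", a relative tolerance is usually more meaningful than an absolute one, because half-precision inputs lose more absolute accuracy as the values grow. The following is a minimal host-side sketch of such a check. It is not the comparison used in the sample; the function names and the tolerance values in the usage note are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// True when a and b agree within rel_tol relative to the larger magnitude,
// or within abs_tol for values near zero. NaN never compares as equal.
inline bool nearly_equal(float a, float b, float rel_tol, float abs_tol) {
    const float diff = std::fabs(a - b);
    const float scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(abs_tol, rel_tol * scale);
}

// Counts the positions where two result buffers disagree beyond the tolerances.
// Only the common prefix is compared if the sizes differ.
inline std::size_t count_mismatches(const std::vector<float>& x,
                                    const std::vector<float>& y,
                                    float rel_tol, float abs_tol) {
    const std::size_t n = std::min(x.size(), y.size());
    std::size_t bad = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (!nearly_equal(x[i], y[i], rel_tol, abs_tol)) {
            ++bad;
        }
    }
    return bad;
}
```

For example, after copying both results back to the host, `count_mismatches(c_wmma, c_cublas, 1e-3f, 1e-5f)` (with hypothetical buffer names) returns the number of elements that differ by more than about 0.1 percent.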