Low GPU utilization on 1080ti, 2080ti and TitanX #80
Comments
Hi, thanks for pointing that out. The code was written to support most of today's "old" GPUs, starting even with compute capability 1.3 onwards :P Feel free to modify the code, and I appreciate any pull request that makes GPU utilization more dynamic. Thanks! Best regards, Andreas
Do you know of any problems with that, and how did you derive those numbers?
The parameters above are basically hardware-dependent, and I used the occupancy tool to derive these values. Another parameter is sectorWidth, which basically defines the amount of shared memory used per thread block. So I guess a good starting point would be to increase the thread count and sectorWidth to see if performance/utilization increases.
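A rough sketch of that starting point, assuming one wants the runtime to suggest a block size instead of a hardcoded constant: the CUDA occupancy API can do this per device. The kernel and helper names below are illustrative placeholders, not gpuNUFFT's actual code.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Placeholder kernel standing in for a gpuNUFFT convolution kernel.
__global__ void convolutionKernel(float *data, int n)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    data[i] *= 2.0f;  // dummy per-sample work
}

void launchWithOccupancy(float *d_data, int n)
{
  int minGridSize = 0, blockSize = 0;
  // Ask the runtime for a block size that maximizes occupancy on the
  // current device; 0 dynamic shared memory and no block-size limit assumed.
  cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize,
                                     convolutionKernel, 0, 0);
  int gridSize = (n + blockSize - 1) / blockSize;
  convolutionKernel<<<gridSize, blockSize>>>(d_data, n);
  printf("launched with blockSize=%d gridSize=%d\n", blockSize, gridSize);
}
```

Kernels that rely on a fixed shared-memory layout (like the sectorWidth-dependent ones) would still need that layout adjusted to match the chosen block size.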
Perhaps we could add code based on CUDA compute capability, protected with #ifdef.
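A minimal sketch of that #ifdef idea, selecting per-architecture constants at compile time. Note that __CUDA_ARCH__ is only defined during device compilation, so this covers in-kernel tuning; host-side launch parameters would still need a runtime device query. THREAD_BLOCK_SIZE is an illustrative name, not an existing gpuNUFFT macro.

```cuda
#ifdef __CUDA_ARCH__
  #if __CUDA_ARCH__ >= 700        // Volta/Turing and newer (e.g. 2080 Ti)
    #define THREAD_BLOCK_SIZE 512
  #elif __CUDA_ARCH__ >= 600      // Pascal (e.g. 1080 Ti)
    #define THREAD_BLOCK_SIZE 256
  #else                           // older architectures
    #define THREAD_BLOCK_SIZE 160
  #endif
#endif
```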
Hello,
I have been using the library for one of my research projects, and I noticed that the GPU is not fully utilized. Reading the code, I noticed that there are some hardcoded values. For example:
gpuNUFFT/CUDA/src/gpu/std_gpuNUFFT_kernels.cu, line 134 (commit 7d5fc93)
gpuNUFFT/CUDA/inc/cuda_utils.hpp, line 155 (commit 4508792)
gpuNUFFT/CUDA/inc/cuda_utils.cuh, line 75 (commit 4508792)
Do you know if it is possible to change these values to increase the parallelism?
Or is there another way to do so? I'm happy to spend some time making these values parametric based on the architecture.
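One possible shape for such a parametric approach, as a hedged sketch (assumed, not existing gpuNUFFT code): query the device at runtime and derive a thread count from its limits instead of a fixed constant. The heuristic below is purely illustrative.

```cuda
#include <cuda_runtime.h>

int pickThreadsPerBlock()
{
  int device = 0;
  cudaGetDevice(&device);
  cudaDeviceProp prop;
  cudaGetDeviceProperties(&prop, device);

  // Simple heuristic: a quarter of the per-block thread limit,
  // rounded down to a multiple of the warp size.
  int threads = prop.maxThreadsPerBlock / 4;
  threads -= threads % prop.warpSize;
  return threads > 0 ? threads : prop.warpSize;
}
```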
Another possible strategy, if these values cannot be changed, would be "CUDA Dynamic Parallelism" (https://devblogs.nvidia.com/cuda-dynamic-parallelism-api-principles/), along the lines of the sketch below.
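A very rough sketch of that idea (my assumption of how it could apply here, not gpuNUFFT code): a parent kernel launches a child grid sized to the work actually found per sector. This requires compute capability 3.5 or higher and compilation with -rdc=true, linking against cudadevrt.

```cuda
// Child grid: processes one sector's samples.
__global__ void childKernel(float *data, int offset, int count)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < count)
    data[offset + i] *= 2.0f;  // placeholder per-sample work
}

// Parent grid: one thread per sector, each launching a child grid
// sized to that sector's workload.
__global__ void parentKernel(float *data, const int *sectorOffsets,
                             const int *sectorCounts, int numSectors)
{
  int s = blockIdx.x * blockDim.x + threadIdx.x;
  if (s < numSectors)
  {
    int count = sectorCounts[s];
    int threads = 128;
    int blocks = (count + threads - 1) / threads;
    if (count > 0)
      childKernel<<<blocks, threads>>>(data, sectorOffsets[s], count);
  }
}
```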
Thanks.
Marco