Thanks for the report. I can reproduce this, but have no idea what causes it.
It works on the CPU with threads=false (to force the KernelAbstractions path) and verbose=true (to confirm that this path is the one being run):
julia> N=5; L=3; gpr(N,L)
┌ Info: left index ranges
│ nl = Base.OneTo(7)
│ nl1 = Base.OneTo(7)
│ l = 1:2
│ xl = Base.OneTo(2)
└ xl1 = Base.OneTo(2)
┌ Info: reduction index ranges
│ ni = Base.OneTo(7)
│ i = Base.OneTo(3)
│ xi = Base.OneTo(2)
│ nj = Base.OneTo(7)
│ j = Base.OneTo(3)
└ xj = Base.OneTo(2)
[ Info: running KernelAbstractions CPU actor
7×7×2×2×2 Array{Float32, 5}:
[:, :, 1, 1, 1] =
19.8817 22.2586 23.2881 20.2121 19.9547 22.5193 20.0603
...
On the GPU, it still seems to hang if I comment out * (i <= l) * (j > l) * (j > i + 1).
I wonder whether this is simply too many loops for KA to handle, or whether it hits some optimisation step whose cost grows very quickly (e.g. factorially) with the number of loops. 11 nested loops is quite deep, and it may be that nobody has tested that many. If so, the next step is probably to run it with verbose=2, which will print out the kernel being used, from which we can try to reproduce this without Tullio.
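As a rough sketch of that next step: the snippet below uses a small matrix product as a hypothetical stand-in for gpr (whose definition is not shown here), and assumes Tullio and KernelAbstractions are both installed. It should print the index ranges and the generated kernel code, which could then be copied out and run without Tullio.

```julia
# Load KernelAbstractions before the @tullio call so that a KA kernel is generated.
using Tullio, KernelAbstractions

# Small stand-in arrays; the real gpr(N, L) uses 11 nested loops.
A = rand(Float32, 4, 3)
B = rand(Float32, 3, 5)

# threads=false selects the KernelAbstractions path on the CPU,
# verbose=2 prints the index ranges and the generated code.
@tullio threads=false verbose=2 C[i, k] := A[i, j] * B[j, k]

# Sanity check against the built-in matrix product.
@assert C ≈ A * B
```

Moving A and B to the GPU with CUDA.jl (and loading CUDA before the macro call) should exercise the GPU kernel instead, although for an example this small it is unlikely to show the hang.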
On Julia 1.7.2, in a new environment containing only the included packages (see below), the call never returns, and GPU usage stays at 100%.
Pkg status
CuDevice(0): TITAN RTX
CUDA 11.0.0
Thanks a lot!