What steps will reproduce the problem?
1. Use a large image, e.g. 512 x 512.
2. Use many filters (e.g. 64).
3. Use many color channels (e.g. 64 as well).
What is the expected output? What do you see instead?
I expect a big filtered image, but instead it crashes.
The grid is defined such that blocks.y exceeds 65535 (2^16 - 1), the CUDA limit
on that grid dimension, so CUDA refuses to launch the kernel.
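To illustrate the limit, here is a minimal host-side sketch, not the library's actual code. It splits an oversized y-block count into launches that each fit within gridDim.y <= 65535. The names splitGridY and GridChunk are hypothetical, and the sketch assumes the kernel could be changed to accept a y-block offset, which the current kernel may not support.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// CUDA caps gridDim.y at 65535 blocks.
constexpr std::int64_t kMaxGridY = 65535;

// Hypothetical descriptor for one launch: it covers logical y-blocks
// [offset, offset + count).
struct GridChunk {
    std::int64_t offset;
    std::int64_t count;
};

// Hypothetical helper: split totalBlocksY logical blocks into chunks
// whose count never exceeds maxGridY.
std::vector<GridChunk> splitGridY(std::int64_t totalBlocksY,
                                  std::int64_t maxGridY = kMaxGridY) {
    std::vector<GridChunk> chunks;
    for (std::int64_t off = 0; off < totalBlocksY; off += maxGridY) {
        chunks.push_back({off, std::min(maxGridY, totalBlocksY - off)});
    }
    return chunks;
}
```

Each chunk would then be launched with gridDim.y = count, and the kernel would add offset to blockIdx.y (again, an assumed modification). For example, 524288 logical blocks would become nine launches, the last covering only 8 blocks.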
I'm not sure I understand how to set the number of modules when doing a normal
convolution, but it seems that an outer loop is required. The trouble with an
outer loop is that the data is arranged in such a way that it is impossible to
apply just a fraction of the filters, or to process just some of each image.
The data arrangement makes it natural to process just some of the image
channels... but the color channels don't come into the blocking structure.
Basically... can I use this kernel to perform big convolutions?
Original issue reported on code.google.com by [email protected] on 7 Mar 2012 at 6:55