
CUDA backend performance tuning #30

Open

yfguo opened this issue Apr 6, 2020 · 0 comments

Milestone

Contributor

yfguo commented Apr 6, 2020

We need to investigate the best strategy for performance tuning in the CUDA backend.

One knob is the trade-off between thread block size (threads per block) and the number of blocks in the grid.
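
To make the knob concrete, here is a minimal host-side sketch in C of how the two quantities relate: for a fixed amount of work, a larger thread block means fewer blocks, and once the block count exceeds the grid's X-dimension limit the remainder has to spill into the Y dimension. The names `launch_config` and `compute_launch_config` are hypothetical and are not taken from the Yaksa sources; `max_grid_x` is passed in as a parameter on the assumption that real code would read it from `cudaGetDeviceProperties` (`maxGridSize[0]`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical launch description; not a Yaksa type. */
typedef struct {
    uint64_t n_blocks;          /* total thread blocks needed */
    uint32_t grid_x;            /* blocks along the grid's X dimension */
    uint32_t grid_y;            /* blocks along the grid's Y dimension */
    uint32_t threads_per_block; /* the tunable thread block size */
} launch_config;

/* Derive a grid shape from the element count and a chosen block size.
 * Assumes one thread per element. If the block count exceeds max_grid_x,
 * the grid is folded into two dimensions. The Y-dimension limit is not
 * checked here; a real implementation would need to check it too. */
static launch_config compute_launch_config(uint64_t n_elems,
                                           uint32_t threads_per_block,
                                           uint32_t max_grid_x)
{
    assert(threads_per_block > 0 && max_grid_x > 0);

    launch_config cfg = { 0, 0, 0, threads_per_block };

    /* Ceiling division: enough blocks to cover every element. */
    cfg.n_blocks = (n_elems + threads_per_block - 1) / threads_per_block;
    if (cfg.n_blocks == 0)
        return cfg; /* nothing to launch */

    if (cfg.n_blocks <= max_grid_x) {
        cfg.grid_x = (uint32_t) cfg.n_blocks;
        cfg.grid_y = 1;
    } else {
        cfg.grid_x = max_grid_x;
        cfg.grid_y = (uint32_t) ((cfg.n_blocks + max_grid_x - 1) / max_grid_x);
    }
    return cfg;
}
```

With a folded grid, `grid_x * grid_y` can exceed `n_blocks`, so a kernel launched this way would have to compute a linear block index (for example `blockIdx.y * gridDim.x + blockIdx.x`) and return early for indices past the end of the data. Which block size performs best is exactly what this issue proposes to measure; the sketch only shows how the block count follows from that choice.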

pavanbalaji mentioned this issue

CUDA: Allow for more thread blocks than the X dimension of the block grid #41

Merged

5 tasks

pavanbalaji added this to the yaksa-1.0b2 milestone
