CUDA: Allow for more thread blocks than the X dimension of the block grid #41

pavanbalaji · 2020-04-13T03:50:18Z

Pull Request Description

This PR lets us use as many thread blocks as all three dimensions of the block grid allow combined, instead of being limited to the X dimension.

Expected Impact

This would allow us to pack/unpack larger data sizes than before.

Author Checklist

Reference appropriate issues (with "Fixes" or "See" as appropriate)
Commits are self-contained and do not do two things at once
Commit message is of the form: module: short description and follows good practice
Add comments such that someone without knowledge of the code could understand
Have read and agree to the Yaksa CLA terms (https://github.com/pmodels/yaksa/wiki/Yaksa-Contributor-License-Agreement)

pavanbalaji · 2020-04-13T03:50:46Z

This PR fixes #17


gcongiu

The PR looks nice; I have only one comment.

gcongiu · 2020-04-13T21:41:18Z

src/backend/cuda/pup/yaksuri_cudai_pup.c

+    *n_threads = THREAD_BLOCK_SIZE;
+    uint64_t n_blocks = count * cuda_type->num_elements / THREAD_BLOCK_SIZE;
+    n_blocks += ! !(count * cuda_type->num_elements % THREAD_BLOCK_SIZE);
+


For correctness, should this return an error code if the number of blocks exceeds the max allowed size? Or simply assert?

Oh, I think I commented too late :)

pavanbalaji · 2020-04-13

Exceeding the maximum would take an element count larger than int64_t can hold. At that point, we'd need to change a whole lot of code in yaksa to make it work, and an assert would not be sufficient.
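To make the overflow argument concrete, here is a minimal host-side sketch of how a flat block count could be spread over three grid dimensions. It is not the code from this PR: `split_blocks`, `ceil_div`, the `MAX_GRID_*` constants (the limits commonly reported by CUDA devices of compute capability 3.0 and later; real code would query the device), and the `THREAD_BLOCK_SIZE` value of 256 are all assumptions for illustration.

```c
#include <stdint.h>

/* Assumed values for illustration; real code would query the device. */
#define THREAD_BLOCK_SIZE 256ull
#define MAX_GRID_X 2147483647ull        /* 2^31 - 1 */
#define MAX_GRID_Y 65535ull
#define MAX_GRID_Z 65535ull

/* Integer division rounded up, same idea as the "/ then += !!(%)" pair in the diff. */
static uint64_t ceil_div(uint64_t a, uint64_t b)
{
    return a / b + !!(a % b);
}

/* Hypothetical helper: choose grid dimensions whose product covers n_elements
 * threads. Returns 0 on success, -1 if even the full 3-D grid is too small. */
static int split_blocks(uint64_t n_elements, unsigned int *x, unsigned int *y,
                        unsigned int *z)
{
    uint64_t n_blocks = ceil_div(n_elements, THREAD_BLOCK_SIZE);

    if (n_blocks <= MAX_GRID_X) {
        /* Fits in the X dimension alone. */
        *x = (unsigned int) n_blocks;
        *y = 1;
        *z = 1;
    } else if (n_blocks <= MAX_GRID_X * MAX_GRID_Y) {
        /* Fill X, spill the remainder into Y. */
        *x = (unsigned int) MAX_GRID_X;
        *y = (unsigned int) ceil_div(n_blocks, MAX_GRID_X);
        *z = 1;
    } else {
        /* Fill X and Y, spill the remainder into Z. */
        uint64_t zz = ceil_div(n_blocks, MAX_GRID_X * MAX_GRID_Y);
        if (zz > MAX_GRID_Z)
            return -1;
        *x = (unsigned int) MAX_GRID_X;
        *y = (unsigned int) MAX_GRID_Y;
        *z = (unsigned int) zz;
    }
    return 0;
}
```

With a 64-bit element count and these limits, the error branch cannot trigger: the grid holds roughly 2^63 blocks, while a `uint64_t` count divided by 256 needs at most 2^56 blocks, which is the point made in the reply above. A count of zero yields `x == 0`, so a caller would skip the launch in that case.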

pavanbalaji self-assigned this Apr 13, 2020

pavanbalaji requested a review from yfguo April 13, 2020 03:50

pavanbalaji added this to the yaksa-1.0b1 milestone Apr 13, 2020

pavanbalaji linked an issue Apr 13, 2020 that may be closed by this pull request

CUDA: respect maximum number of thread blocks #17

Closed

pavanbalaji force-pushed the pr/thread-blocks branch from 71ac217 to b473d98 April 13, 2020 05:12

gcongiu reviewed Apr 13, 2020

src/backend/cuda/include/yaksuri_cudai.h (outdated)

pavanbalaji added 2 commits April 13, 2020 19:41

backend/cuda: pass all three grid dimensions to the pack kernels

a1bde91

Even though we use a single dimension right now, we should send all three dimensions to the kernel. This allows us to eventually tune the number of dimensions used.

Signed-off-by: Pavan Balaji <[email protected]>

backend/cuda: use all three dimensions of the block grid

591dc0a

This allows us to handle much larger pack/unpack sizes, and should be sufficient for the foreseeable future.

Fixes pmodels#17

Signed-off-by: Pavan Balaji <[email protected]>
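On the kernel side, once all three grid dimensions are in use, each thread has to turn its 3-D block coordinates back into a flat element index. The sketch below models that arithmetic in plain C so it can be checked on the host. `dim3_t` and `global_index` are hypothetical stand-ins (for CUDA's `dim3` and for the index computation inside a kernel), and the x-fastest ordering is an assumption, not necessarily what the yaksa kernels use.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for CUDA's dim3 / blockIdx / gridDim. */
typedef struct {
    unsigned int x, y, z;
} dim3_t;

/* Flatten (block_idx, thread_idx_x) into a global element index.
 * Blocks are numbered with x varying fastest, then y, then z.
 * All arithmetic is widened to 64 bits so large grids do not overflow. */
static uint64_t global_index(dim3_t block_idx, dim3_t grid_dim,
                             unsigned int thread_idx_x, unsigned int block_dim_x)
{
    uint64_t block = (uint64_t) block_idx.x +
        (uint64_t) grid_dim.x * ((uint64_t) block_idx.y +
                                 (uint64_t) grid_dim.y * block_idx.z);
    return block * block_dim_x + thread_idx_x;
}
```

Because the grid is rounded up in each dimension, some threads can end up with an index past the last element, so a kernel would compare the result against the total element count and return early when it is out of range.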

pavanbalaji force-pushed the pr/thread-blocks branch from b473d98 to 591dc0a April 13, 2020 19:41

pavanbalaji removed the request for review from yfguo April 13, 2020 20:02

yfguo approved these changes Apr 13, 2020


pavanbalaji merged commit 8465514 into pmodels:master Apr 13, 2020

pavanbalaji deleted the pr/thread-blocks branch April 13, 2020 21:30

gcongiu reviewed Apr 13, 2020
