clean up the cudax `__launch_transform` code and document its purpose and design #3526
I like these changes. I think we should decide whether post-launch side effects should be skipped if the launch fails. With the current approach they will run even on failure, but for the use cases we had in mind, skipping them might be better.
This way they stay as object member functions and we don't run into this limitation regarding exceptions. Then the first step always happens, and for the second step we can decide whether it should be skipped if the launch failed.
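To make the suggestion above concrete, here is a minimal host-only sketch of a two-step protocol in which both steps are ordinary member functions and the second step is skipped when the launch fails. Every name in it (`tracked_buffer`, `kernel_arg`, `post_launch`, `launch_sketch`) is hypothetical and chosen for illustration; none of it is the cudax API, and the "launch" is simulated by a plain callable that may throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical argument wrapper. Both steps are plain member functions,
// so neither one has to run inside a destructor.
struct tracked_buffer {
  std::vector<int>* log;  // records which steps ran (1 = first, 2 = second)

  // Step 1: always runs; produces the value handed to the kernel.
  int kernel_arg() {
    log->push_back(1);
    return 42;
  }

  // Step 2: post-launch side effect; the launcher decides whether to call it.
  void post_launch() { log->push_back(2); }
};

// Hypothetical launcher: a throwing `kernel` stands in for a failed launch.
template <class Kernel, class Arg>
void launch_sketch(Kernel kernel, Arg arg) {
  auto value = arg.kernel_arg();  // first step always happens
  kernel(value);                  // may throw on failure
  arg.post_launch();              // reached only if the launch succeeded
}
```

With this shape, running the post-launch step on failure as well would be a local change in the launcher (for example a `try`/`catch` around the kernel call), rather than something tied to destructor semantics.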
🟨 CI finished in 2h 55m: Pass: 98%/157 | Total: 1d 01h | Avg: 9m 53s | Max: 51m 32s | Hits: 529%/23359
| | Project |
|---|---|
| +/- | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| | Thrust |
| +/- | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| +/- | CCCL Infrastructure |
| +/- | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| +/- | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |
🏃 Runner counts (total jobs: 157)
# | Runner |
---|---|
110 | linux-amd64-cpu16 |
21 | linux-amd64-gpu-v100-latest-1 |
15 | windows-amd64-cpu16 |
10 | linux-arm64-cpu16 |
1 | linux-amd64-gpu-h100-latest-1-testing |
🟩 CI finished in 1h 48m: Pass: 100%/157 | Total: 1d 00h | Avg: 9m 23s | Max: 50m 01s | Hits: 531%/23359
🏃 Runner counts (total jobs: 157)
# | Runner |
---|---|
110 | linux-amd64-cpu16 |
21 | linux-amd64-gpu-v100-latest-1 |
15 | windows-amd64-cpu16 |
10 | linux-arm64-cpu16 |
1 | linux-amd64-gpu-h100-latest-1-testing |
🟩 CI finished in 1h 05m: Pass: 100%/155 | Total: 1d 01h | Avg: 9m 50s | Max: 33m 44s | Hits: 87%/241925
🏃 Runner counts (total jobs: 155)
# | Runner |
---|---|
108 | linux-amd64-cpu16 |
15 | windows-amd64-cpu16 |
12 | linux-amd64-gpu-rtx2080-latest-1 |
10 | linux-arm64-cpu16 |
6 | linux-amd64-gpu-rtxa6000-latest-1 |
3 | linux-amd64-gpu-rtx4090-latest-1 |
1 | linux-amd64-gpu-h100-latest-1 |
Description
i find the `__launch_transform` code to be confusing. in this pr, i replace the need to `static_cast` the result of `__launch_transform` with a new function, `__kernel_transform`. so `cudax::launch` will transform each argument with: `__kernel_transform(__launch_transform(arg))`

a large comment block in `launch_transform.hpp` describes the protocol and explains why two separate functions are needed.
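The composition described above can be sketched in plain host C++. This is an illustration under stated assumptions, not the cudax implementation: the names `launch_transform`, `kernel_transform`, `vector_handle`, `span_view`, and `sketch::launch` are hypothetical stand-ins (written without the reserved double-underscore prefix), and the "kernel" is an ordinary callable. One plausible reading of the split is that the first function returns a host-side object that stays alive for the duration of the launch, while the second converts that object into the value actually passed to the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical kernel-side parameter: a cheap pointer + size view.
struct span_view {
  const int* data;
  std::size_t size;
};

// Hypothetical host-side object produced by the first step.
struct vector_handle {
  const std::vector<int>& ref;
};

// Step 1: map a user argument to a host-side object.
inline int launch_transform(int x) { return x; }
inline vector_handle launch_transform(const std::vector<int>& v) { return {v}; }

// Step 2: map the host-side object to the kernel parameter.
// This replaces a cast at the call site with a named function.
inline int kernel_transform(int x) { return x; }
inline span_view kernel_transform(const vector_handle& h) {
  return {h.ref.data(), h.ref.size()};
}

// Each argument goes through both steps. The temporaries created by
// launch_transform live until the end of the full expression, which
// includes the call to `kernel`.
template <class Kernel, class... Args>
void launch(Kernel kernel, Args&&... args) {
  kernel(kernel_transform(launch_transform(std::forward<Args>(args)))...);
}

}  // namespace sketch
```

A caller would pass a `std::vector<int>` and an `int`, and the callable would receive a `span_view` and an `int`.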