Distributing without CUDA #136
-
Hi, I want to distribute an exllamav2-based app without requiring users to install CUDA. I'm getting "CUDA error: no kernel image is available for execution on the device [...]\exllamav2_ext\cuda\rope.cu 131", and I guess that means the extension is not built for the right GPU architecture, which isn't surprising. Is there a way to build the extension with kernels for all architectures and bundle all of that with my app?
Replies: 3 comments
-
Well I got it working by adding this code to setup.py:
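Something along these lines (a sketch: the exact architecture list and extension name are assumptions, and should be adjusted for your CUDA toolkit and target GPUs):

```python
# Hypothetical setup.py snippet: compile the extension for every common CUDA
# architecture so one binary runs on any supported GPU.
# The arch list below is an assumption; trim or extend it for your toolkit.
cuda_archs = ["60", "61", "70", "75", "80", "86", "89", "90"]

# Build one -gencode pair per real architecture (SASS code).
gencode_flags = []
for arch in cuda_archs:
    gencode_flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]

# Also embed PTX for the newest architecture so future GPUs can JIT-compile.
gencode_flags += [
    "-gencode",
    f"arch=compute_{cuda_archs[-1]},code=compute_{cuda_archs[-1]}",
]

# These flags would then be passed to the nvcc part of the extension, e.g.:
# from torch.utils.cpp_extension import CUDAExtension
# ext = CUDAExtension(
#     name="exllamav2_ext",          # assumed name
#     sources=[...],                 # kept elided as in the original
#     extra_compile_args={"cxx": [], "nvcc": gencode_flags},
# )
```

An alternative is setting `TORCH_CUDA_ARCH_LIST` (e.g. `"6.0;7.0;7.5;8.0;8.6;9.0+PTX"`) in the environment before building, which makes torch's build helpers generate equivalent flags.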
Overkill, sure, but the resulting binary is only 70 MB, a lot less than e.g. cuBLAS, so I guess it's fine!
-
Isn't this more or less what's already in the releases, just with more architectures? And wouldn't it still build for a specific Python and CUDA version?
-
I don't know; I didn't see any documentation about how those builds were produced. I wanted a build with my own changes, so I couldn't use the releases. Yes, I expect it is specific to a Python version, but it will run without CUDA installed at all. Maybe it needs to match the CUDA version that torch was compiled with, but I don't know.