Support multiple CUDA compute capabilities #16424

kiwixz · 2024-12-19T05:06:42Z

Describe the bug

When multiple CUDA compute capabilities are specified (e.g. -fsycl-targets=nvidia_gpu_sm_62,nvidia_gpu_sm_61), the runtime only ever chooses the first one and ignores the others.

I believe the relevant code is here: it always picks the first llvm_nvptx64 it finds.

The correct behavior would be to choose the highest compiled compute capability that is not greater than the device's.
So if I compile for sm_60, sm_61, sm_62 in any order and I have a Pascal (sm_61) GPU, it should only try the sm_61 program.
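
To make the expected selection rule concrete, here is a minimal sketch of it in plain C++. It is not the runtime's actual code: pick_compute_capability is a hypothetical helper, and representing capabilities as plain integers (60, 61, 62) is an assumption made for illustration only.

```cpp
#include <optional>
#include <vector>

// Hypothetical helper, not part of the DPC++ runtime.
// `compiled` holds the compute capabilities the program was built for
// (e.g. 60, 61, 62 for sm_60, sm_61, sm_62), in any order.
// `device` is the compute capability of the GPU actually present.
// Returns the highest compiled capability that does not exceed the
// device's, or std::nullopt if no compiled capability is usable.
std::optional<int> pick_compute_capability(const std::vector<int>& compiled,
                                           int device) {
    std::optional<int> best;
    for (int cc : compiled) {
        // Skip anything newer than the device; keep the largest remaining.
        if (cc <= device && (!best || cc > *best)) {
            best = cc;
        }
    }
    return best;
}
```

With this rule, both {60, 61, 62} and {62, 61, 60} select 61 on an sm_61 device, so the result no longer depends on the order given to -fsycl-targets.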

To reproduce

#include <sycl/sycl.hpp>

int main() {
    sycl::queue queue{sycl::default_selector_v};
    queue.submit([&] (sycl::handler& cgh) {
        auto os = sycl::stream{128, 128, cgh};
        cgh.single_task<struct hello_world>([=]() {
#ifdef __SYCL_CUDA_ARCH__
            os << "CUDA_ARCH " << __SYCL_CUDA_ARCH__ << "\n";
#else
            os << "Not CUDA\n";
#endif
        });
    });

    return 0;
}

With a Pascal (sm_61) GPU:

icpx -fsycl -fsycl-targets=nvidia_gpu_sm_60,nvidia_gpu_sm_61,nvidia_gpu_sm_62 main.cpp prints CUDA_ARCH 600 instead of the correct CUDA_ARCH 610.
icpx -fsycl -fsycl-targets=nvidia_gpu_sm_62,nvidia_gpu_sm_61,nvidia_gpu_sm_60 main.cpp errors out instead of printing CUDA_ARCH 610.

Environment

OS: Linux
Target device and vendor: NVIDIA GeForce GTX 1070
DPC++ version: Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205)
Dependencies version: NVIDIA-SMI 565.57.01, Driver Version: 565.57.01, CUDA Version: 12.7

Additional context

No response

kiwixz · 2024-12-19T05:07:53Z

Related to #7561

kiwixz added the bug (Something isn't working) and cuda (CUDA back-end) labels on Dec 19, 2024
