[SYCL][CUDA] Unable to Profile SYCL Application with NCU #17007

Closed
nscottnichols opened this issue Feb 13, 2025 · 1 comment
Labels
bug (Something isn't working), cuda (CUDA back-end)

Comments

@nscottnichols
Contributor

Describe the bug

Description

Attempting to profile a SYCL application using NVIDIA Nsight Compute (NCU) results in an error preventing kernel profiling. The issue persists even when LD_PRELOAD=/usr/lib64/libcuda.so is set.

To reproduce

Steps to Reproduce

  1. Run the profiling command:
    ncu -o profile ./test.e
  2. Observe the output (error occurs).
  3. Retry with LD_PRELOAD:
    LD_PRELOAD=/usr/lib64/libcuda.so ncu -o profile ./test.e
  4. The error remains the same.
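
The two attempts above can be run back to back with their output and exit status captured for comparison. This is only a sketch under stated assumptions: `run_variant`, the `NCU_BIN` and `APP` overrides, and the log and report file names are hypothetical and not part of the original report. Separate report names are used so the second run does not collide with a report written by the first.

```shell
#!/bin/sh
# Hypothetical helper: run one profiling attempt, save its combined output
# to ncu_<label>.log, and print the exit status of the command.
# $1 is a label; the remaining arguments are the command to run.
run_variant() {
    label=$1
    shift
    "$@" >"ncu_${label}.log" 2>&1
    status=$?
    echo "${label}: exit status ${status}"
    return 0
}

# Assumed defaults; override through the environment if your setup differs.
NCU_BIN=${NCU_BIN:-ncu}
APP=${APP:-./test.e}

if command -v "$NCU_BIN" >/dev/null 2>&1; then
    # Step 1: plain run.
    run_variant plain "$NCU_BIN" -o profile_plain "$APP"
    # Step 3: same run with libcuda preloaded.
    run_variant preload env LD_PRELOAD=/usr/lib64/libcuda.so \
        "$NCU_BIN" -o profile_preload "$APP"
else
    echo "ncu not found on PATH; skipping both runs"
fi
```

Comparing `ncu_plain.log` with `ncu_preload.log` (for example with `diff`) then shows whether the preload changed anything in the reported errors.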

Observed Behavior

  • Profiling fails with the following error:
    ==ERROR== Failed to prepare kernel for profiling
    ==ERROR== Unknown Error on device 0.
    ==ERROR== Failed to profile "const:" in process <PID>
    ==PROF== Trying to shutdown target application
    ==ERROR== The application returned an error code (9).
    ==ERROR== An error occurred while trying to profile.
    ==WARNING== No kernels were profiled.
    
  • This occurs both with and without LD_PRELOAD=/usr/lib64/libcuda.so.

Expected Behavior

  • Nsight Compute should successfully profile the SYCL application.

Environment

  • OS: openSUSE Leap 15.6
  • Target device and vendor: NVIDIA H100 80GB HBM3
  • DPC++ version: clang version 20.0.0git (https://github.com/intel/llvm fb888b857f0e3ef31a474f51d8a6018eeb521d99)
  • Profiler: Nsight Compute (ncu)
  • Application: test.e (SYCL hello world or any other SYCL app)
  • Dependencies version:
$>nvidia-smi
Thu Feb 13 18:06:05 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H100 80GB HBM3          On  |   00000000:1C:00.0 Off |                    0 |
| N/A   22C    P0             71W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H100 80GB HBM3          On  |   00000000:2B:00.0 Off |                    0 |
| N/A   21C    P0             69W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA H100 80GB HBM3          On  |   00000000:AC:00.0 Off |                    0 |
| N/A   20C    P0             69W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA H100 80GB HBM3          On  |   00000000:BC:00.0 Off |                    0 |
| N/A   19C    P0             69W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
$>sycl-ls
[cuda:gpu][cuda:0] NVIDIA CUDA BACKEND, NVIDIA H100 80GB HBM3 9.0 [CUDA 12.7]
[cuda:gpu][cuda:1] NVIDIA CUDA BACKEND, NVIDIA H100 80GB HBM3 9.0 [CUDA 12.7]
[cuda:gpu][cuda:2] NVIDIA CUDA BACKEND, NVIDIA H100 80GB HBM3 9.0 [CUDA 12.7]
[cuda:gpu][cuda:3] NVIDIA CUDA BACKEND, NVIDIA H100 80GB HBM3 9.0 [CUDA 12.7]

Additional context

Possible Related Issues

nscottnichols added the bug and cuda labels on Feb 13, 2025
nscottnichols changed the title from "[SYCL][CUDA]" to "[SYCL][CUDA] Unable to Profile SYCL Application with NCU" on Feb 13, 2025
@nscottnichols
Contributor Author

This is a local system configuration issue. Sorry for opening!
