
[Issue]: nvidia-smi not found #515

Open
joerowell opened this issue Feb 20, 2024 · 2 comments

Comments

@joerowell

Problem Description

The estimate_matmul functionality in Triton relies rather heavily on the underlying GPU statistics. On CUDA platforms this is realised by calling nvidia-smi and parsing its output. I see that this code is still present in this fork of Triton:

def nvsmi(attrs):

Would it be possible to get support added for rocm-smi here instead? This would make autotuning Triton kernels for GEMM etc. much easier.
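
For illustration only, here is a minimal sketch of what a rocm-smi-backed counterpart to the nvsmi() helper could look like. The rocmsmi name, the chosen flags, and the assumption that rocm-smi's --json output is a per-card dict are mine, not existing Triton or ROCm API, and would need to be checked against the installed rocm-smi version:

```python
import json
import subprocess


def rocmsmi(args):
    # Hypothetical counterpart to Triton's nvsmi(): shell out to rocm-smi
    # and return its JSON output as a dict. The exact flags and the layout
    # of the JSON (e.g. per-card "card0" keys) vary between ROCm releases,
    # so verify them against `rocm-smi --help` on the target system.
    cmd = ["rocm-smi", "--json"] + list(args)
    out = subprocess.check_output(cmd)
    return json.loads(out.decode("utf-8"))


if __name__ == "__main__":
    # Example: query current clock frequencies for all visible GPUs.
    print(rocmsmi(["--showclocks"]))
```

An alternative might be the amdsmi Python bindings shipped with recent ROCm releases, which would avoid depending on rocm-smi's text/JSON output format altogether.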

Operating System

CPU

GPU

AMD Instinct MI300X

ROCm Version

ROCm 6.0.0

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

@zhanglx13

@joerowell We can add it later, after we merge this fork with upstream.
For GEMM tuning, we have a dedicated script to tune GEMM kernels. You can refer to this README for more info, and let me know if you have more questions.

@zhanglx13

@jataylo @micmelesse This seems to be related to the nvsmi-related test failure. What is the status of that test?
