This repository has been archived by the owner on Aug 10, 2024. It is now read-only.
How to test if patch applied successfully? #6
Comments
It's clearly not, since it's downloaded
As for "test if patch applied successfully" - the main indicator is that the following code does not cause a crash:
import torch
import triton
import triton.language as tl

@triton.jit
def test_max_kernel():
    t = tl.zeros([2, 2], dtype=tl.float32)
    m = tl.max(t, 1)
    tl.device_print("max:", m)

@triton.jit
def test_sum_kernel():
    t = tl.zeros([2, 2], dtype=tl.float32)
    s = tl.sum(t, 1)
    tl.device_print("sum:", s)

if True:
    grid = lambda meta: (1, )
    kernel = test_max_kernel[grid]()

if True:
    grid = lambda meta: (1, )
    kernel = test_sum_kernel[grid]()
Correct output (this repo with Pascal GPU(s), or Volta+ GPU(s))
$ python3 test.py
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
Crash
LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.shfl.sync.bfly.i32
*** SIGABRT received at time=1714270500 on cpu 5 ***
...
Fatal Python error: Aborted
Stack (most recent call first):
File "/mnt/ml/vllm/venv/lib/python3.11/site-packages/triton/compiler/compiler.py", line 200 in llir_to_ptx
File "/mnt/ml/vllm/venv/lib/python3.11/site-packages/triton/compiler/compiler.py", line 381 in <lambda>
File "/mnt/ml/vllm/venv/lib/python3.11/site-packages/triton/compiler/compiler.py", line 543 in compile
File "/mnt/ml/vllm/venv/lib/python3.11/site-packages/triton/runtime/jit.py", line 532 in run
File "/mnt/ml/vllm/vllm/attention/ops/prefix_prefill.py", line 708 in context_attention_fwd
File "/mnt/ml/vllm/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115 in decorate_context
File "/mnt/ml/vllm/vllm/attention/ops/paged_attn.py", line 177 in forward_prefix
File "/mnt/ml/vllm/vllm/attention/backends/xformers.py", line 237 in forward
File "/mnt/ml/vllm/vllm/attention/layer.py", line 48 in forward
File "/mnt/ml/vllm/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520 in _call_impl
File "/mnt/ml/vllm/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511 in _wrapped_call_impl
File "/mnt/ml/vllm/vllm/model_executor/models/llama.py", line 166 in forward
...
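Because the failure shown above is a SIGABRT raised from inside LLVM rather than a Python exception, a try/except around the kernel launch would not catch it. One way to automate the check is to run the test script in a child process and inspect how it exited. The following is only a sketch: the helper names `classify_run` and `run_check` are made up for illustration, and it assumes the script is saved as `test.py` and that the "LLVM ERROR" message ends up in the captured output.

```python
import signal
import subprocess
import sys


def classify_run(returncode: int, output: str) -> str:
    """Classify a finished run as 'ok', 'llvm-crash', or 'other-failure'."""
    # On POSIX, subprocess reports death-by-signal as a negative return code.
    aborted = returncode == -signal.SIGABRT
    # Marker text taken from the crash log in this thread.
    if aborted or "LLVM ERROR: Cannot select" in output:
        return "llvm-crash"
    return "ok" if returncode == 0 else "other-failure"


def run_check(script_path: str) -> str:
    """Run the script in a child process so an abort cannot kill the caller."""
    proc = subprocess.run(
        [sys.executable, script_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the LLVM message is captured
        text=True,
    )
    return classify_run(proc.returncode, proc.stdout)
```

Calling `run_check("test.py")` on a machine with the patched build should then return "ok", while an unpatched build on a Pascal GPU would be expected to return "llvm-crash".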
Thanks a lot! It looks like it worked! I will try to add the command to the vLLM Dockerfile so that it applies the fix automatically.
# python3 -m pip install --index-url https://sasha0552.github.io/vllm-ci/ --force-reinstall --no-deps triton
Looking in indexes: https://sasha0552.github.io/vllm-ci/
Collecting triton
Downloading https://github.com/sasha0552/vllm-ci/releases/download/v10/triton-2.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (168.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 168.1/168.1 MB 16.4 MB/s eta 0:00:00
Installing collected packages: triton
Attempting uninstall: triton
Found existing installation: triton 3.0.0
Uninstalling triton-3.0.0:
Successfully uninstalled triton-3.0.0
Successfully installed triton-2.3.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
root@b5774f219676:/vllm-workspace# # python3 tritontest
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
pid (0, 0, 0) idx (1) max: 0.000000
pid (0, 0, 0) idx (0) max: 0.000000
...
pid (0, 0, 0) idx (1) sum: 0.000000
pid (0, 0, 0) idx (0) sum: 0.000000
pid (0, 0, 0) idx (1) sum: 0.000000
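As a quick sanity check before running the kernels, the installed triton version can be compared with the one from the pip log above (2.3.0 replacing 3.0.0). This is a rough sketch with hypothetical helper names. It only confirms which wheel is installed, not that the patch is present, so the kernel test remains the real indicator.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def matches_expected(dist_name: str, expected_prefix: str) -> bool:
    """True if the distribution is installed and its version starts with the prefix."""
    found = installed_version(dist_name)
    return found is not None and found.startswith(expected_prefix)
```

For example, `matches_expected("triton", "2.3.0")` would be expected to hold after the reinstall shown in the log, under the assumption that the patched wheel keeps that version string.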
I tried to apply the Triton patch like this:
pip3 install --extra-index-url https://sasha0552.github.io/vllm-ci/ --force-reinstall triton
Which shows
So it managed to re-install, but it went suspiciously quickly and there's no indication whether the patches are applied or not. Is there a way to test this?