Control flow for individual threads in a block #4737

Open
jiashenC opened this issue Sep 17, 2024 · 0 comments

jiashenC commented Sep 17, 2024

I am experimenting to see whether I can build a hash table in Triton. The code snippet below shows my example kernel:

import torch
import triton
import triton.language as tl

@triton.jit
def build_key_only_hashtable_kernel(
    key_ptr,
    bitmap_ptr,
    hashtable_ptr,
    size,
    hashtable_size,
    BLOCK_SIZE: tl.constexpr,
):
    pid = tl.program_id(axis=0)
    block_start = pid * BLOCK_SIZE
    offset = block_start + tl.arange(0, BLOCK_SIZE)
    mask = offset < size
    bitmap = tl.load(bitmap_ptr + offset, mask=mask)
    mask = mask & bitmap.cast(tl.int1)
    key = tl.load(key_ptr + offset, mask=mask)
    hashkey = key % hashtable_size
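    # Try to claim the slot: atomic_cas returns the old value, which is
    # non-zero when the slot is already occupied.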
    flag = tl.atomic_cas(hashtable_ptr + hashkey, tl.zeros_like(key), key)
    flag = flag.cast(tl.int1)
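    # Linear probing: keep trying the next slot until an empty one is found.
    # `flag` is a per-element tensor, so this `while` is per-thread
    # (divergent) control flow, which is what this issue asks about.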
    while flag:
        hashkey = (hashkey + 1) % hashtable_size
        flag = tl.atomic_cas(hashtable_ptr + hashkey, tl.zeros_like(key), key)
        flag = flag.cast(tl.int1)

inp = torch.arange(0, 100, 1).to(torch.int32).to("cuda")
bitmap = torch.ones(100).to(torch.bool).to("cuda")
hashtable = torch.zeros(140).to(torch.int32).to("cuda")
grid = lambda meta: (triton.cdiv(100, meta["BLOCK_SIZE"]), )
build_key_only_hashtable_kernel[grid](
    inp, 
    bitmap,
    hashtable,
    100,
    140,
    BLOCK_SIZE=16,
)

It fails with the following error:

test/test_correct.py python3: /source/llvm-project/llvm/include/llvm/Support/Casting.h:566: decltype(auto) llvm::cast(const From &) [To = mlir::detail::TypedValue<mlir::IntegerType>, From = mlir::Value]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
Fatal Python error: Aborted

In this case, threads within a block may diverge. I wonder whether this is something Triton is capable of, or whether it is simply not supported yet.
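For reference, here is an untested sketch of how the same insertion might be expressed with block-wide control flow instead of a per-element while: each element carries a pending flag, and the only loop condition is a scalar reduction over that flag. This assumes Triton accepts a while loop whose condition is a scalar reduction; the kernel name and the pending mask are my own, and having finished elements issue CAS(slot, 0, 0) (which never modifies the table) is just a way to avoid needing a masked atomic_cas. I have not verified that this compiles.

import triton
import triton.language as tl

@triton.jit
def build_key_only_hashtable_blockwise_kernel(
    key_ptr,
    bitmap_ptr,
    hashtable_ptr,
    size,
    hashtable_size,
    BLOCK_SIZE: tl.constexpr,
):
    pid = tl.program_id(axis=0)
    block_start = pid * BLOCK_SIZE
    offset = block_start + tl.arange(0, BLOCK_SIZE)
    mask = offset < size
    bitmap = tl.load(bitmap_ptr + offset, mask=mask, other=0)
    mask = mask & (bitmap != 0)
    key = tl.load(key_ptr + offset, mask=mask, other=0)
    hashkey = key % hashtable_size
    # Elements that are masked out have nothing to insert.
    pending = mask
    # The whole block keeps probing while any element is still pending,
    # so the while condition is a scalar reduction, not a per-element tensor.
    while tl.sum(pending.to(tl.int32), axis=0) > 0:
        # Finished elements CAS(slot, 0, 0), which never modifies the table.
        insert_val = tl.where(pending, key, tl.zeros_like(key))
        old = tl.atomic_cas(hashtable_ptr + hashkey, tl.zeros_like(key), insert_val)
        # An element is finished once it has claimed an empty slot (old value 0).
        pending = pending & (old != 0)
        # Still-pending elements advance to the next slot (linear probing).
        hashkey = tl.where(pending, (hashkey + 1) % hashtable_size, hashkey)

Like the original kernel, this still assumes the table has a free slot for every key; otherwise the loop never terminates.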

jiashenC changed the title from "LLVM complains about incompatible types" to "Control flow for individual threads in a block" on Sep 17, 2024