Blocksparse.matmul result does not align with torch #4709

oleksost · 2024-09-11T20:34:27Z

Hello,
in the following code, the results returned by triton.ops.blocksparse.matmul and torch.einsum do not match (please note that layout consists of all ones).

My understanding is that both outputs should be the same.

Thank you in advance!

import torch
from triton.ops.blocksparse.matmul import matmul

NumHeads, SeqLen, BlockSize, Embed = 1, 64, 32, 128
layout = torch.ones(NumHeads, Embed // BlockSize, SeqLen // BlockSize).long()
q = torch.randn((1, NumHeads, SeqLen, Embed), dtype=torch.float32).contiguous().to('cuda')
k = torch.randn((1, NumHeads, Embed, SeqLen), dtype=torch.float32).contiguous().to('cuda')

mm = matmul(layout, BlockSize, mode='dds', device='cuda')
o_tn = mm(q,k)
o_torch = torch.einsum('bhsd,bhdo->bhso', q, k)

torch.allclose(o_tn, o_torch, atol=1e-1)
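
For comparison, here is a minimal pure-PyTorch sketch of what a dense-times-block-masked product would compute, under the assumption that a layout entry of 1 keeps the corresponding BlockSize x BlockSize tile of k and 0 drops it. The helper name dds_reference is hypothetical and is not part of Triton. The sketch runs on CPU and does not call triton.ops.blocksparse.matmul, so it only illustrates the expectation that an all-ones layout should reduce to the plain einsum. It says nothing about the input format the Triton op requires.

```python
import torch


def dds_reference(a, b, layout, block):
    """Hypothetical dense reference: dense a times block-masked b.

    layout has shape (heads, K // block, N // block); a zero entry zeroes
    the matching block x block tile of b before the matmul.
    """
    # Expand each layout entry into a block x block tile: (heads, K, N).
    mask = layout.repeat_interleave(block, dim=1).repeat_interleave(block, dim=2)
    # Add a batch dimension so the mask broadcasts against b.
    b_masked = b * mask.unsqueeze(0).to(b.dtype)
    return torch.matmul(a, b_masked)


torch.manual_seed(0)
num_heads, seq_len, block_size, embed = 1, 64, 32, 128
layout = torch.ones(num_heads, embed // block_size, seq_len // block_size).long()
q = torch.randn(1, num_heads, seq_len, embed)
k = torch.randn(1, num_heads, embed, seq_len)

o_ref = dds_reference(q, k, layout, block_size)
o_torch = torch.einsum('bhsd,bhdo->bhso', q, k)

# With an all-ones layout nothing is masked, so the two should agree.
all_ones_matches = torch.allclose(o_ref, o_torch, atol=1e-4)
print(all_ones_matches)
```

Comparing o_tn against o_ref (moved to the same device) for a layout that contains zeros would help tell a masking discrepancy apart from a problem with how the operands are passed in.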
