[Feature]: Fix the mandatory boundary_check when loading bias tensor #17

Open
xinyazhang opened this issue Apr 22, 2024 · 0 comments
Suggestion Description

The boundary_check option should only need to be applied when seqlen_q is not evenly divisible by BLOCK_M, or seqlen_k is not evenly divisible by BLOCK_N.
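The condition above can be sketched in plain Python (not Triton) to make the intended dispatch explicit. The helper name `select_boundary_check` and the example sizes are hypothetical; the sketch assumes, as in the snippets below, that dimension 0 of the bias tile is the query axis and dimension 1 is the key axis.

```python
def select_boundary_check(seqlen_q, seqlen_k, block_m, block_n):
    """Return the tile dimensions that may run past the end of the bias tensor.

    Hypothetical helper, not part of the kernel: dim 0 is the query axis
    (tiled by BLOCK_M), dim 1 is the key axis (tiled by BLOCK_N).
    """
    q_padded = seqlen_q % block_m != 0  # last query tile is partial
    k_padded = seqlen_k % block_n != 0  # last key tile is partial
    dims = []
    if q_padded:
        dims.append(0)
    if k_padded:
        dims.append(1)
    return tuple(dims)


print(select_boundary_check(1024, 1024, 128, 64))  # ()
print(select_boundary_check(1000, 1024, 128, 64))  # (0,)
print(select_boundary_check(1024, 1000, 128, 64))  # (1,)
print(select_boundary_check(1000, 1000, 128, 64))  # (0, 1)
```

An empty tuple corresponds to the plain `tl.load(B_block_ptr)` branch, which is the case the issue hopes to make safe.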

# FIXME: do boundary_check correctly
"""
if q_padded and k_padded:  # CAVEAT: using "or" disables the partial boundary_check branches
    bias = tl.load(B_block_ptr, boundary_check=(0,1), padding_option="zero")
elif q_padded:
    bias = tl.load(B_block_ptr, boundary_check=(0,), padding_option="zero")
elif k_padded:
    bias = tl.load(B_block_ptr, boundary_check=(1,), padding_option="zero")
else:
    bias = tl.load(B_block_ptr)
"""
bias = tl.load(B_block_ptr, boundary_check=(0,1), padding_option="zero")

'''
if q_padded and k_padded:  # CAVEAT: using "or" disables the partial boundary_check branches
    bias = tl.load(B_block_ptr, boundary_check=(0,1), padding_option="zero")
elif q_padded:
    bias = tl.load(B_block_ptr, boundary_check=(0,), padding_option="zero")
elif k_padded:
    bias = tl.load(B_block_ptr, boundary_check=(1,), padding_option="zero")
else:
    bias = tl.load(B_block_ptr)
'''
# FIXME: Must use boundary_check unconditionally.
# The optimized tl.load above causes NaN for some reason
bias = tl.load(B_block_ptr, boundary_check=(0,1), padding_option="zero")

However, according to our tests, boundary_check is mandatory in all cases: the conditional loads shown above produce NaN. Hence there are two questions to answer here:

  1. Why is this mandatory?
  2. Can we remove it?
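As a starting point for the first question, the sketch below emulates a 2-D block load in plain Python to show how a single unchecked read on a partial tile can introduce NaN. It does not reproduce the failure reported here, and it is only one hypothesis worth ruling out (for example, a partial tile that the `q_padded`/`k_padded` flags do not cover). The function `load_tile`, the buffer layout, and the sizes are all hypothetical.

```python
import math


def load_tile(buf, rows, cols, row0, col0, block_m, block_n, boundary_check):
    """Emulate loading a (block_m, block_n) tile from a row-major buffer.

    buf holds rows * cols valid elements followed by arbitrary trailing data.
    With boundary_check=True, out-of-range elements become 0.0 (analogous to
    padding_option="zero"); otherwise the raw flat address is read.
    """
    tile = []
    for i in range(row0, row0 + block_m):
        row = []
        for j in range(col0, col0 + block_n):
            in_bounds = i < rows and j < cols
            if boundary_check and not in_bounds:
                row.append(0.0)
            else:
                row.append(buf[i * cols + j])
        tile.append(row)
    return tile


def has_nan(tile):
    return any(math.isnan(x) for row in tile for x in row)


# A 3x3 bias of zeros, followed by NaN standing in for unrelated memory.
rows, cols = 3, 3
buf = [0.0] * (rows * cols) + [math.nan] * 4

# The bottom-right 2x2 tile is partial in both dimensions.
checked = load_tile(buf, rows, cols, 2, 2, 2, 2, boundary_check=True)
unchecked = load_tile(buf, rows, cols, 2, 2, 2, 2, boundary_check=False)

print(has_nan(checked))    # False
print(has_nan(unchecked))  # True

# Masking the score with -inf does not help once the bias itself is NaN.
masked_score = float("-inf") + unchecked[0][1]
print(math.isnan(masked_score))  # True
```

If something like this were the cause, the unconditional check would be hiding an incorrect padding flag rather than being inherently required, which bears on whether it can be removed.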

Operating System

No response

GPU

No response

ROCm Component

No response
