fp8 not enabled for mha_varlen_fwd #1232

Open
goldhuang opened this issue Sep 16, 2024 · 0 comments

goldhuang commented Sep 16, 2024

I created an issue about this earlier: #1157.

https://github.com/Dao-AILab/flash-attention/blob/main/hopper/flash_api.cpp#L447.

I think the kernels are unified, so why is fp8 enabled for mha_fwd but not for mha_varlen_fwd? What's the blocker right now?
I'm willing to help and contribute if fp8 support for mha_varlen_fwd isn't coming soon.

Update - I tried to enable fp8 for mha_varlen_fwd, and I got a CUDA illegal memory access error.
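
For concreteness, this is roughly what I mean by using fp8 on the varlen path from the Python side. It's a minimal sketch, not verbatim from the repo: I'm assuming the FA3 hopper build exposes `flash_attn_varlen_func` through the `flash_attn_interface` module with the usual `(q, k, v, cu_seqlens_q, cu_seqlens_k, max_seqlen_q, max_seqlen_k, ...)` signature; the shapes, head dim, and `causal` flag below are illustrative only.

```python
# Minimal sketch (not verbatim from the repo) of calling the FA3 varlen path with fp8.
# Assumptions: the hopper build installs a `flash_attn_interface` module exposing
# flash_attn_varlen_func(q, k, v, cu_seqlens_q, cu_seqlens_k, max_seqlen_q, max_seqlen_k, ...);
# shapes, head dim, and the causal flag are illustrative only.
import torch
import flash_attn_interface as fa3  # from hopper/flash_attn_interface.py

device = "cuda"
nheads, headdim = 8, 128
seqlens = [128, 256]          # two variable-length sequences packed into one batch
total = sum(seqlens)

# Packed (total_tokens, nheads, headdim) layout used by the varlen interface.
q = torch.randn(total, nheads, headdim, device=device, dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Cast to fp8 e4m3 -- this is what the fp16/bf16-only dtype check in mha_varlen_fwd
# rejects, and what produced the illegal memory access once I relaxed that check locally.
q8, k8, v8 = (t.to(torch.float8_e4m3fn) for t in (q, k, v))

cu_seqlens = torch.tensor([0, seqlens[0], total], device=device, dtype=torch.int32)
max_seqlen = max(seqlens)

out = fa3.flash_attn_varlen_func(
    q8, k8, v8,
    cu_seqlens, cu_seqlens,
    max_seqlen, max_seqlen,
    causal=True,
)
```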

Thanks!

goldhuang changed the title from "Why is fp8 not enabled for mha_varlen_fwd?" to "fp8 not enabled for mha_varlen_fwd" on Sep 17, 2024