fp8 not enabled for mha_varlen_fwd #1232

Open
goldhuang opened this issue Sep 16, 2024 · 0 comments

goldhuang commented Sep 16, 2024

I created an issue about this earlier: #1157.

https://github.com/Dao-AILab/flash-attention/blob/main/hopper/flash_api.cpp#L447.

I think the kernels are unified, so why is fp8 enabled for mha_fwd but not for mha_varlen_fwd? What's the blocker right now?
I'm willing to help and contribute if fp8 support for mha_varlen_fwd isn't coming soon.

Update - I tried to enable fp8 for mha_varlen_fwd, and I got a CUDA illegal memory access error.
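
For concreteness, this is roughly what I mean by using fp8 on the varlen path from the Python side. It's a minimal sketch, not verbatim from the repo: I'm assuming the FA3 hopper build exposes `flash_attn_varlen_func` through the `flash_attn_interface` module with the usual `(q, k, v, cu_seqlens_q, cu_seqlens_k, max_seqlen_q, max_seqlen_k, ...)` signature; the shapes, head dim, and `causal` flag below are illustrative only.

```python
# Minimal sketch (not verbatim from the repo) of calling the FA3 varlen path with fp8.
# Assumptions: the hopper build installs a `flash_attn_interface` module exposing
# flash_attn_varlen_func(q, k, v, cu_seqlens_q, cu_seqlens_k, max_seqlen_q, max_seqlen_k, ...);
# shapes, head dim, and the causal flag are illustrative only.
import torch
import flash_attn_interface as fa3  # from hopper/flash_attn_interface.py

device = "cuda"
nheads, headdim = 8, 128
seqlens = [128, 256]          # two variable-length sequences packed into one batch
total = sum(seqlens)

# Packed (total_tokens, nheads, headdim) layout used by the varlen interface.
q = torch.randn(total, nheads, headdim, device=device, dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Cast to fp8 e4m3 -- this is what the fp16/bf16-only dtype check in mha_varlen_fwd
# rejects, and what produced the illegal memory access once I relaxed that check locally.
q8, k8, v8 = (t.to(torch.float8_e4m3fn) for t in (q, k, v))

cu_seqlens = torch.tensor([0, seqlens[0], total], device=device, dtype=torch.int32)
max_seqlen = max(seqlens)

out = fa3.flash_attn_varlen_func(
    q8, k8, v8,
    cu_seqlens, cu_seqlens,
    max_seqlen, max_seqlen,
    causal=True,
)
```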

Thanks!

goldhuang changed the title from "Why is fp8 not enabled for mha_varlen_fwd?" to "fp8 not enabled for mha_varlen_fwd" on Sep 17, 2024