Issues: Dao-AILab/flash-attention
CUDA versions > 12.3 do not correctly compile H100 Flash Attention 3
#1243 opened Sep 21, 2024 by rohany

Partial success with build from source for Windows 11, but the resulting wheel needed work
#1242 opened Sep 20, 2024 by jim-plus

2.6.4 and FA3 release .whl for CUDA 12.4, torch 2.4.1, Python 3.11?
#1234 opened Sep 17, 2024 by tqangxl

ERROR [12/13] RUN pip install flash-attn --no-build-isolation
#1229 opened Sep 15, 2024 by promaprogga

Softmax (particularly exp operations) becomes a major bottleneck in full FP16 pipeline
#1225 opened Sep 13, 2024 by phantaurus

[Question] Computation and register/shared memory wasted during decoding phase?
#1224 opened Sep 13, 2024 by sleepwalker2017

Which file is the source code of flash_attn_varlen_qkvpacked_func?
#1221 opened Sep 12, 2024 by scuizhibin

Can't compile from source on ROCm 6.1.3 with gfx1100... error: "static assertion failed" 2.6.3
#1215 opened Sep 9, 2024 by nktice

[FP8][FA3] Is there a plan to support _flash_attn_varlen_forward with fp8
#1213 opened Sep 9, 2024 by baoleai

Failed to build installable wheels for some pyproject.toml based projects (flash-attn)
#1211 opened Sep 9, 2024 by danielchang1985