Issues: Dao-AILab/flash-attention
CUDA versions > 12.3 do not correctly compile H100 Flash Attention 3
#1243 opened Sep 21, 2024 by rohany

Partial success with build from source for Windows 11, but the resulting wheel needed work
#1242 opened Sep 20, 2024 by jim-plus

2.6.4 and FA3 release .whl for CUDA 12.4, torch 2.4.1, Python 3.11?
#1234 opened Sep 17, 2024 by tqangxl

ERROR [12/13] RUN pip install flash-attn --no-build-isolation
#1229 opened Sep 15, 2024 by promaprogga

Softmax (particularly exp operations) becomes a major bottleneck in full FP16 pipeline
#1225 opened Sep 13, 2024 by phantaurus

[Question] Computation and register/shared memory wasted during decoding phase?
#1224 opened Sep 13, 2024 by sleepwalker2017

Which file is the source code of flash_attn_varlen_qkvpacked_func?
#1221 opened Sep 12, 2024 by scuizhibin

Can't compile from source on ROCm 6.1.3 with gfx1100... error: "static assertion failed" 2.6.3
#1215 opened Sep 9, 2024 by nktice

[FP8][FA3] Is there a plan to support _flash_attn_varlen_forward with fp8
#1213 opened Sep 9, 2024 by baoleai

Failed to build installable wheels for some pyproject.toml based projects (flash-attn)
#1211 opened Sep 9, 2024 by danielchang1985