Animations for Flash Attention, Flash Attention2, and Standard Attention #736
I made a series of manim animations illustrating how the implementations of Standard Attention, Flash Attention, and Flash Attention 2 work. They show how, asymptotically, Flash Attention and Flash Attention 2 outperform Standard Attention with respect to the number of I/O accesses between HBM and SRAM. Note that the animations only show the forward pass of each algorithm.
You can find them all in this playlist.