don't save inputs/outputs buffer of FlashAttenFunc to reduce memory usage for inference mode #1383

Merged
tridao merged 1 commit into Dao-AILab:main from XiaobingSuper:xiaobing/reduce_memory on Dec 12, 2024