Put flash attention 2 into ProGen2 #6

Merged
merged 14 commits into main from fa_progen2 on Dec 6, 2024
Merge branch 'main' into fa_progen2
JinyuanSun authored Dec 5, 2024

commit d9b2b662430cd94ed38b66fd595d015477ac7c85
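
For context on what the PR enables, below is a minimal sketch of the generic Hugging Face route to FlashAttention-2 on a causal LM. This is an illustration under stated assumptions, not the code from this PR: the checkpoint id is hypothetical, and the actual integration may instead wire FlashAttention into a dedicated ProGen2 module rather than use this flag.

```python
# Hedged sketch: the generic transformers switch for FlashAttention-2.
# Applying it to ProGen2 is an assumption; this PR's integration may differ.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "example/progen2-small",            # hypothetical checkpoint id
    torch_dtype=torch.float16,          # FA2 requires fp16 or bf16
    attn_implementation="flash_attention_2",
    trust_remote_code=True,             # ProGen2 is a custom architecture
).to("cuda")
```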
README.md (4 changes: 3 additions & 1 deletion)

@@ -113,7 +113,9 @@ It's recommended to use the flash attention for training. Because in the forward
 
 # Benchmarking
 
-Below is the comparison of peak memory usage and inference time of FAESM with the official ESM2 and shows that FAESM can save memory usage by up to 60% and inference time by up to 70% (length 1000). The benchmarking is done on ESM-650M with batch size 8, and a single A100 with 80GB of memory.
+
+### FAESM vs. Official ESM2
+Below is the comparison of peak memory usage and inference time of FAESM with the official ESM2. We show that FAESM can save memory usage by up to 60% and inference time by up to 70% (length 1000). The benchmarking is done on ESM-650M with batch size 8, and a single A100 with 80GB of memory.
 
 ![benchmark](assets/figs/benchmark.png)
 
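
As a rough guide to reproducing those numbers, here is a hedged sketch of the benchmark loop for the official ESM2 baseline: peak memory via torch.cuda.max_memory_allocated and timing via time.perf_counter, on ESM-650M with batch size 8 and length 1000 as in the README. The FAESM side is assumed to use the same loop with FAESM's own model class, whose exact import path should be checked against the package.

```python
# Hedged sketch of the README's benchmark setup for the official ESM2
# baseline. The FAESM comparison is an assumption: swap in FAESM's model
# class (its actual import path should be verified against the package).
import time

import torch
from transformers import AutoTokenizer, EsmForMaskedLM

def benchmark(model, inputs, n_runs=10):
    """Return (peak GPU memory in GiB, mean seconds per forward pass)."""
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(n_runs):
            model(**inputs)
    torch.cuda.synchronize()
    mean_time = (time.perf_counter() - start) / n_runs
    peak_gib = torch.cuda.max_memory_allocated() / 2**30
    return peak_gib, mean_time

name = "facebook/esm2_t33_650M_UR50D"  # ESM-650M, as in the README
tokenizer = AutoTokenizer.from_pretrained(name)
batch = ["A" * 1000] * 8  # batch size 8, sequence length 1000
inputs = tokenizer(batch, return_tensors="pt").to("cuda")

official = EsmForMaskedLM.from_pretrained(name).half().to("cuda").eval()
print("official ESM2 (peak GiB, s/forward):", benchmark(official, inputs))
```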
