diff --git a/README.md b/README.md
index 6984782b..6a043150 100644
--- a/README.md
+++ b/README.md
@@ -21,7 +21,7 @@
 
 ![trtllm](https://github.com/DefTruth/CUDA-Learn-Notes/assets/31974251/5a913fb4-19ba-4880-9602-422d4d6b2925)
 
-- [[TensorRT-LLM][3w字]🔥TensorRT-LLM部署调优-指北](https://zhuanlan.zhihu.com/p/699333691)
+- [[TensorRT-LLM][5w字]🔥TensorRT-LLM部署调优-指北](https://zhuanlan.zhihu.com/p/699333691)
 - [[KV Cache优化]🔥MQA/GQA/YOCO/CLA笔记: 层内和层间KV Cache共享](https://zhuanlan.zhihu.com/p/697311739)
 - [[Prefill优化]🔥图解vLLM Prefix Prefill Triton Kernel](https://zhuanlan.zhihu.com/p/695799736)
 - [[Prefill优化][万字]🔥原理&图解vLLM Automatic Prefix Cache(RadixAttention): 首Token时延优化](https://zhuanlan.zhihu.com/p/693556044)