Performance issue with long context #2548
Labels
bug, Investigating, Performance, triaged
System Info
x86_64, Debian 11, L4 and A100 GPUs
Who can help?
No response
Reproduction
Expected behavior
If latency scaled linearly with context length, a 15k context should be around 7.5x slower than a 2k context, and a 100k context around 7x slower than a 15k context.
Actual behavior
L4 2k: 580ms
L4 15k: 10.3s
L4 100k: 163s
A100 2k: 155ms
A100 15k: 1.75s
A100 100k: 36.5s
The L4 15k, L4 100k, and A100 100k results are unacceptably slow.
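To make the gap concrete, the measurements above can be compared against the linear expectation with a few lines of Python. This is only a sketch: it assumes "2k", "15k", and "100k" mean 2,000, 15,000, and 100,000 tokens, and the names `latency_s` and `slowdown_report` are made up for illustration.

```python
# Latencies reported in this issue, in seconds, keyed by GPU and context length.
# Assumption: "2k" / "15k" / "100k" are 2,000 / 15,000 / 100,000 tokens.
latency_s = {
    "L4": {2_000: 0.580, 15_000: 10.3, 100_000: 163.0},
    "A100": {2_000: 0.155, 15_000: 1.75, 100_000: 36.5},
}


def slowdown_report(latency_s):
    """Return (gpu, short_ctx, long_ctx, linear_ratio, observed_ratio) rows."""
    rows = []
    for gpu, by_ctx in latency_s.items():
        lengths = sorted(by_ctx)
        # Compare each context length with the next larger one.
        for short, long in zip(lengths, lengths[1:]):
            linear = long / short                    # slowdown if cost were linear
            observed = by_ctx[long] / by_ctx[short]  # slowdown actually measured
            rows.append((gpu, short, long, linear, observed))
    return rows


for gpu, short, long, linear, observed in slowdown_report(latency_s):
    print(f"{gpu}: {short // 1000}k -> {long // 1000}k  "
          f"linear {linear:.1f}x, observed {observed:.1f}x")
```

With the numbers reported here, every step is slower than linear: the observed slowdowns fall between roughly 11x and 21x, against a linear expectation of about 7x.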
Additional notes
There may be a way to configure trtllm-build to speed this up, but I can't find good documentation on optimizing for long contexts.