Inference Qwen2-0.5b + Medusa failed #2678
Labels
bug, Investigating, Speculative Decoding, triaged
System Info
A100
[TensorRT-LLM] TensorRT-LLM version: 0.16.0.dev2024120300
Who can help?
No response
Information
Tasks
An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

Reproduction
Expected behavior
Generation succeeds and returns the answers.
actual behavior
The engine is successfully loaded but the generation fails.
additional notes
The trtllm version is 0.16.0.dev2024120300, and this is the last version that successfully builds qwen2 + medusa engines. In subsequent versions, building the engines produces the error message shown below. If I add the missing configs manually, the inference results are messy.
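To narrow down which configs the newer builds drop, one way is to diff the engine's `config.json` against the keys a working 0.16.0.dev2024120300 build wrote. A minimal sketch, assuming a `build_config` section and Medusa-related key names (`max_draft_len`, `num_medusa_heads`, `num_medusa_layers`) — the exact names are assumptions and should be taken from a known-good engine's config, not from this snippet:

```python
import json

# Keys a working Medusa build is assumed to record in the engine config.
# These names are illustrative; copy the real set from a config.json
# produced by a version that still builds qwen2 + medusa correctly.
EXPECTED_MEDUSA_KEYS = {"max_draft_len", "num_medusa_heads", "num_medusa_layers"}

def find_missing_medusa_keys(config: dict) -> set:
    """Return the expected Medusa-related keys absent from a build config."""
    build_cfg = config.get("build_config", {})
    return EXPECTED_MEDUSA_KEYS - build_cfg.keys()

# Example: a config from a newer build that dropped two Medusa fields.
broken_config = {"build_config": {"max_draft_len": 63}}
print(sorted(find_missing_medusa_keys(broken_config)))

# In practice, load the real file instead:
# with open("engine_dir/config.json") as f:
#     print(sorted(find_missing_medusa_keys(json.load(f))))
```

Diffing the two config files this way makes it clear whether the regression is only missing keys or also changed values, which matters because manually re-adding keys did not restore correct inference.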