Issue
When using fairseq and TensorRT-LLM for inference, I encountered an issue.
In the model's vocabulary:
</s>: 0
<pad>: 2
<unk>: 1
When using TensorRT-LLM, decoder_input_ids must be set to 2 (the <pad> token in the vocabulary above) for the model to work correctly.
If I set decoder_input_ids to another token ID (e.g., 0 for </s> or 1 for <unk>), the model does not work properly and does not produce the expected output.
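For reference, one way to double-check which IDs the fairseq side actually uses for the special tokens is to load the dictionary that ships with the checkpoint. This is only a sketch; the "dict.txt" path is a placeholder for the model's real target dictionary file.

# Sketch only: check which IDs fairseq assigns to the special tokens,
# since they may differ from what the raw dictionary file suggests.
from fairseq.data import Dictionary

tgt_dict = Dictionary.load("dict.txt")  # placeholder: dictionary shipped with the checkpoint

# fairseq's Dictionary prepends <s>, <pad>, </s>, <unk> itself, so read the IDs
# from the loaded object rather than from the file.
print("bos:", tgt_dict.bos(), tgt_dict[tgt_dict.bos()])
print("pad:", tgt_dict.pad(), tgt_dict[tgt_dict.pad()])
print("eos:", tgt_dict.eos(), tgt_dict[tgt_dict.eos()])
print("unk:", tgt_dict.unk(), tgt_dict[tgt_dict.unk()])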
Environment
fairseq version: 0.12.2
TensorRT-LLM version: 0.9.0
Example
The output of the fairseq model:
The output of TensorRT-LLM (decoder_start_token_id = 2):
The output of TensorRT-LLM (decoder_start_token_id = 0):
Do the token IDs in the TensorRT-LLM engine need to be consistent with the token IDs in the fairseq model?
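In case it helps narrow this down, here is a rough sketch (not the actual TensorRT-LLM API; "dict.txt" and the batch size are placeholders) of deriving the decoder start ID from the fairseq dictionary instead of hard-coding it, since fairseq's SequenceGenerator seeds the decoder with the dictionary's eos index by default:

# Sketch only: build decoder_input_ids from the fairseq dictionary so the IDs
# fed to the TensorRT-LLM decoder match the IDs the model was trained with.
import torch
from fairseq.data import Dictionary

tgt_dict = Dictionary.load("dict.txt")   # placeholder path
start_id = tgt_dict.eos()                # fairseq seeds decoding with eos by default
batch_size = 1                           # placeholder

decoder_input_ids = torch.full((batch_size, 1), start_id, dtype=torch.int32)
# decoder_input_ids (or start_id) would then be handed to the TensorRT-LLM
# enc-dec runtime in place of a hard-coded decoder_start_token_id.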