
decoder_input_ids must be set to 2 when using fairseq model #2621

Open
@cocovoc

Description

Issue

When using fairseq and TensorRT-LLM for inference, I encountered an issue.
In the model's vocabulary:

</s>:     0
<unk>:    1
<pad>:    2

When using TensorRT-LLM, decoder_input_ids must be set to 2 (the <pad> token) in order for generation to work correctly.
If I set decoder_input_ids to another token ID (e.g., 0 for `</s>` or 1 for `<unk>`), the model does not work properly and does not produce the expected output.
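
For reference, this layout can be confirmed directly from the fairseq dictionary (a minimal sketch; the `dict.txt` path is an assumption):

```python
# Minimal sketch: print the special-token IDs fairseq actually assigned,
# instead of assuming the common <s>=0, <pad>=1, </s>=2, <unk>=3 layout.
from fairseq.data import Dictionary

d = Dictionary.load("dict.txt")  # hypothetical path to the model's dictionary
print("eos  (</s>) :", d.eos())  # 0 for this model
print("unk (<unk>) :", d.unk())  # 1 for this model
print("pad (<pad>) :", d.pad())  # 2 for this model
```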

Environment

fairseq version: 0.12.2
TensorRT-LLM version: 0.9.0

Example

The output of the fairseq model: [screenshot]
The output of TensorRT-LLM (decoder_start_token_id = 2): [screenshot]
The output of TensorRT-LLM (decoder_start_token_id = 0): [screenshot]

Do the token IDs used by the TensorRT-LLM engine need to be consistent with the token IDs in the fairseq model?
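
For context, a minimal sketch of the part that seems to matter here: seeding `decoder_input_ids` with the start token before calling the encoder-decoder engine (engine/session setup is elided; the batch size and dtype are assumptions):

```python
# Hedged sketch: seed decoder_input_ids for an encoder-decoder engine.
# Only the start-token handling is shown; engine/session setup is elided.
import torch

decoder_start_token_id = 2  # <pad> in this fairseq dictionary; 0 (</s>) fails here
batch_size = 1              # assumption for illustration

decoder_input_ids = torch.full(
    (batch_size, 1), decoder_start_token_id, dtype=torch.int32
)
# decoder_input_ids is then passed to the TensorRT-LLM decoder session.
```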
