Exporting a model that uses torch.distributions.Categorical(logits=...).sample() to sample from the logits.
I currently have a fixed-length loop within a torch.compile graph that samples from the logits to choose an output and feeds that back in as the next input (a standard auto-regressive model).
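Roughly, the structure looks like this minimal sketch (illustrative names, not my actual model):

```python
import torch
import torch.nn as nn

class TinyAutoregressive(nn.Module):
    """Minimal stand-in for the real model: a fixed-length loop that
    samples the next token from the logits at each step."""

    def __init__(self, vocab_size=256, hidden=128, steps=8):
        super().__init__()
        self.steps = steps
        self.embed = nn.Embedding(vocab_size, hidden)
        self.proj = nn.Linear(hidden, vocab_size)

    def forward(self, token):  # token: (batch,) of int64 token ids
        outputs = []
        for _ in range(self.steps):  # fixed-length autoregressive loop
            logits = self.proj(self.embed(token))
            # Stochastic sampling; this lowers to aten::multinomial,
            # which is the op that fails to convert.
            token = torch.distributions.Categorical(logits=logits).sample()
            outputs.append(token)
        return torch.stack(outputs, dim=1)
```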
I see that the examples in this repo (gpt2 etc.) all use greedy decoding (i.e. they're not stochastic), and trying to export my model gives an error:
```
raise UnsupportedOperatorException(
torch_tensorrt.dynamo.conversion._TRTInterpreter.UnsupportedOperatorException: Conversion of function torch._ops.aten.aten::multinomial not currently supported!
```
Is there any workaround, or is sampling not currently possible in TensorRT? I know you can sample outside the model, but in my case it is much better encapsulated to have the sampling inside the model.
This can be considered a feature request to support multinomial in torch_tensorrt, I guess.
Can you provide a reproducer of this issue? The simplest way is probably to keep the sampling in a PyTorch block, since I'm not sure TRT can handle it. What is odd here is that you are getting past capability partitioning.
You can try passing torch_executed_ops=[torch.ops.aten.multinomial].
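If I'm reading that right, it would look something like the following (untested sketch; the exact spelling of the op key may differ between versions):

```python
import torch
import torch_tensorrt

model = TinyAutoregressive().eval().cuda()
example_input = torch.zeros(1, dtype=torch.int64).cuda()

trt_model = torch_tensorrt.compile(
    model,
    ir="dynamo",
    inputs=[example_input],
    # Exclude the unsupported op from conversion so it falls back to
    # eager PyTorch while the rest of the graph runs in TensorRT.
    torch_executed_ops={"torch.ops.aten.multinomial.default"},
)
```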