ADD: support of fp16 python inference backend #773
Environment:
CUDA Version: 11.4
TensorRT version: 8.2.0.6-1+cuda11.4
ONNX version: 1.10.1
Python version: 3.8.10
Issue:
While working with a model converted to ONNX at fp16 precision, I found the following issue:
An error occurs when building an engine from Python with this call:
engine = backend.prepare(model, device='CUDA:0', verbose=True)
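For context, a minimal reproduction sketch using the onnx-tensorrt Python backend; the model file name is a placeholder:

```python
import onnx
import onnx_tensorrt.backend as backend

# Placeholder path: any ONNX model exported with fp16 precision.
model = onnx.load("model_fp16.onnx")

# Engine building fails here because the backend never enables the fp16 builder flag.
engine = backend.prepare(model, device='CUDA:0', verbose=True)
```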
Investigation showed that backend.py never initializes the fp16 flag in the builder config used to build the engine.
Solution:
The fix is inspired by the still-open pull request #345, which targets an older version of TensorRT. This change works with the latest TensorRT release, 8.2.0.6 at the time of this pull request.
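For reference, a sketch of how the fp16 flag can be enabled with the TensorRT 8.x Python API; the exact placement inside backend.py may differ, and the names below come from the standard tensorrt package rather than from this patch:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
config = builder.create_builder_config()

# Enable fp16 kernels only when the GPU actually supports fast fp16.
if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)
```

Guarding on builder.platform_has_fast_fp16 keeps the backend working on GPUs without fast fp16 support, where the flag would otherwise give no benefit.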