This is a quick, self-contained TensorRT example. It demonstrates how to build a TensorRT custom plugin and how to use it in a TensorRT engine without complicated dependencies or excessive abstraction.
The ONNX model we created is a simple identity neural network that consists of three `Conv` nodes whose weights and attributes are orchestrated so that each convolution is an identity operation. The second `Conv` node in the ONNX graph is replaced with an ONNX custom node `IdentityConv` that is not defined in the ONNX operator set. The TensorRT custom plugin implements the `IdentityConv` operation in the TensorRT engine using a CUDA memory copy.
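For illustration, here is a minimal sketch of the copy that such a plugin's `enqueue` can boil down to, assuming the `IPluginV3` interface and FP32 tensors; the function name is illustrative and this is not the repository's exact code. Because the operation is an identity, a single device-to-device copy on the provided stream implements the whole op.

```cpp
#include <NvInferRuntime.h>
#include <cuda_runtime_api.h>
#include <cstdint>

// Sketch of the body an IPluginV3 plugin's enqueue() can delegate to.
int32_t identityConvEnqueue(nvinfer1::PluginTensorDesc const* inputDesc,
                            void const* const* inputs, void* const* outputs,
                            cudaStream_t stream) noexcept
{
    // Number of elements in the (single) input tensor.
    int64_t count{1};
    for (int32_t i{0}; i < inputDesc[0].dims.nbDims; ++i)
    {
        count *= inputDesc[0].dims.d[i];
    }
    // Assumes FP32 I/O; a real plugin would switch on inputDesc[0].type.
    size_t const numBytes{static_cast<size_t>(count) * sizeof(float)};
    // Device-to-device copy: the entire "convolution".
    cudaError_t const status{cudaMemcpyAsync(
        outputs[0], inputs[0], numBytes, cudaMemcpyDeviceToDevice, stream)};
    return status == cudaSuccess ? 0 : -1;
}
```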
To build the custom Docker image, please run the following command.
```bash
$ docker build -f docker/tensorrt.Dockerfile --no-cache --tag=tensorrt:24.05 .
```
To run the custom Docker container, please run the following command.
```bash
$ docker run -it --rm --gpus device=0 -v $(pwd):/mnt tensorrt:24.05
```
To build the application, please run the following command.
```bash
$ cmake -B build
$ cmake --build build --config Release --parallel
```
Under the `build/src/plugins` directory, the custom plugin libraries will be saved as `libidentity_conv_iplugin_v2_io_ext.so` for `IPluginV2Ext` and `libidentity_conv_iplugin_v3.so` for `IPluginV3`, respectively. The `IPluginV2Ext` plugin interface has been deprecated since TensorRT 10.0.0 and will be removed in the future; `IPluginV3` is now the only recommended interface for custom plugin development.
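The two interfaces are organized quite differently: instead of one monolithic class, `IPluginV3` splits a plugin's behavior into core, build-time, and runtime capability interfaces that TensorRT queries through `getCapabilityInterface`. A minimal sketch of that dispatch (the class name is illustrative and all other required methods are elided):

```cpp
#include <NvInferRuntime.h>

class IdentityConvPlugin : public nvinfer1::IPluginV3,
                           public nvinfer1::IPluginV3OneCore,
                           public nvinfer1::IPluginV3OneBuild,
                           public nvinfer1::IPluginV3OneRuntime
{
public:
    // TensorRT asks the plugin for each capability it needs.
    nvinfer1::IPluginCapability* getCapabilityInterface(
        nvinfer1::PluginCapabilityType type) noexcept override
    {
        if (type == nvinfer1::PluginCapabilityType::kBUILD)
        {
            return static_cast<nvinfer1::IPluginV3OneBuild*>(this);
        }
        if (type == nvinfer1::PluginCapabilityType::kRUNTIME)
        {
            return static_cast<nvinfer1::IPluginV3OneRuntime*>(this);
        }
        return static_cast<nvinfer1::IPluginV3OneCore*>(this);
    }
    // ... remaining IPluginV3OneCore/OneBuild/OneRuntime methods elided ...
};
```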
Under the `build/src/apps` directory, the engine builder will be saved as `build_engine`, and the engine runner will be saved as `run_engine`.
To build the ONNX model, please run the following command.
```bash
$ python scripts/create_identity_neural_network.py
```
The ONNX model will be saved as `identity_neural_network.onnx` under the `data` directory.
Alternatively, the ONNX model can be exported from a PyTorch model.
To export an ONNX model with ONNX Opset 15 or above, please run the following command.
```bash
$ python scripts/export_identity_neural_network_new_opset.py
```
To export an ONNX model with ONNX Opset 14 or below, please run the following command.
```bash
$ python scripts/export_identity_neural_network_old_opset.py
```
The ONNX model will be saved as `identity_neural_network.onnx` under the `data` directory.
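In either case, the regular `Conv` nodes are identity operations purely because of how their weights are constructed. One simple way to achieve this (shown here for illustration, assuming a 1x1 kernel; the repository's scripts may orchestrate the weights differently) is an identity matrix across the channel dimension, so each output channel copies the matching input channel:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: weights of shape (channels, channels, 1, 1) with
// W[o][i][0][0] = (o == i) ? 1 : 0, making the convolution y = W * x = x.
std::vector<float> makeIdentityConvWeights(int32_t channels)
{
    std::vector<float> weights(
        static_cast<size_t>(channels) * channels, 0.0F);
    for (int32_t c{0}; c < channels; ++c)
    {
        weights[static_cast<size_t>(c) * channels + c] = 1.0F;
    }
    return weights;
}
```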
To build the TensorRT engine from the ONNX model, please run the following command.
```bash
$ ./build/src/apps/build_engine data/identity_neural_network.onnx build/src/plugins/IdentityConvIPluginV2IOExt/libidentity_conv_iplugin_v2_io_ext.so data/identity_neural_network_iplugin_v2_io_ext.engine
$ ./build/src/apps/build_engine data/identity_neural_network.onnx build/src/plugins/IdentityConvIPluginV3/libidentity_conv_iplugin_v3.so data/identity_neural_network_iplugin_v3.engine
```
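Conceptually, the engine builder does little more than load the plugin library, parse the ONNX file, and serialize the network. A rough sketch of this flow, assuming the plugin library registers its plugin creators on load (e.g., via `REGISTER_TENSORRT_PLUGIN`); error handling is omitted and names are illustrative:

```cpp
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <dlfcn.h>
#include <memory>

std::unique_ptr<nvinfer1::IHostMemory> buildEngine(char const* onnxPath,
                                                   char const* pluginLibPath,
                                                   nvinfer1::ILogger& logger)
{
    // Loading the shared library runs its static registration code, which
    // makes the custom plugin creator visible to the ONNX parser.
    dlopen(pluginLibPath, RTLD_LAZY);

    auto builder = std::unique_ptr<nvinfer1::IBuilder>{
        nvinfer1::createInferBuilder(logger)};
    auto network = std::unique_ptr<nvinfer1::INetworkDefinition>{
        builder->createNetworkV2(0)};
    auto parser = std::unique_ptr<nvonnxparser::IParser>{
        nvonnxparser::createParser(*network, logger)};
    // The parser maps the IdentityConv node onto the registered plugin.
    parser->parseFromFile(
        onnxPath, static_cast<int32_t>(nvinfer1::ILogger::Severity::kINFO));

    auto config = std::unique_ptr<nvinfer1::IBuilderConfig>{
        builder->createBuilderConfig()};
    return std::unique_ptr<nvinfer1::IHostMemory>{
        builder->buildSerializedNetwork(*network, *config)};
}
```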
To run the TensorRT engine, please run the following command.
```bash
$ ./build/src/apps/run_engine build/src/plugins/IdentityConvIPluginV2IOExt/libidentity_conv_iplugin_v2_io_ext.so data/identity_neural_network_iplugin_v2_io_ext.engine
$ ./build/src/apps/run_engine build/src/plugins/IdentityConvIPluginV3/libidentity_conv_iplugin_v3.so data/identity_neural_network_iplugin_v3.engine
```
If the custom plugin implementation and integration are correct, the output of the TensorRT engine should be the same as the input.
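As a sketch of what such a check can look like with the TensorRT 10 runtime API: the tensor names `input`/`output` and the FP32 element count below are assumptions, and the plugin library must already be loaded (e.g., via `dlopen`, as above) before the engine is deserialized.

```cpp
#include <NvInferRuntime.h>
#include <cuda_runtime_api.h>
#include <cstring>
#include <memory>
#include <vector>

bool runAndVerify(void const* engineData, size_t engineSize,
                  nvinfer1::ILogger& logger, size_t numFloats)
{
    auto runtime = std::unique_ptr<nvinfer1::IRuntime>{
        nvinfer1::createInferRuntime(logger)};
    auto engine = std::unique_ptr<nvinfer1::ICudaEngine>{
        runtime->deserializeCudaEngine(engineData, engineSize)};
    auto context = std::unique_ptr<nvinfer1::IExecutionContext>{
        engine->createExecutionContext()};

    // Fill the input with arbitrary host data and copy it to the device.
    std::vector<float> hostInput(numFloats, 1.0F);
    std::vector<float> hostOutput(numFloats, 0.0F);
    size_t const numBytes{numFloats * sizeof(float)};
    void* dInput{nullptr};
    void* dOutput{nullptr};
    cudaMalloc(&dInput, numBytes);
    cudaMalloc(&dOutput, numBytes);
    cudaMemcpy(dInput, hostInput.data(), numBytes, cudaMemcpyHostToDevice);

    // Bind tensors by name (assumed names) and run inference.
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    context->setTensorAddress("input", dInput);
    context->setTensorAddress("output", dOutput);
    context->enqueueV3(stream);
    cudaStreamSynchronize(stream);

    // The identity network should reproduce the input bit-for-bit.
    cudaMemcpy(hostOutput.data(), dOutput, numBytes, cudaMemcpyDeviceToHost);
    bool const ok{std::memcmp(hostInput.data(), hostOutput.data(),
                              numBytes) == 0};
    cudaFree(dInput);
    cudaFree(dOutput);
    cudaStreamDestroy(stream);
    return ok;
}
```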