Real-time inference using TensorRT.
- convert ONNX models to trt engine files on the target hardware
- run YOLO models on the target hardware (the trt engine file is created automatically)
- run a serialized trt engine file on the target hardware
- Download a YOLO model
- Update the Makefile with your ARCH_BIN (see the arch_bin section below for details)
- Start the docker container
make build
make run
- Place your images in src/images
- Build the code
# inside the docker container
cd /src
mkdir build && cd build
cmake ..
make -j -l4
- Alternatively, to build an individual module on its own, follow these steps
# go to module of interest
$ cd /src/engine
# create build directory
$ mkdir build && cd build
# build the project
$ cmake .. && make -j
modules
The main CMakeLists.txt builds the following modules:
converter ----> converts a yolo ONNX model to a serialized TensorRT engine file (trt engine file)
engine ----> runs a trt engine file
yolo ----> runs a yolo model (converts it to a trt engine and runs it)
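For orientation, an ONNX-to-TensorRT conversion like the one the converter module performs boils down to roughly the following with the TensorRT C++ API. This is a minimal sketch, not the repo's actual code; the file names and the Logger class are placeholders.

// build (roughly): g++ onnx2trt_sketch.cpp -o onnx2trt_sketch -lnvinfer -lnvonnxparser
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <fstream>
#include <iostream>
#include <memory>

// Minimal logger required by the TensorRT builder.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;

    // Create a builder and an explicit-batch network definition.
    auto builder = std::unique_ptr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(logger));
    const auto flags = 1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = std::unique_ptr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(flags));

    // Parse the ONNX model into the network (file name is an example).
    auto parser = std::unique_ptr<nvonnxparser::IParser>(nvonnxparser::createParser(*network, logger));
    if (!parser->parseFromFile("model.onnx", static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        std::cerr << "failed to parse the onnx model" << std::endl;
        return 1;
    }

    // Build a serialized engine; request FP16 when the GPU supports it.
    auto config = std::unique_ptr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
    if (builder->platformHasFastFp16()) config->setFlag(nvinfer1::BuilderFlag::kFP16);
    auto serialized = std::unique_ptr<nvinfer1::IHostMemory>(builder->buildSerializedNetwork(*network, *config));

    // Write the serialized engine (the trt engine file) to disk.
    std::ofstream out("model.engine", std::ios::binary);
    out.write(static_cast<const char*>(serialized->data()), serialized->size());
    return 0;
}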
When you build from the main directory, the outputs look like this:
/src/build# tree -L 2 -I 'CMakeFiles'
.
|-- CMakeCache.txt
|-- Makefile
|-- cmake_install.cmake
|-- converter
| |-- Makefile
| |-- cmake_install.cmake
| `-- onnx2trt <<<<<<<<<< convert onnx 2 trt
|-- engine
| |-- Makefile
| |-- cmake_install.cmake
| `-- engine <<<<<<<<<< run serialized engine file
`-- yolo
|-- Makefile
|-- cmake_install.cmake
|-- detectImage <<<<<<<<<< run object detection on an image with a yolo model
|-- detectWebcam <<<<<<<<<< run object detection on a webcam with a yolo model
|-- libyolo.so
`-- profile <<<<<<<<<< measure yolo model execution time when running detection on an image
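Running a serialized engine file (what the engine binary does) follows this general flow with the TensorRT C++ API. Again a sketch under assumed static shapes and float inputs/outputs, not the repo's implementation:

// build (roughly): g++ run_engine_sketch.cpp -o run_engine_sketch -lnvinfer -lcudart
#include <NvInfer.h>
#include <cuda_runtime.h>
#include <fstream>
#include <iostream>
#include <iterator>
#include <memory>
#include <vector>

// Minimal logger required by the TensorRT runtime.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;

    // Read the serialized engine from disk (file name is an example).
    std::ifstream file("model.engine", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());

    // Deserialize into an ICudaEngine and create an execution context.
    auto runtime = std::unique_ptr<nvinfer1::IRuntime>(nvinfer1::createInferRuntime(logger));
    auto engine = std::unique_ptr<nvinfer1::ICudaEngine>(runtime->deserializeCudaEngine(blob.data(), blob.size()));
    auto context = std::unique_ptr<nvinfer1::IExecutionContext>(engine->createExecutionContext());

    // Allocate one device buffer per binding (assumes static shapes and float tensors).
    std::vector<void*> bindings(engine->getNbBindings());
    for (int i = 0; i < engine->getNbBindings(); ++i) {
        auto dims = engine->getBindingDimensions(i);
        size_t count = 1;
        for (int d = 0; d < dims.nbDims; ++d) count *= dims.d[d];
        cudaMalloc(&bindings[i], count * sizeof(float));
    }

    // Preprocess and copy the input into its binding (omitted), run inference,
    // then copy the output bindings back to the host for postprocessing (omitted).
    const bool ok = context->executeV2(bindings.data());
    std::cout << (ok ? "inference ok" : "inference failed") << std::endl;

    for (void* ptr : bindings) cudaFree(ptr);
    return 0;
}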
arch_bin
The Dockerfile has an ARG ARCH_BIN that is used to build OpenCV with CUDA support.
Check the NVIDIA docs to match your GPU and set ARCH_BIN in the Makefile accordingly.
# here we have GeForce GTX 1050. The docs label it as ARCH_BIN=6.1
$ nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 1050 Off | 00000000:01:00.0 Off | N/A |
...
...
+---------------------------------------------------------------------------------------+
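If you prefer to query the compute capability directly instead of looking the GPU up in the docs, the CUDA runtime reports it as major.minor, which is exactly the value ARCH_BIN expects. A small standalone sketch (file name is arbitrary):

// build (roughly): nvcc check_arch.cu -o check_arch
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop{};
        cudaGetDeviceProperties(&prop, i);
        // Compute capability major.minor is the ARCH_BIN value (e.g. 6.1 for a GTX 1050).
        std::printf("GPU %d: %s -> ARCH_BIN=%d.%d\n", i, prop.name, prop.major, prop.minor);
    }
    return 0;
}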
version check
- check your versions (inside the docker container)
# TensorRT version
$ find / -name NvInferVersion.h -type f
/usr/include/x86_64-linux-gnu/NvInferVersion.h
# this displays TensorRT version 8.6.1
$ cat /usr/include/x86_64-linux-gnu/NvInferVersion.h | grep NV_TENSORRT | head -n 3
#define NV_TENSORRT_MAJOR 8 //!< TensorRT major version.
#define NV_TENSORRT_MINOR 6 //!< TensorRT minor version.
#define NV_TENSORRT_PATCH 1 //!< TensorRT patch version.
# this displays cuDNN version 8.9.1
$ cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2 | head -n 3
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 9
#define CUDNN_PATCHLEVEL 1
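You can also confirm at runtime that the linked libnvinfer matches those headers. getInferLibVersion() returns the library version as a single integer; a minimal sketch, assuming the TensorRT 8.x encoding (major*1000 + minor*100 + patch):

// build (roughly): g++ version_check.cpp -o version_check -lnvinfer
#include <NvInfer.h>
#include <cstdio>

int main() {
    // Version the headers were compiled against (8.6.1 in this container).
    std::printf("headers: %d.%d.%d\n", NV_TENSORRT_MAJOR, NV_TENSORRT_MINOR, NV_TENSORRT_PATCH);
    // Version of the libnvinfer actually linked; 8.6.1 is reported as 8601 under the 8.x scheme.
    std::printf("library: %d\n", getInferLibVersion());
    return 0;
}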
tao converter
# make run puts you inside the docker container
# before running this, check the README.txt in /src/scripts/tao-converter, install any dependencies, and set the required paths
/tmp/tao-converter# export MODEL_PATH=~/path/to/folder
/tmp/tao-converter# export MODEL=replace_with_model_name
/tmp/tao-converter# export KEY=replace_with_nvidia_key
/tmp/tao-converter# ./tao-converter -k "${KEY}" -t fp16 -e "${MODEL_PATH}/${MODEL}.engine" -o output "${MODEL_PATH}/${MODEL}.etlt"
[INFO] ----------------------------------------------------------------
[INFO] Input filename: /tmp/filer9wcjU
[INFO] ONNX IR version: 0.0.7
[INFO] Opset version: 13
[INFO] Producer name: pytorch
[INFO] Producer version: 1.10
trtexec
- ask for help
$ /usr/src/tensorrt/bin/trtexec --help
- profile model speed
# load an onnx file
$ export MODEL_PATH=/path/to/folder
$ export ONNX_NAME=model.onnx
$ export TRT_NAME=model.engine
$ /usr/src/tensorrt/bin/trtexec --onnx="${MODEL_PATH}/${ONNX_NAME}" --iterations=5 --workspace=4096
# load in a trt engine file
$ /usr/src/tensorrt/bin/trtexec --loadEngine="${MODEL_PATH}/${TRT_NAME}" --fp16 --batch=1 --iterations=50 --workspace=4096
# save logs to a file
$ /usr/src/tensorrt/bin/trtexec --loadEngine="${MODEL_PATH}/${TRT_NAME}" --fp16 --batch=1 --iterations=50 --workspace=4096 > stats.log
- model conversion
$ export MODEL_PATH=/path/to/folder
$ export MODEL_NAME=model
# convert the model to FP16 (if supported on hardware)
$ /usr/src/tensorrt/bin/trtexec --onnx="${MODEL_PATH}/${MODEL_NAME}.onnx" --saveEngine="${MODEL_PATH}/${MODEL_NAME}_fp16.engine" --useCudaGraph --fp16 > "${MODEL_NAME}_fp16.log"
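To check the "if supported on hardware" part before converting, the TensorRT builder can report whether the GPU has fast native FP16 (and INT8) support. A short sketch:

// build (roughly): g++ fp16_check.cpp -o fp16_check -lnvinfer
#include <NvInfer.h>
#include <cstdio>
#include <memory>

// No-op logger just to satisfy the builder interface.
class Logger : public nvinfer1::ILogger {
    void log(Severity, const char*) noexcept override {}
};

int main() {
    Logger logger;
    auto builder = std::unique_ptr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(logger));
    // Reports whether the GPU has fast native FP16 / INT8 kernels.
    std::printf("fast FP16: %s\n", builder->platformHasFastFp16() ? "yes" : "no");
    std::printf("fast INT8: %s\n", builder->platformHasFastInt8() ? "yes" : "no");
    return 0;
}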