[Refactor] 2 basic demos and all related documents
zgjja committed Oct 28, 2024
1 parent 3a40bbc commit 33d54ea
Showing 32 changed files with 1,229 additions and 980 deletions.
20 changes: 20 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
cmake_minimum_required(VERSION 3.14)

project(
tensorrtx
VERSION 0.1
LANGUAGES C CXX CUDA)

set(TensorRT_7_8_10_TARGETS mlp lenet)

set(TensorRT_8_TARGETS)

set(TensorRT_10_TARGETS)

set(ALL_TARGETS ${TensorRT_7_8_10_TARGETS} ${TensorRT_8_TARGETS}
${TensorRT_10_TARGETS})

foreach(sub_dir ${ALL_TARGETS})
message(STATUS "Add subdirectory: ${sub_dir}")
add_subdirectory(${sub_dir})
endforeach()
77 changes: 68 additions & 9 deletions README.md
@@ -16,7 +16,7 @@ The basic workflow of TensorRTx is:
## News

- `22 Oct 2024`. [lindsayshuo](https://github.com/lindsayshuo): YOLOv8-obb
- `18 Oct 2024`. [zgjja](https://github.com/zgjja): Rafactor docker image.
- `18 Oct 2024`. [zgjja](https://github.com/zgjja): Refactor docker image.
- `11 Oct 2024`. [mpj1234](https://github.com/mpj1234): YOLO11
- `9 Oct 2024`. [Phoenix8215](https://github.com/Phoenix8215): GhostNet V1 and V2.
- `21 Aug 2024`. [Lemonononon](https://github.com/Lemonononon): real-esrgan-general-x4v3
@@ -38,7 +38,7 @@ The basic workflow of TensorRTx is:
- [A guide for quickly getting started, taking lenet5 as a demo.](./tutorials/getting_started.md)
- [The .wts file content format](./tutorials/getting_started.md#the-wts-content-format)
- [Frequently Asked Questions (FAQ)](./tutorials/faq.md)
- [Migrating from TensorRT 4 to 7](./tutorials/migrating_from_tensorrt_4_to_7.md)
- [Migration Guide](./tutorials/migration_guide.md)
- [How to implement multi-GPU processing, taking YOLOv4 as example](./tutorials/multi_GPU_processing.md)
- [Check if Your GPU support FP16/INT8](./tutorials/check_fp16_int8_support.md)
- [How to Compile and Run on Windows](./tutorials/run_on_windows.md)
@@ -47,21 +47,80 @@ The basic workflow of TensorRTx is:

## Test Environment

1. TensorRT 7.x
2. TensorRT 8.x(Some of the models support 8.x)
1. (**NOT recommended**) TensorRT 7.x
2. (**Recommended**) TensorRT 8.x
3. (**NOT recommended**) TensorRT 10.x

### Note

1. For historical reasons, some models are limited to specific TensorRT versions; please check the README.md or the code of the model you want to use.
2. Currently, TensorRT 8.x has the best compatibility and supports the most features.

## How to run

Each folder has a readme inside, which explains how to run the models inside.
**Note**: this project supports building each network with the `CMakeLists.txt` in its subfolder, or building them all together with the `CMakeLists.txt` at the top of this project.

* General procedures before building and running:

```bash
# 1. generate xxx.wts from https://github.com/wang-xinyu/pytorchx/tree/master/lenet
# ...

# 2. put xxx.wts on top of this folder
# ...
```

* (*Option 1*) To build a single subproject in this project, do:

```bash
## enter the subfolder
cd tensorrtx/xxx

## configure & build
cmake -S . -B build
make -C build
```

* (*Option 2*) To build many subprojects, first edit the top-level `CMakeLists.txt` and **comment out** the projects you don't want to build or that are not supported by your TensorRT version, e.g., you cannot build the subprojects in `${TensorRT_8_TARGETS}` if your TensorRT is `7.x`. Then:

```bash
## enter the top of this project
cd tensorrtx

## configure & build
# you may use "Ninja" rather than "make" to significantly boost the build speed
cmake -G Ninja -S . -B build
ninja -C build
```

**WARNING**: This part is still under development; most subprojects are not adapted yet.
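
As a rough illustration of the version rule above, the sketch below maps a TensorRT major version to the target groups defined in the top-level `CMakeLists.txt`. The function name `groups_for_trt_major` is hypothetical, and the mapping assumes that each group name lists the versions it supports, which the naming suggests but the source does not state outright.

```bash
# Hypothetical helper (not part of this repo): print the target groups from the
# top-level CMakeLists.txt that are assumed to be buildable for a given
# TensorRT major version.
groups_for_trt_major() {
    case "$1" in
        7)  echo "TensorRT_7_8_10_TARGETS" ;;
        8)  echo "TensorRT_7_8_10_TARGETS TensorRT_8_TARGETS" ;;
        10) echo "TensorRT_7_8_10_TARGETS TensorRT_10_TARGETS" ;;
        *)  echo "unsupported TensorRT major version: $1" >&2
            return 1 ;;
    esac
}
```

Any group that is not printed for your version is a candidate for commenting out before configuring.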

* Run the generated executable, e.g.:

```bash
# serialize model to plan file i.e. 'xxx.engine'
build/xxx -s

# deserialize plan file and run inference
build/xxx -d

# (Optional) check if the output is same as pytorchx/lenet
# ...

# (Optional) customize the project
# ...
```

For more details, see the `README.md` inside each subfolder, if it has one.
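
The run step can be wrapped in a small shell function, sketched here under the assumption that the executable accepts `-s` and `-d` as described above. `run_demo` is a hypothetical helper, not part of this repo; it takes the path to the built executable because that path depends on how the project was built.

```bash
# Hypothetical helper (not part of this repo): serialize the engine, then run
# inference, stopping at the first failing step.
run_demo() {
    exe="$1"
    if [ ! -x "$exe" ]; then
        echo "executable not found: $exe (build the subproject first)" >&2
        return 1
    fi
    "$exe" -s || return 1   # serialize model to plan file
    "$exe" -d               # deserialize plan file and run inference
}
```

For example, after building with Option 1 inside `tensorrtx/lenet`, the call would be `run_demo build/lenet`.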

## Models

The following models are implemented.

|Name | Description |
|-|-|
|[mlp](./mlp) | the very basic model for starters, properly documented |
|[lenet](./lenet) | the simplest, as a "hello world" of this project |
| Name | Description | Supported TensorRT Version |
|---------------|---------------|---------------|
|[mlp](./mlp) | the very basic model for starters, properly documented | 7.x/8.x/10.x |
|[lenet](./lenet) | the simplest, as a "hello world" of this project | 7.x/8.x/10.x |
|[alexnet](./alexnet)| easy to implement, all layers are supported in tensorrt |
|[googlenet](./googlenet)| GoogLeNet (Inception v1) |
|[inception](./inception)| Inception v3, v4 |
4 changes: 2 additions & 2 deletions docker/README.md
@@ -49,11 +49,11 @@ Change the `TAG` on top of the `.dockerfile`. Note: all images are officially ow

For more detail of the support matrix, please check [HERE](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html)

### How to customize opencv?
### How to customize OpenCV in the image?

If the prebuilt package from apt does not meet your requirements, refer to the demo code in the `.dockerfile` to build OpenCV from source.

### How to solve image build fail issues?
### How to solve failures when building the image?

For *443 timeout* or any similar network issues, a proxy may be required. To make your host proxy available to the docker build environment, change the `build` node inside the docker-compose file like this:
```YAML
2 changes: 1 addition & 1 deletion docker/tensorrtx-docker-compose.yml
@@ -1,6 +1,6 @@
services:
tensorrt:
image: tensortx:1.0.0
image: tensortx:1.0.1
container_name: tensortx
environment:
- NVIDIA_VISIBLE_DEVICES=all
7 changes: 5 additions & 2 deletions docker/x86_64.dockerfile
@@ -7,13 +7,16 @@ ENV DEBIAN_FRONTEND noninteractive
# basic tools
RUN apt update && apt-get install -y --fix-missing --no-install-recommends \
sudo wget curl git ca-certificates ninja-build tzdata pkg-config \
gdb libglib2.0-dev libmount-dev \
gdb libglib2.0-dev libmount-dev locales \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir yapf isort cmake-format pre-commit

## fix a potential pre-commit error
RUN locale-gen "en_US.UTF-8"

## override older cmake
RUN find /usr/local/share -type d -name "cmake-*" -exec rm -rf {} + \
&& curl -fsSL "https://github.com/Kitware/CMake/releases/download/v3.29.0/cmake-3.29.0-linux-x86_64.sh" \
&& curl -fsSL "https://github.com/Kitware/CMake/releases/download/v3.30.0/cmake-3.30.0-linux-x86_64.sh" \
-o cmake.sh && bash cmake.sh --skip-license --exclude-subdir --prefix=/usr/local && rm cmake.sh

RUN apt update && apt-get install -y \
72 changes: 43 additions & 29 deletions lenet/CMakeLists.txt
@@ -1,29 +1,43 @@
cmake_minimum_required(VERSION 2.6)

project(lenet)

add_definitions(-std=c++11)

set(TARGET_NAME "lenet")

option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Debug)

include_directories(${PROJECT_SOURCE_DIR}/include)
# include and link dirs of cuda and tensorrt, you need adapt them if yours are different
# cuda
include_directories(/usr/local/cuda/include)
link_directories(/usr/local/cuda/lib64)
# tensorrt
include_directories(/usr/include/x86_64-linux-gnu/)
link_directories(/usr/lib/x86_64-linux-gnu/)

FILE(GLOB SRC_FILES ${PROJECT_SOURCE_DIR}/lenet.cpp ${PROJECT_SOURCE_DIR}/include/*.h)

add_executable(${TARGET_NAME} ${SRC_FILES})
target_link_libraries(${TARGET_NAME} nvinfer)
target_link_libraries(${TARGET_NAME} cudart)

add_definitions(-O2 -pthread)

cmake_minimum_required(VERSION 3.17.0)

project(
lenet
VERSION 0.1
LANGUAGES C CXX CUDA)

if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
set(CMAKE_CUDA_ARCHITECTURES
60
70
72
75
80
86
89)
endif()

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CUDA_STANDARD 17)
set(CMAKE_CUDA_STANDARD_REQUIRED ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
set(CMAKE_INCLUDE_CURRENT_DIR TRUE)
set(CMAKE_BUILD_TYPE
"Debug"
CACHE STRING "Build type for this project" FORCE)

option(CUDA_USE_STATIC_CUDA_RUNTIME "Use static cudaruntime library" OFF)

find_package(Threads REQUIRED)
find_package(CUDAToolkit REQUIRED)

if(NOT TARGET TensorRT::TensorRT)
include(FindTensorRT.cmake)
else()
message("TensorRT has been found, skipping for ${PROJECT_NAME}")
endif()

add_executable(${PROJECT_NAME} lenet.cpp)

target_link_libraries(${PROJECT_NAME} PUBLIC Threads::Threads CUDA::cudart
TensorRT::TensorRT)
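
Because the default architecture list above is only applied when `CMAKE_CUDA_ARCHITECTURES` is not already defined, a value passed on the command line with `-D` takes precedence. The sketch below only assembles such a configure command and prints it; `configure_cmd` is a hypothetical helper and `86` is just an example value.

```bash
# Hypothetical helper (not part of this repo): print a cmake configure command
# for one subproject, optionally pinning the CUDA architecture (e.g. 86)
# instead of relying on the default list in its CMakeLists.txt.
configure_cmd() {
    src_dir="$1"
    cuda_arch="${2:-}"
    cmd="cmake -S $src_dir -B $src_dir/build"
    if [ -n "$cuda_arch" ]; then
        cmd="$cmd -DCMAKE_CUDA_ARCHITECTURES=$cuda_arch"
    fi
    echo "$cmd"
}

# prints: cmake -S lenet -B lenet/build -DCMAKE_CUDA_ARCHITECTURES=86
configure_cmd lenet 86
```

Building for a single architecture that matches the local GPU usually shortens compile time compared with the full list.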
79 changes: 79 additions & 0 deletions lenet/FindTensorRT.cmake
@@ -0,0 +1,79 @@
cmake_minimum_required(VERSION 3.17.0)

set(TRT_VERSION
$ENV{TRT_VERSION}
CACHE STRING
"TensorRT version, e.g. \"8.6.1.6\" or \"8.6.1.6+cuda12.0.1.011\"")

# find TensorRT include folder
if(NOT TensorRT_INCLUDE_DIR)
if(CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64")
set(TensorRT_INCLUDE_DIR
"/usr/local/cuda/targets/aarch64-linux/include"
CACHE PATH "TensorRT_INCLUDE_DIR")
else()
set(TensorRT_INCLUDE_DIR
"/usr/include/x86_64-linux-gnu"
CACHE PATH "TensorRT_INCLUDE_DIR")
endif()
message(STATUS "TensorRT: ${TensorRT_INCLUDE_DIR}")
endif()

# find TensorRT library folder
if(NOT TensorRT_LIBRARY_DIR)
if(CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64")
set(TensorRT_LIBRARY_DIR
"/usr/lib/aarch64-linux-gnu/tegra"
CACHE PATH "TensorRT_LIBRARY_DIR")
else()
set(TensorRT_LIBRARY_DIR
"/usr/lib/x86_64-linux-gnu"
CACHE PATH "TensorRT_LIBRARY_DIR")
endif()
message(STATUS "TensorRT: ${TensorRT_LIBRARY_DIR}")
endif()

set(TensorRT_LIBRARIES)

# process for different TensorRT version
if(DEFINED TRT_VERSION AND NOT TRT_VERSION STREQUAL "")
string(REGEX MATCH "([0-9]+)" _match ${TRT_VERSION})
set(TRT_MAJOR_VERSION "${_match}")
set(_modules nvinfer nvinfer_plugin)
unset(_match)

if(TRT_MAJOR_VERSION GREATER_EQUAL 8)
list(APPEND _modules nvinfer_vc_plugin nvinfer_dispatch nvinfer_lean)
endif()
else()
message(FATAL_ERROR "Please set an environment variable \"TRT_VERSION\"")
endif()

# find and add all modules of TensorRT into list
foreach(lib IN LISTS _modules)
find_library(
TensorRT_${lib}_LIBRARY
NAMES ${lib}
HINTS ${TensorRT_LIBRARY_DIR})
list(APPEND TensorRT_LIBRARIES ${TensorRT_${lib}_LIBRARY})
endforeach()

# report the libraries only after the list has been filled
message(STATUS "Found TensorRT lib: ${TensorRT_LIBRARIES}")

# make the "TensorRT target"
add_library(TensorRT IMPORTED INTERFACE)
add_library(TensorRT::TensorRT ALIAS TensorRT)
target_link_libraries(TensorRT INTERFACE ${TensorRT_LIBRARIES})

set_target_properties(
TensorRT
PROPERTIES C_STANDARD 17
CXX_STANDARD 17
POSITION_INDEPENDENT_CODE ON
SKIP_BUILD_RPATH TRUE
BUILD_WITH_INSTALL_RPATH TRUE
INSTALL_RPATH "$\{ORIGIN\}"
INTERFACE_INCLUDE_DIRECTORIES "${TensorRT_INCLUDE_DIR}")

unset(TRT_MAJOR_VERSION)
unset(_modules)
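
The version handling in `FindTensorRT.cmake` can be mirrored in a shell pre-check, so that a missing `TRT_VERSION` is caught before cmake stops with its fatal error. This is only a sketch: `trt_major` and `check_trt_version` are hypothetical names, and the `sed` expression is an assumed equivalent of the `([0-9]+)` match, i.e. the first run of digits in the string.

```bash
# Hypothetical helper: print the first run of digits in a TensorRT version
# string, e.g. "8" for "8.6.1.6+cuda12.0.1.011". Prints nothing if the string
# contains no digit.
trt_major() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: fail early when TRT_VERSION is unset or has no digits,
# otherwise print the major version.
check_trt_version() {
    if [ -z "${TRT_VERSION:-}" ]; then
        echo "TRT_VERSION is not set, e.g. export TRT_VERSION=8.6.1.6" >&2
        return 1
    fi
    major=$(trt_major "$TRT_VERSION")
    if [ -z "$major" ]; then
        echo "cannot parse TRT_VERSION: $TRT_VERSION" >&2
        return 1
    fi
    echo "$major"
}
```

A build script could call `check_trt_version` first and only run the cmake configure step when it succeeds.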
44 changes: 16 additions & 28 deletions lenet/README.md
@@ -1,36 +1,22 @@
# lenet5

lenet5 is the simplest net in this tensorrtx project. You can learn the basic procedures of building tensorrt app from API. Including `define network`, `build engine`, `set output`, `do inference`, `serialize model to file`, `deserialize model from file`, etc.
lenet5 is one of the simplest nets in this repo. It shows the basic procedure for building a CNN with the TensorRT API. This demo includes 2 major steps:

## TensorRT C++ API

```
// 1. generate lenet5.wts from https://github.com/wang-xinyu/pytorchx/tree/master/lenet
// 2. put lenet5.wts into tensorrtx/lenet
// 3. build and run
cd tensorrtx/lenet
mkdir build
1. Build engine
* define network
* set input/output
* serialize model to `.engine` file
2. Do inference
* load and deserialize model from `.engine` file
* run inference

cd build
cmake ..
make
sudo ./lenet -s // serialize model to plan file i.e. 'lenet5.engine'
sudo ./lenet -d // deserialize plan file and run inference
## TensorRT C++ API

// 4. see if the output is same as pytorchx/lenet
```
See [HERE](../README.md#how-to-run).

## TensorRT Python API

```
```bash
# 1. generate lenet5.wts from https://github.com/wang-xinyu/pytorchx/tree/master/lenet

# 2. put lenet5.wts into tensorrtx/lenet
@@ -39,9 +25,11 @@ sudo ./lenet -d // deserialize plan file and run inference

cd tensorrtx/lenet

python lenet.py -s # serialize model to plan file, i.e. 'lenet5.engine'
# 4.1 serialize model to plan file, i.e. 'lenet5.engine'
python lenet.py -s

python lenet.py -d # deserialize plan file and run inference
# 4.2 deserialize plan file and run inference
python lenet.py -d

# 4. see if the output is same as pytorchx/lenet
# 5. (Optional) see if the output is same as pytorchx/lenet
```