
Commit 51c8ce8

transfer from gitlab (initial commit, 0 parents)


58 files changed, +9338 -0 lines changed

.gitignore (+2)

*/build
*/*.wts

README.md (+57)
# TensorRTx

TensorRTx aims to implement popular deep learning networks with the TensorRT API. TensorRT has built-in parsers (caffeparser, uffparser, onnxparser, etc.), but these parsers often run into "unsupported operation or layer" problems, especially with state-of-the-art models that use new types of layers. In those cases we sometimes have no choice but to implement the models with the TensorRT network definition APIs.
I wrote this project to get familiar with the TensorRT API, and also to share with and learn from the community.
TensorRTx has a companion project, [Pytorchx](https://github.com/wang-xinyu/pytorchx). Each model is implemented in PyTorch first and exported as a weights file (`xxx.wts`); TensorRT then loads the weights, defines the network, and runs inference. A minimal sketch of the weight-loading side follows.
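The sketch below assumes the plain-text `.wts` layout used here (first line is the tensor count, then one line per tensor: name, element count, and hex-encoded float bits); each model's own `loadWeights()` is the authoritative version, and this is only an illustration.

```
// Minimal .wts loader sketch. Assumed format: a leading tensor count, then
// one line per tensor: "<name> <count> <hex float bits> ...".
#include <cstdint>
#include <fstream>
#include <map>
#include <string>
#include "NvInfer.h"

std::map<std::string, nvinfer1::Weights> loadWeights(const std::string& file) {
    std::map<std::string, nvinfer1::Weights> weightMap;
    std::ifstream input(file);
    int32_t count;
    input >> count;  // number of weight tensors in the file
    while (count--) {
        std::string name;
        uint32_t size;
        input >> name >> std::dec >> size;
        uint32_t* val = new uint32_t[size];  // raw float bits; must outlive engine build
        for (uint32_t i = 0; i < size; ++i) {
            input >> std::hex >> val[i];
        }
        weightMap[name] = nvinfer1::Weights{nvinfer1::DataType::kFLOAT, val, size};
    }
    return weightMap;
}
```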
## Test Environment

- Jetson TX1
- Ubuntu 16.04
- CUDA 9.0
- cuDNN 7.1.5
- TensorRT 4.0.2 / nvinfer 4.1.3
Currently I have only tested on the TX1, but it should be fine on a TX2 with the same software versions, and it should be easy to port to x86.
## Models

The following models are implemented; each has a README inside its directory.
| Name | Description |
|------|-------------|
| [lenet](./lenet) | the simplest, a "hello world" for this project |
| [alexnet](./alexnet) | easy to implement, all layers are supported in TensorRT |
| [googlenet](./googlenet) | GoogLeNet (Inception v1) |
| [inception](./inception) | Inception v3 |
| [mnasnet](./mnasnet) | MNASNet with depth multiplier of 0.5 from the paper |
| [mobilenet](./mobilenet) | MobileNet V2 |
| [resnet](./resnet) | ResNet-18 and ResNet-50 are implemented |
| [shufflenet](./shufflenet) | ShuffleNetV2 with 0.5x output channels |
| [squeezenet](./squeezenet) | SqueezeNet 1.1 model |
| [vgg](./vgg) | VGG 11-layer model |
| [yolov3](./yolov3) | Darknet-53, weights from the YOLOv3 authors |
## Tricky Operations

Some tricky operations encountered in these models are listed below. They are solved here, but there may be better solutions. Sketches of the BatchNorm and relu6 tricks follow the table.
44+
45+
|Name | Description |
46+
|-|-|
47+
|BatchNorm| Implement by a scale layer, used in resnet, googlenet, mobilenet, etc. |
48+
|MaxPool2d(ceil_mode=True)| use a padding layer before maxpool to solve ceil_mode=True, see googlenet. |
49+
|average pool with padding| use setAverageCountExcludesPadding() when necessary, see inception. |
50+
|relu6| use `Relu6(x) = Relu(x) - Relu(x-6)`, see mobilenet. |
51+
|torch.chunk()| implement the 'chunk(2, dim=C)' by tensorrt plugin, see shufflenet. |
52+
|channel shuffle| use two shuffle layers to implement `channel_shuffle`, see shufflenet. |
53+
|adaptive pool| use fixed input dimension, and use regular average pooling, see shufflenet. |
54+
|leaky relu| I wrote a leaky relu plugin, but PRelu in `NvInferPlugin.h` can be used, see yolov3. |
55+
|yolo layer| yolo layer is implemented as a plugin, see yolov3. |
56+
|upsample| replaced by a deconvolution layer, see yolov3. |
57+
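For BatchNorm, the normalization `y = gamma * (x - mean) / sqrt(var + eps) + beta` folds into a single scale layer with `scale = gamma / sqrt(var + eps)` and `shift = beta - mean * scale`. A minimal sketch, assuming the gamma/beta/mean/var buffers have already been read from the `.wts` file; the helper name and memory handling are illustrative:

```
// Fold BatchNorm into a TensorRT scale layer: y = (x * scale + shift) ^ power.
#include <cmath>
#include "NvInfer.h"

using namespace nvinfer1;

IScaleLayer* addBatchNorm(INetworkDefinition* network, ITensor& input,
                          const float* gamma, const float* beta,
                          const float* mean, const float* var,
                          int channels, float eps = 1e-5f) {
    float* scaleVal = new float[channels];  // must stay alive until the engine is built
    float* shiftVal = new float[channels];
    for (int i = 0; i < channels; i++) {
        scaleVal[i] = gamma[i] / sqrtf(var[i] + eps);
        shiftVal[i] = beta[i] - mean[i] * scaleVal[i];
    }
    Weights scale{DataType::kFLOAT, scaleVal, channels};
    Weights shift{DataType::kFLOAT, shiftVal, channels};
    Weights power{DataType::kFLOAT, nullptr, 0};  // empty weights default to identity
    return network->addScale(input, ScaleMode::kCHANNEL, shift, scale, power);
}
```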
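Similarly, the relu6 identity from the table can be built from stock layers: two activations, a uniform scale layer for the `x - 6` shift, and an elementwise subtraction. A minimal sketch; the helper name is illustrative:

```
// Relu6(x) = Relu(x) - Relu(x - 6), built from stock TensorRT layers.
#include "NvInfer.h"

using namespace nvinfer1;

ITensor* addRelu6(INetworkDefinition* network, ITensor& input) {
    // Relu(x)
    IActivationLayer* relu1 = network->addActivation(input, ActivationType::kRELU);

    // x - 6 via a uniform scale layer: y = (x * scale + shift) ^ power
    float* shiftVal = new float(-6.0f);  // must stay alive until the engine is built
    Weights shift{DataType::kFLOAT, shiftVal, 1};
    Weights scale{DataType::kFLOAT, nullptr, 0};  // empty weights default to identity
    Weights power{DataType::kFLOAT, nullptr, 0};
    IScaleLayer* shifted = network->addScale(input, ScaleMode::kUNIFORM, shift, scale, power);

    // Relu(x - 6)
    IActivationLayer* relu2 =
        network->addActivation(*shifted->getOutput(0), ActivationType::kRELU);

    // Relu(x) - Relu(x - 6)
    IElementWiseLayer* diff = network->addElementWise(
        *relu1->getOutput(0), *relu2->getOutput(0), ElementWiseOperation::kSUB);
    return diff->getOutput(0);
}
```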

alexnet/CMakeLists.txt (+20)
cmake_minimum_required(VERSION 2.6)

project(alexnet)

add_definitions(-std=c++11)

option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Debug)

include_directories(${PROJECT_SOURCE_DIR}/include)
include_directories(/usr/local/cuda-9.0/targets/aarch64-linux/include)
link_directories(/usr/local/cuda-9.0/targets/aarch64-linux/lib)

add_executable(alexnet ${PROJECT_SOURCE_DIR}/alex.cpp)
target_link_libraries(alexnet nvinfer)
target_link_libraries(alexnet cudart)

add_definitions(-O2 -pthread)

alexnet/README.md (+33)
# alexnet

AlexNet model architecture from the ["One weird trick..."](https://arxiv.org/abs/1404.5997) paper.
For the details, refer to [pytorchx/alexnet](https://github.com/wang-xinyu/pytorchx/tree/master/alexnet).
This alexnet is just several `conv-relu-pool` blocks followed by several `fc-relu` blocks, nothing special. All layers can be implemented with the TensorRT API: `addConvolution`, `addActivation`, `addPooling`, and `addFullyConnected`. A sketch of one such block follows.
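Below is a minimal sketch of one `conv-relu-pool` block with the TensorRT C++ API, using AlexNet's first block as the example. The weight names (`features.0.weight`, `features.0.bias`) follow PyTorch's naming but are illustrative here, not copied from `alex.cpp`.

```
// One conv-relu-pool block, TensorRT network definition API.
#include <map>
#include <string>
#include "NvInfer.h"

using namespace nvinfer1;

ITensor* convReluPool(INetworkDefinition* network,
                      std::map<std::string, Weights>& weightMap,
                      ITensor& input) {
    // 11x11 conv, 64 output maps, stride 4, padding 2 (AlexNet conv1)
    IConvolutionLayer* conv = network->addConvolution(
        input, 64, DimsHW{11, 11},
        weightMap["features.0.weight"], weightMap["features.0.bias"]);
    conv->setStride(DimsHW{4, 4});
    conv->setPadding(DimsHW{2, 2});

    // ReLU activation
    IActivationLayer* relu =
        network->addActivation(*conv->getOutput(0), ActivationType::kRELU);

    // 3x3 max pooling with stride 2
    IPoolingLayer* pool = network->addPooling(
        *relu->getOutput(0), PoolingType::kMAX, DimsHW{3, 3});
    pool->setStride(DimsHW{2, 2});

    return pool->getOutput(0);
}
```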
```
// 1. generate alexnet.wts from [pytorchx/alexnet](https://github.com/wang-xinyu/pytorchx/tree/master/alexnet)

// 2. put alexnet.wts into tensorrtx/alexnet

// 3. build and run

cd tensorrtx/alexnet

mkdir build

cd build

cmake ..

make

sudo ./alexnet -s   // serialize model to plan file, i.e. 'alexnet.engine'

sudo ./alexnet -d   // deserialize plan file and run inference

// 4. see if the output is the same as pytorchx/alexnet
```
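As a hedged sketch of what the `-s` and `-d` paths typically do in this kind of sample (`alex.cpp` is the authoritative version and may differ in details such as batch size handling):

```
// Serialize a built engine to a plan file, and deserialize it back.
#include <fstream>
#include <iterator>
#include <string>
#include "NvInfer.h"

// The -s path: write the serialized engine to 'alexnet.engine'.
void serializeEngine(nvinfer1::ICudaEngine* engine) {
    nvinfer1::IHostMemory* stream = engine->serialize();
    std::ofstream plan("alexnet.engine", std::ios::binary);
    plan.write(static_cast<const char*>(stream->data()), stream->size());
    stream->destroy();
}

// The -d path: read the plan file and rebuild the engine.
nvinfer1::ICudaEngine* deserializeEngine(nvinfer1::IRuntime* runtime,
                                         const std::string& file) {
    std::ifstream plan(file, std::ios::binary);
    std::string blob((std::istreambuf_iterator<char>(plan)),
                     std::istreambuf_iterator<char>());
    return runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
}
```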