Currently tkDNN supports only ShelfNet as a semantic segmentation network.
To run the semantic segmentation demo, follow these steps (example with ShelfNet):

    rm shelfnet_fp32.rt        # be sure to delete (or move) old TensorRT files
    export TKDNN_BATCHSIZE=4   # use a batch size greater than 1 to run inference on images larger than 1024x1024
    ./test_shelfnet            # run the ShelfNet test, which generates the rt file (it is slow)
    ./seg_demo shelfnet_fp32.rt ../demo/yolo_test.mp4 1 19

In general the demo program takes the following parameters:

    ./seg_demo <network-rt-file> <path-to-video> <n-batches> <number-of-classes> <resize-flag> <baseline-resize> <show-flag> <write-pred>

where

- `<network-rt-file>` is the rt file generated by a test.
- `<path-to-video>` is the path to a video file or a camera input.
- `<n-batches>` is the number of batches to use in inference (N.B. you should first export TKDNN_BATCHSIZE set to the required n-batches and generate the rt file for the network again).
- `<number-of-classes>` is the number of classes the network is trained on.
- `<resize-flag>`: if set to 0, the demo will not resize the input frames and will use them as they are; otherwise it will resize them.
- `<baseline-resize>`: if `<resize-flag>` is set to 1, the input frames will be proportionally resized using `<baseline-resize>` as the width baseline.
- `<show-flag>`: if set to 0, the demo will not show the visualization but will save the video to result.mp4 (if n-batches == 1).
- `<write-pred>`: if set to 0 (default), the demo will run; otherwise, the evaluation of a dataset will run and the segmentation output will be saved. Attention: this is under development and paths are hard-coded, so change them in the code in advance.

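
To illustrate what a proportional resize with a width baseline means, here is a minimal Python sketch. It is not taken from the tkDNN sources: the function name `resized_shape` and the rounding rule are assumptions made for illustration only, and the demo may round or clamp differently.

```python
def resized_shape(width: int, height: int, baseline: int) -> tuple:
    """Return (new_width, new_height) after scaling so that the width
    equals `baseline` while keeping the aspect ratio.

    Hypothetical helper: rounding to the nearest integer is an assumption.
    """
    if width <= 0 or height <= 0 or baseline <= 0:
        raise ValueError("all dimensions must be positive")
    scale = baseline / width
    return baseline, max(1, round(height * scale))


if __name__ == "__main__":
    # A 1920x1080 frame with a width baseline of 1024
    print(resized_shape(1920, 1080, 1024))
```

Under these assumptions, only the width is fixed by `<baseline-resize>`; the height follows from the aspect ratio of the input frame.
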
NB) FP32 inference is used by default.

NB) Batching is not used to process multiple streams, but to process multiple tiles of the same image. ShelfNet never resizes the input image; therefore, images larger than 1024x1024 are split into 1024x1024 tiles that are fed to the network as a batch.
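
The relation between image size and batch size can be sketched as follows. This is only an illustration in Python, not tkDNN code: the helper names are hypothetical, and it assumes a plain non-overlapping grid of tiles. How tkDNN handles images whose sides are not multiples of 1024 (padding, overlap, cropping) is not stated here.

```python
import math

TILE = 1024  # tile side mentioned in the note above


def tiles_needed(width: int, height: int, tile: int = TILE) -> int:
    """Number of tile x tile crops needed to cover a width x height image,
    assuming a non-overlapping grid (an assumption of this sketch)."""
    return math.ceil(width / tile) * math.ceil(height / tile)


def tile_origins(width: int, height: int, tile: int = TILE) -> list:
    """Top-left (x, y) corner of each tile, listed row by row."""
    return [(x, y)
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


if __name__ == "__main__":
    print(tiles_needed(1024, 1024))  # 1
    print(tiles_needed(2048, 2048))  # 4
    print(tile_origins(2048, 2048))
```

With this grid, a 2048x2048 image needs 4 tiles, which matches the B=4 configuration used for the 2048x2048 rows in the benchmark table.
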
For other demo videos refer to this playlist.
NB) The gif and the videos were obtained with Mapillary Vistas weights, which we cannot share publicly due to license restrictions. However, you can train ShelfNet using Mapillary and this fork of the original repo.
Inference FPS of ShelfNet with tkDNN, averaged over 1200 images, on:
- RTX 2080Ti (CUDA 10.2, TensorRT 7.0.0, cuDNN 7.6.5);
- Xavier AGX, Jetpack 4.3 (CUDA 10.0, cuDNN 7.6.3, TensorRT 6.0.1).
Platform | Test | Phase | FP32, ms | FP32, FPS | FP16, ms | FP16, FPS | INT8, ms | INT8, FPS |
---|---|---|---|---|---|---|---|---|
RTX 2080Ti | shelfnet 1024x1024 (B=1) | pre | 6.11863 | 163.435 | 5.81465 | 171.979 | 5.88699 | 169.866 |
RTX 2080Ti | shelfnet 1024x1024 (B=1) | inf | 11.5464 | 86.6074 | 7.35396 | 135.981 | 6.37623 | 156.832 |
RTX 2080Ti | shelfnet 1024x1024 (B=1) | post | 4.09058 | 244.464 | 3.91961 | 255.128 | 4.07343 | 245.493 |
RTX 2080Ti | shelfnet 1024x1024 (B=1) | tot | 21.7556 | 45.9652 | 17.0882 | 58.5199 | 16.3366 | 61.2121 |
RTX 2080Ti | shelfnet 2048x2048 (B=4) | pre | 25.435 | 39.3158 | 25.2953 | 39.5331 | 25.9303 | 38.565 |
RTX 2080Ti | shelfnet 2048x2048 (B=4) | inf | 36.5015 | 27.3961 | 17.0534 | 58.6395 | 15.6061 | 64.0773 |
RTX 2080Ti | shelfnet 2048x2048 (B=4) | post | 17.3917 | 57.4985 | 17.1649 | 58.2583 | 17.5539 | 56.9675 |
RTX 2080Ti | shelfnet 2048x2048 (B=4) | tot | 79.3283 | 12.6058 | 59.5136 | 16.8029 | 59.0903 | 16.9233 |
AGX Xavier | shelfnet 1024x1024 (B=1) | pre | 8.0174 | 124.729 | 7.5117 | 133.126 | 7.47333 | 133.809 |
AGX Xavier | shelfnet 1024x1024 (B=1) | inf | 72.4173 | 13.8089 | 37.505 | 26.6631 | 31.3286 | 31.9197 |
AGX Xavier | shelfnet 1024x1024 (B=1) | post | 8.89958 | 112.365 | 8.83576 | 113.176 | 9.42655 | 106.083 |
AGX Xavier | shelfnet 1024x1024 (B=1) | tot | 89.3342 | 11.1939 | 53.8525 | 18.5692 | 48.2285 | 20.7346 |
AGX Xavier | shelfnet 2048x2048 (B=4) | pre | 47.1454 | 21.211 | 21.6475 | 46.1947 | 21.4201 | 46.6851 |
AGX Xavier | shelfnet 2048x2048 (B=4) | inf | 266.537 | 3.75183 | 128.321 | 7.79293 | 107.621 | 9.29185 |
AGX Xavier | shelfnet 2048x2048 (B=4) | post | 44.0711 | 22.6906 | 40.1732 | 24.8922 | 39.873 | 25.0796 |
AGX Xavier | shelfnet 2048x2048 (B=4) | tot | 357.753 | 2.79522 | 190.142 | 5.25922 | 168.914 | 5.92016 |
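
The columns of the table are related by simple arithmetic, which the short Python check below reproduces for one configuration (RTX 2080Ti, 1024x1024, B=1, FP32). Two things are read off the numbers rather than documented: each FPS value appears to be 1000 divided by the latency in milliseconds, and the `tot` row appears to be the sum of `pre`, `inf` and `post`.

```python
def fps_from_ms(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# FP32 latencies (ms) copied from the RTX 2080Ti, 1024x1024 (B=1) rows
pre_ms, inf_ms, post_ms = 6.11863, 11.5464, 4.09058
tot_ms = pre_ms + inf_ms + post_ms

if __name__ == "__main__":
    print(f"pre : {fps_from_ms(pre_ms):.3f} FPS")
    print(f"tot : {tot_ms:.4f} ms, {fps_from_ms(tot_ms):.4f} FPS")
```

The end-to-end frame rate is therefore bounded by the sum of the three phases, not by the inference time alone.
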
When creating the rt file, all the checks return errors. This is due to a different resize function and a different handling of the original ShelfNet outputs. However, the network is supposed to work.