This repository compares models for real-time object detection and their performance. In the future I plan to modify the code so that all networks can run through the cv2.dnn module.
- Notebook i5-8265U + 12GB RAM + NVIDIA MX230
- PC Core i5-8400 + 16GB RAM + NVIDIA GTX1060 6GB
- NVIDIA Jetson TX2
- download the modified EfficientDet repo
git clone git@github.com:bartoszptak/EfficientDet.git
cd EfficientDet/
mv EfficientDet/* .
rm -r EfficientDet/ inference.py
- download models
python download_models.py
- (optional) download and prepare the dataset for the benchmark
mkdir data && cd data
wget "http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar"
tar -vxf VOCtest_06-Nov-2007.tar
cd VOCdevkit/VOC2007/JPEGImages/
# make dataset smaller
rm 00{2..9}*.jpg
- select only 4 classes (this script): bicycle, bus, car, person
- set fixed image size to 512x512
- all other training parameters left at their defaults
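
The class selection step can be sketched as follows. This is not the script referenced above, only a minimal illustration that assumes the standard Pascal VOC XML annotation layout (`<annotation>` containing `<object><name>...</name></object>` entries). The names `KEEP` and `filter_annotation` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The four classes kept for training (taken from the list above).
KEEP = {"bicycle", "bus", "car", "person"}


def filter_annotation(xml_text, keep=KEEP):
    """Drop every <object> whose <name> is not in `keep`.

    Returns the filtered XML as a string, or None when no kept
    object remains (such an image could be skipped entirely).
    """
    root = ET.fromstring(xml_text)
    # findall() returns a list, so removing while looping is safe.
    for obj in root.findall("object"):
        name = (obj.findtext("name") or "").strip()
        if name not in keep:
            root.remove(obj)
    if root.find("object") is None:
        return None
    return ET.tostring(root, encoding="unicode")
```

Returning `None` for images without any of the four classes is a design choice of this sketch. The actual script may handle such images differently.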
Model | Size | GFLOPS | mAP@0.5 | Train time |
---|---|---|---|---|
YOLOv3 | 512 | 99.42 | 0.8661 | 1-05:32:00 |
EfficientDet | 512 | 2.5 (from pdf) | 0.8870 | 23:12:18 |
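
To compare the train time and GFLOPS columns directly, the arithmetic can be sketched as below. It assumes that `1-05:32:00` means 1 day, 5 hours, 32 minutes (a `D-HH:MM:SS` layout, which the table does not state explicitly). `train_time_hours` is a hypothetical helper. The EfficientDet GFLOPS value is the one marked "from pdf" in the table, so the ratio mixes a measured and a quoted figure.

```python
def train_time_hours(text):
    """Convert '[D-]HH:MM:SS' to hours.

    Assumption: an optional leading 'D-' prefix counts whole days.
    """
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return days * 24 + hours + minutes / 60 + seconds / 3600


yolo_hours = train_time_hours("1-05:32:00")
effdet_hours = train_time_hours("23:12:18")

print(f"YOLOv3 train time:       {yolo_hours:.2f} h")
print(f"EfficientDet train time: {effdet_hours:.2f} h")
print(f"GFLOPS ratio (YOLOv3 / EfficientDet): {99.42 / 2.5:.1f}x")
```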
Device | Total FPS batch=1 | Total FPS batch=2 | Inference FPS batch=1 | Inference FPS batch=2 |
---|---|---|---|---|
NVIDIA MX230 | 5.67 | 6.49 | 12.63 | 14.13 |
NVIDIA GTX1060 | 15.07 | 17.77 | 34.43 | 42.05 |
NVIDIA Jetson TX2 | | | | |
Device | Total FPS batch=1 | Total FPS batch=2 | Inference FPS batch=1 | Inference FPS batch=2 |
---|---|---|---|---|
NVIDIA MX230 | 5.22 | 5.30 | 7.40 | 7.81 |
NVIDIA GTX1060 | 11.36 | 11.43 | 27.99 | 29.88 |
NVIDIA Jetson TX2 | | | | |
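
The tables report two kinds of FPS without defining them. One plausible reading, sketched below, is that "Total FPS" covers preprocessing, the forward pass, and postprocessing, while "Inference FPS" covers the forward pass alone. This is an assumption, not the benchmark code from this repository. `measure_fps` and its three callback parameters are hypothetical.

```python
import time


def measure_fps(batches, preprocess, infer, postprocess):
    """Return (total_fps, inference_fps) over an iterable of batches.

    Assumption: "total" spans preprocess + forward pass + postprocess,
    and "inference" spans only the forward pass.
    """
    frames = 0
    total_s = 0.0
    infer_s = 0.0
    for batch in batches:
        t0 = time.perf_counter()
        inputs = preprocess(batch)
        t1 = time.perf_counter()
        outputs = infer(inputs)
        t2 = time.perf_counter()
        postprocess(outputs)
        t3 = time.perf_counter()

        frames += len(batch)    # batch=1 or batch=2 in the tables
        total_s += t3 - t0      # whole pipeline
        infer_s += t2 - t1      # forward pass only
    return frames / total_s, frames / infer_s
```

When the forward pass runs asynchronously on a GPU, the framework usually has to be synchronized before each clock reading, otherwise the inference time is underestimated.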