This model aims to detect objects in real time. It detects 80 different classes from the COCO dataset. For information on the network architecture, see the author's page and white paper.
The model was converted to ONNX from a PyTorch version of YOLOv2 using PyTorch-Yolo2. The output has been verified by generating bounding boxes under both PyTorch and onnxruntime.
Model | Download | Download (with sample test data) | ONNX version | Opset version |
---|---|---|---|---|
YOLOv2 | 203.9 MB | 182.6 MB | 1.5 | 9 |
Input: shape (1x3x416x416)
Output: shape (1x425x13x13)
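As a rough illustration of the (1x3x416x416) input layout, the sketch below converts an HxWx3 image array into a batch of one channel-first tensor. The function name `to_model_input`, the nearest-neighbour resize, and the scaling of pixel values to [0, 1] are assumptions made for this example, not steps confirmed by this page; check the source repository for the exact preprocessing.

```python
import numpy as np

def to_model_input(image_hwc, size=416):
    """Convert an HxWx3 uint8 image into a 1x3xsizexsize float32 tensor."""
    h, w, _ = image_hwc.shape
    # Nearest-neighbour resize via index arrays (keeps the sketch NumPy-only).
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image_hwc[rows[:, None], cols[None, :], :]
    # Assumption: pixel values are scaled to [0, 1].
    chw = resized.astype(np.float32).transpose(2, 0, 1) / 255.0
    # Add the leading batch dimension.
    return chw[np.newaxis, ...]
```

The resulting array can then be passed as the single input of an inference session; a proper image library would normally replace the nearest-neighbour resize.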
The output is a (1x425x13x13) tensor, where 13x13 is the number of grid cells that the image is divided into. Each grid cell predicts 5 bounding boxes, one per anchor, and each box is described by 85 values: 4 box coordinates, 1 confidence score, and 80 class scores (5 x (80 classes + 5) = 425). For more information on how to derive the final bounding boxes and their corresponding confidence scores, refer to this post and the PyTorch source code.
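The decoding step can be sketched in NumPy as follows. This is a minimal sketch of the usual YOLOv2 box parameterization from the paper, not the code from the linked source. It assumes the 425 channels are laid out anchor by anchor as [tx, ty, tw, th, objectness, 80 class scores], and that `anchors` holds the 5 anchor widths and heights in grid-cell units taken from the .cfg file; the function name `decode_yolov2` is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_yolov2(output, anchors, num_classes=80):
    """Decode a (1, A*(5+C), H, W) tensor into boxes, objectness, class probs.

    anchors: array-like of shape (A, 2), widths and heights in grid-cell units.
    Boxes are returned as (cx, cy, w, h), normalized to the range [0, 1].
    """
    anchors = np.asarray(anchors, dtype=np.float32)
    num_anchors = anchors.shape[0]
    _, channels, grid_h, grid_w = output.shape
    assert channels == num_anchors * (5 + num_classes)

    # Assumed layout: anchor-major, then [tx, ty, tw, th, obj, classes...].
    pred = output[0].reshape(num_anchors, 5 + num_classes, grid_h, grid_w)

    # Column and row index of every grid cell, broadcast over anchors.
    col = np.arange(grid_w, dtype=np.float32).reshape(1, 1, grid_w)
    row = np.arange(grid_h, dtype=np.float32).reshape(1, grid_h, 1)

    # Box centre: sigmoid offset inside the cell plus the cell index.
    cx = (sigmoid(pred[:, 0]) + col) / grid_w
    cy = (sigmoid(pred[:, 1]) + row) / grid_h
    # Box size: anchor size scaled by exp of the raw prediction.
    bw = np.exp(pred[:, 2]) * anchors[:, 0].reshape(-1, 1, 1) / grid_w
    bh = np.exp(pred[:, 3]) * anchors[:, 1].reshape(-1, 1, 1) / grid_h

    objectness = sigmoid(pred[:, 4])
    # Numerically stable softmax over the class axis.
    logits = pred[:, 5:] - pred[:, 5:].max(axis=1, keepdims=True)
    exp_logits = np.exp(logits)
    class_probs = exp_logits / exp_logits.sum(axis=1, keepdims=True)

    boxes = np.stack([cx, cy, bw, bh], axis=-1)  # shape (A, H, W, 4)
    return boxes, objectness, class_probs
```

A per-class confidence score is then typically computed as `objectness[:, None] * class_probs`, followed by score thresholding and non-maximum suppression to obtain the final detections.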
The YOLOv2 model was trained on the COCO dataset and was sourced from the original yolov2-voc .cfg and .weights files from link.
"YOLO9000: Better, Faster, Stronger" arXiv:1612.08242
MIT License