Description
Code
dataset.py calculates width_cell and height_cell, which are then set in the label_matrix tensor.
"""
...
Then to find the width relative to the cell is simply:
width_pixels/cell_pixels, simplification leads to the
formulas below.
"""
width_cell, height_cell = (
    width * self.S,
    height * self.S,
)
Question
Please help me understand why the unit of width_cell and height_cell is cells, that is, relative to S instead of the image size.
In my understanding, width and height come from the YOLO Darknet annotation, where they are relative to the image size and take values between 0 and 1. Suppose width=0.7; then, with S=7, width_cell will be 4.9 cells.
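The arithmetic above can be checked with a short sketch. It assumes S = 7 (the grid size used in the YOLO v1 paper, which is what makes 0.7 come out near 4.9). The function name to_cell_units is hypothetical and not part of dataset.py; it only mirrors the width * self.S expression quoted earlier.

```python
import math


def to_cell_units(width, height, S=7):
    """Scale image-relative width/height (values in 0..1) to grid-cell units.

    Hypothetical helper: mirrors `width * self.S` from the snippet above.
    S=7 is an assumption taken from the YOLO v1 paper's grid size.
    """
    return width * S, height * S


# A box that spans 70% of the image width and 50% of its height.
width_cell, height_cell = to_cell_units(0.7, 0.5)

# In floating point the width is only approximately 4.9.
print(width_cell, height_cell)
print(math.isclose(width_cell, 4.9))
```

A value greater than 1 here means the box spans more than one grid cell, which is why the result reads as "cells" rather than as a fraction of the image.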
If width_cell and height_cell are used as the ground truth for YOLO v1 training, I would expect them to be relative to the image size, as in the YOLO v1 paper:
Each bounding box consists of 5 predictions: x, y, w, h,
and confidence. The (x, y) coordinates represent the center
of the box relative to the bounds of the grid cell. The width
and height are predicted relative to the whole image.
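To make the comparison concrete, here is a minimal sketch of how the two conventions relate, again assuming S = 7 and using hypothetical helper names that do not appear in dataset.py. It only shows that the cell-unit value and the image-relative value described in the paper differ by the constant factor S; it does not say which one a given loss implementation expects.

```python
import math


def image_to_cell(value, S=7):
    """Image-relative size (0..1) -> size in grid-cell units (hypothetical helper)."""
    return value * S


def cell_to_image(value_cell, S=7):
    """Size in grid-cell units -> image-relative size (hypothetical helper)."""
    return value_cell / S


width = 0.7                                # image-relative, as in the annotation
width_cell = image_to_cell(width)          # what the quoted dataset.py code stores
width_recovered = cell_to_image(width_cell)

# Dividing by S recovers the paper's image-relative value (up to float rounding).
print(math.isclose(width_recovered, width))
```

Because the mapping is a fixed rescaling, either convention carries the same information as long as the prediction and the target use the same one.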