More of a question than an issue, really. If I understand correctly, the network predicts offsets for each anchor box (prior), and those offsets in turn describe a bounding box. This requires a lot of conversions (center-size to boundary coordinates, encoding, decoding), so would it not be possible to simply train the network to output [xmin, ymin, xmax, ymax] directly instead of [offset-cx, offset-cy, offset-w, offset-h]? If not, what are the issues with this?
In the same vein, is the encoding and decoding of the bounding boxes only necessary because we need to convert between offsets and the bounding boxes they describe?
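For context, here is a minimal sketch of what those conversions typically look like in an SSD-style detector. The function names are illustrative (not necessarily this repo's exact API), and the scaling factors 10 and 5 are the common "variance" convention; the key point is that the regression targets are expressed *relative to each prior's center and size*, so they stay roughly the same scale no matter how large the object is:

```python
import torch

def cxcy_to_xy(cxcy):
    # Center-size form (cx, cy, w, h) -> boundary form (xmin, ymin, xmax, ymax).
    return torch.cat([cxcy[:, :2] - cxcy[:, 2:] / 2,    # xmin, ymin
                      cxcy[:, :2] + cxcy[:, 2:] / 2], 1)  # xmax, ymax

def encode(cxcy, priors_cxcy):
    # Ground-truth boxes (center-size form) -> regression targets w.r.t. priors.
    # Center offsets are normalized by the prior's size, and width/height
    # become log-ratios, which keeps targets scale-invariant.
    return torch.cat([(cxcy[:, :2] - priors_cxcy[:, :2]) / (priors_cxcy[:, 2:] / 10),
                      torch.log(cxcy[:, 2:] / priors_cxcy[:, 2:]) * 5], 1)

def decode(gcxgcy, priors_cxcy):
    # Inverse of encode(): predicted offsets -> actual boxes in center-size form.
    return torch.cat([gcxgcy[:, :2] * priors_cxcy[:, 2:] / 10 + priors_cxcy[:, :2],
                      torch.exp(gcxgcy[:, 2:] / 5) * priors_cxcy[:, 2:]], 1)
```

If the network instead regressed raw [xmin, ymin, xmax, ymax], the target magnitudes would vary wildly with object size and position, and each prior would lose its role as a local reference frame; the encode/decode steps exist precisely to translate between the offset space the network learns in and the coordinate space we evaluate in.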