I believe the first sentence of this passage is not worded correctly:
> Note that because we have a one-hot-encoded dependent variable, we can't directly use `nll_loss` or `softmax` (and therefore we can't use `cross_entropy`):
>
> - `softmax`, as we saw, requires that all predictions sum to 1, and tends to push one activation to be much larger than the others (due to the use of `exp`); however, we may well have multiple objects that we're confident appear in an image, so restricting the maximum sum of activations to 1 is not a good idea. By the same reasoning, we may want the sum to be less than 1, if we don't think any of the categories appear in an image.
> - `nll_loss`, as we saw, returns the value of just one activation: the single activation corresponding with the single label for an item. This doesn't make sense when we have multiple labels.
One-hot encoding does not disallow the use of softmax. The reason should be that the objective is multi-label, right? As the bullet points explain.
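For context, here is a minimal PyTorch sketch contrasting the two behaviors the bullet points describe (the logits and class count are made up for illustration): `softmax` activations always sum to 1 and `nll_loss` picks out a single activation, while the sigmoid-plus-binary-cross-entropy setup used for multi-label targets scores each class independently.

```python
import torch
import torch.nn.functional as F

# Illustrative raw model outputs (logits) for one item across 4 classes.
logits = torch.tensor([2.0, -1.0, 3.0, 0.5])

# softmax makes the classes compete: the activations always sum to 1,
# so it cannot express "two classes present" or "no class present".
probs = F.softmax(logits, dim=0)
print(probs.sum())  # tensor(1.) by construction

# nll_loss likewise assumes a single integer label per item, and simply
# returns (the negative of) the one activation at that index.
single_label = torch.tensor([2])
print(F.nll_loss(torch.log(probs).unsqueeze(0), single_label))

# For a multi-label objective, each class instead gets an independent
# sigmoid, scored with binary cross-entropy (here via the logits form).
multi_hot_target = torch.tensor([1.0, 0.0, 1.0, 0.0])  # two classes present
print(F.binary_cross_entropy_with_logits(logits, multi_hot_target))
```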
I think this is because of the earlier misuse of the term "one-hot". I believe that (in this chapter) the term "one-hot" should be "multi-hot". If I am wrong about this, I would love to learn why. Maybe my background as a Software Dev & Electrical Engineer is skewing my jargon (and making me pedantic! :) ). Please let me know if that is the case.
In the image below, I believe the term "one-hot" (highlighted) should be "multi-hot".
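To make the terminology concrete, here is a small sketch of the distinction (the class list is made up for illustration): a one-hot vector has exactly one position set to 1, while a multi-hot vector can have any number of positions set to 1, which is what a multi-label target needs.

```python
import torch
import torch.nn.functional as F

classes = ["cat", "dog", "person", "car"]  # illustrative labels

# One-hot: exactly one position is 1 -- a single-label target ("dog").
one_hot = F.one_hot(torch.tensor(1), num_classes=len(classes)).float()
print(one_hot)    # tensor([0., 1., 0., 0.])

# Multi-hot: any subset of positions can be 1 -- a multi-label target,
# e.g. an image that contains both a dog and a person.
multi_hot = torch.zeros(len(classes))
multi_hot[torch.tensor([1, 2])] = 1.0
print(multi_hot)  # tensor([0., 1., 1., 0.])
```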