scale invariance for the extra losses #45

Open
pmelchior opened this issue Aug 31, 2023 · 0 comments

The similarity and consistency losses (as written in Liang+2023) assume that the latents typically have amplitudes of order 1. This is not guaranteed by the fidelity training; if it does not hold, the extended training procedure breaks down because the sigmoids are pushed into their flat regime.

This can be fixed by adding rescaling terms that are computed from the typical latent space amplitude:
[Screenshot of the rescaled loss terms, 2023-08-31 17:33:26]
The first RHS terms should have a prefactor $1/(\sigma_s^2 S)$ instead of $1/S$, in the same way as
[Screenshot of a loss equation, 2023-08-31 17:33:38]

This ensures that these terms are all of order 1 and thus remain in the active parts of the sigmoids.
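For illustration, a minimal sketch of such a rescaled term (the function name and exact functional form below are assumptions, not the losses as implemented in spender): the $1/(\sigma_s^2 S)$ prefactor keeps the sigmoid argument of order 1.

```python
# Hypothetical sketch of the prefactor, not the loss definition from Liang+2023:
# s, s_aug are (N, S) latent codes; sigma_s is the typical per-component amplitude.
import torch

def rescaled_consistency_term(s, s_aug, sigma_s):
    S = s.shape[-1]
    # squared latent distance with prefactor 1/(sigma_s^2 * S) instead of 1/S,
    # so the sigmoid argument is ~O(1) and stays in the active regime
    d2 = ((s - s_aug) ** 2).sum(dim=-1) / (sigma_s**2 * S)
    return torch.sigmoid(d2).mean()
```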

In Liang+2023, we set $\sigma_s=0.1$ to set a target size for the consistency loss. It's better to make both of these rescaling terms dynamic, i.e. measure the typical value of $\lVert s\rVert$ across the data set and update it during training to account for any shrinking or expansion of the latent distribution.

This also has the advantage of

  • preventing latent-space collapse driven by the consistency term, because an overall shrinkage no longer improves $L_c$
  • making it easier for the autoencoder to achieve redshift invariance, because the consistency term no longer rewards latent shrinking.
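A minimal sketch of the dynamic measurement (a hypothetical helper, not part of the spender API): track $\sigma_s \approx \lVert s\rVert/\sqrt{S}$ with an exponential moving average over training batches, so the rescaling follows the latent distribution as it shrinks or expands.

```python
# Hypothetical sketch: running estimate of the typical latent amplitude during training.
import torch

class LatentScaleTracker:
    def __init__(self, momentum=0.99, init=1.0):
        self.momentum = momentum
        self.sigma_s = init

    @torch.no_grad()
    def update(self, s):
        # per-spectrum RMS latent amplitude ||s|| / sqrt(S), averaged over the batch
        rms = s.pow(2).mean(dim=-1).sqrt().mean().item()
        self.sigma_s = self.momentum * self.sigma_s + (1.0 - self.momentum) * rms
        return self.sigma_s
```

The current estimate would then replace the fixed $\sigma_s=0.1$ in the prefactors of both extra losses.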