I'm training a Vocos decoder for my DAC autoencoder. When I set hop length = 256 and n_fft = 1024 in the iSTFT head, the discriminators win within 1000 steps. However, this doesn't happen when I set n_fft = 512, 768, or 1026. Do you know why this happens, and would using n_fft = 1026 affect quality? I don't completely understand the COLA property.
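For reference, here is a minimal sketch of what the COLA (constant overlap-add) property measures for the settings above. It assumes a periodic Hann window with win_length equal to n_fft, which may not match the actual iSTFT head. `periodic_hann` and `overlap_add_ripple` are hypothetical helper names written for this illustration, not part of Vocos or PyTorch.

```python
import math


def periodic_hann(n):
    """Periodic Hann window of length n (assumed window shape, see note above)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def overlap_add_ripple(window, hop, power=1):
    """Relative ripple of the steady-state overlap-add sum of window**power.

    Copies of the window shifted by multiples of `hop` are summed. In steady
    state that sum is periodic with period `hop`, so it is enough to fold the
    window samples modulo `hop`. A ripple of 0 means the sum is constant,
    i.e. COLA holds for that window/hop pair.
    """
    acc = [0.0] * hop
    for i, w in enumerate(window):
        acc[i % hop] += w ** power
    return (max(acc) - min(acc)) / max(acc)


if __name__ == "__main__":
    hop = 256
    for n_fft in (512, 768, 1024, 1026):
        win = periodic_hann(n_fft)
        # power=1: plain window sum; power=2: squared-window sum, which is the
        # envelope an iSTFT divides by when the window is applied twice.
        print(n_fft, overlap_add_ripple(win, hop, 1), overlap_add_ripple(win, hop, 2))
```

Under these assumptions the ripple depends on whether n_fft is an integer multiple of the hop length: for a periodic Hann window, the plain sum is constant when n_fft / hop is an integer of at least 2, and the squared sum is constant when that ratio is at least 3. The sketch only illustrates the COLA check itself and does not by itself explain the training behavior described above.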