Estimating normalization factor Z #23
Hi, thanks for your comment! Which specific result are you referring to? Or are you suggesting that an EMA of Z could potentially improve all of InsDis, MoCo, and CMC with the NCE loss?
You reported a low number for MoCo with the NCE loss. This is because your implementation of NCE is problematic, and correcting it should give a more reasonable MoCo w/ NCE number.
@KaimingHe, yeah, the current NCE implementation is probably less suitable for MoCo, and I am happy to rectify it. What momentum multiplier for updating Z would you suggest?
0.99 for updating Z works well. On ImageNet-1K, MoCo with NCE is ~2% worse than MoCo with InfoNCE, similar to the case of the memory bank counterpart.
Thanks for your input! I have temporarily removed the NCE numbers from the README to avoid any confusion, and will leave them out until I get a chance to look into it.
Is it necessary to fix or EMA-update Z here?
Later I found the statement in InsDis: "Empirically, we find the approximation derived from initial batches sufficient to work well in practice." |
CMC/NCE/NCEAverage.py
Line 189 in 0f72b18
This one-time estimation is problematic, especially if the dictionary is not random noise. Computing Z as a moving average of this would give a more reasonable result.
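To make the suggested fix concrete, here is a minimal sketch of replacing the one-time estimate of Z with an exponential moving average, using the momentum of 0.99 suggested above. The helper name `ema_update_z` and its argument names are hypothetical, not from the CMC codebase; the batch-based estimate `mean(scores) * n_data` mirrors the approximation used in InsDis-style NCE implementations.

```python
def ema_update_z(z, batch_scores, n_data, momentum=0.99):
    """EMA update of the NCE normalization constant Z (hypothetical helper).

    z            -- current running estimate of Z, or None if uninitialized
    batch_scores -- flat list of unnormalized scores exp(v.u / T) from one batch
    n_data       -- dataset size; Z approximates n_data * E[exp(v.u / T)]
    """
    # Batch-based estimate of Z, as in the one-time InsDis approximation.
    z_batch = sum(batch_scores) / len(batch_scores) * n_data
    if z is None:
        return z_batch  # first batch: initialize Z from this estimate
    # Moving average: keep most of the old estimate, blend in the new one.
    return momentum * z + (1.0 - momentum) * z_batch
```

In training, `z = ema_update_z(z, scores, n_data)` would be called once per batch starting from `z = None`, so Z tracks the dictionary as it departs from random noise instead of being frozen after the first batch.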