Hi,
Thank you for the amazing work on this repo. I have identified what might be an issue with the implementation of Lightly's Gaussian Blur that is negatively impacting the performance of most of the Lightly models. The source of the problem appears to be confusion around the interpretation of the "radius" parameter in PIL's Gaussian Blur filter. Simply removing the blur augmentation significantly improves performance on the few Lightly models I have tested. With a quick fix of this blur augmentation, all the models I have tested seem to perform even better (cf. table below), except for SwAV, for which I noticed the blur augmentation has been disabled. This could also explain why SwAV outperforms some models by far in the current Lightly Imagenette benchmark.
1. What should be done according to SimCLR
According to its documentation, the ImageCollateFunction() class from Lightly is designed to replicate the augmentation techniques outlined in the original SimCLR paper. Concerning the Gaussian Blur augmentation parameters, the paper states in Appendix A:
"We blur the image 50% of the time using a Gaussian kernel. We randomly sample σ ∈ [0.1, 2.0], and the kernel size is set to be 10% of the image height/width." Indeed, a proper Gaussian Blur filter requires two parameters to be defined: the size of the convolution kernel (or, equivalently, the radius, which is half the kernel size) and the standard deviation (σ) of the Gaussian kernel.
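To make the role of the two parameters concrete, here is a minimal, dependency-free sketch of a normalized 2-D Gaussian kernel built from both a kernel size and a σ (the function name and values are illustrative, not Lightly's API):

```python
import math

def gaussian_kernel(size: int, sigma: float) -> list:
    """Build a normalized 2-D Gaussian kernel of a given size and sigma."""
    center = (size - 1) / 2.0
    kernel = [[math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2.0 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in kernel)
    # Normalize so the kernel sums to 1 (blurring preserves brightness).
    return [[v / total for v in row] for row in kernel]

# Per SimCLR: kernel size is ~10% of the image height/width (e.g. ~23 px for
# a 224x224 image), while sigma is sampled from [0.1, 2.0] per call.
k = gaussian_kernel(size=23, sigma=1.0)
```

Both parameters matter: the kernel size bounds the spatial extent of the blur, while σ controls its strength within that extent.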
2. How it is currently done in Lightly vs. elsewhere
The PIL Gaussian blur filter used in Lightly only accepts one parameter, labeled "radius". This leads to potential confusion, as it is unclear how this parameter relates to the kernel size and σ. The question has already been raised several times, including on Torchvision's GitHub and on Stack Overflow, without very clear answers, since PIL approximates the Gaussian kernel filter with a box linear filter.
The official documentation of PIL's Gaussian Blur, as well as the second thread above, suggests that, non-intuitively, σ should be passed as the radius parameter of PIL's GaussianBlur function, in contrast to what is currently done in Lightly, where the radius is passed as the radius. And in fact, the official implementations of MoCo, SwAV, SimSiam, and VicReg all use PIL's Gaussian blur and indeed pass σ as the radius parameter (and do not use any radius or kernel size parameter).
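A minimal sketch of this reference-style blur (class name and defaults are illustrative; the pattern of sampling σ and handing it to PIL's `radius` is what the MoCo/SwAV/SimSiam/VicReg code does):

```python
import random
from PIL import Image, ImageFilter

class RandomGaussianBlur:
    """Blur as in the MoCo/SwAV/SimSiam/VicReg reference code:
    sample sigma uniformly, then pass it directly as PIL's `radius`."""

    def __init__(self, prob: float = 0.5, sigma=(0.1, 2.0)):
        self.prob = prob
        self.sigma = sigma

    def __call__(self, img: Image.Image) -> Image.Image:
        if random.random() > self.prob:
            return img
        sigma = random.uniform(*self.sigma)
        # Non-intuitively, PIL's "radius" plays the role of sigma here.
        return img.filter(ImageFilter.GaussianBlur(radius=sigma))

blur = RandomGaussianBlur(prob=1.0)
out = blur(Image.new("RGB", (32, 32), "white"))
```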
This results in blurring images with a strength one to two orders of magnitude higher than what is described in SimCLR, making images barely recognizable to the naked eye (with probability 0.5, which allows the model to still learn).
Moreover, Lightly currently uses a scale argument in its Gaussian Blur function, as well as normal sampling and min/max clipping, the origins of which are unclear.

3. Solution and results

Using the Lightly Imagenette benchmark file as is, I ran three experiments with a batch size of 256 on a few models (with only one seed):
a. Baseline: I tried to reproduce the results reported in the Lightly Imagenette file to make sure my own baseline is correct. Results are close (except for SwAV at 200 epochs, which is odd, since at 800 epochs I got the same ~90% accuracy; I think there might be a mistake there).
b. Ablation without any blur: I simply removed the blur from the augmentations (set the probability of applying blur to 0 for all models).
c. Blur function fixed: I fixed the blur function to work exactly as in the official MoCo, SwAV, SimSiam, and VicReg implementations, passing σ to PIL's Gaussian Blur function. The code can be found in #1052
| Model | Epochs | KNN Acc % Baseline (from Lightly) | (a) KNN Acc % Baseline (reproduction) | (b) KNN Acc % without Blur | (c) KNN Acc % with Blur fixed |
|---|---|---|---|---|---|
| SimCLR | 200 | 77.1 | 77.4 | 83.1 | 84.1 |
| BYOL | 200 | 61.9 | 62.9 | 70.1 | 71.2 |
| MoCo | 200 | 72.7 | 72.9 | 78.1 | 78.7 |
| SimSiam | 200 | 66.9 | 67.7 | 65.1 | 75.8 |
| SwAV | 200 | 74.8 | 86.4 | 85.6 | 86.0* |
| SimCLR | 800 | 85.8 | 85.5 | NA | 89.2 |
*I forgot that the blur probability for SwAV is set to 0, as explained in the intro, so this result does not represent a fixed version of SwAV with blur, which might explain why there is no improvement over the baseline here. I will replace it with the proper result once my training is complete.
The observed gain of ~1 accuracy point between (b) and (c) is roughly on par with what is described in the SimCLR paper, albeit obtained here with a smaller architecture and a smaller dataset. ("We find it helpful, as it improves our ResNet-50 trained for 100 epochs from 63.2% to 64.5%.")
PS: It appears that SimCLR implemented its own custom version of Gaussian Blur filtering. Through my testing, I have found small differences between SimCLR's approach and PIL's, since PIL's does not take the kernel size into account. This means that most of the current official implementations of Gaussian Blur in SSL papers are not perfectly aligned with the one outlined in the SimCLR paper and official implementation, which probably has a very low impact on final accuracy, but it may still be a question of interest for how to implement it in Lightly. Moreover, Torchvision has since released an implementation of Gaussian blur (following a request from the SSL community) that uses both the kernel size and the standard deviation σ. From what I have tested, its results are extremely close to those of SimCLR's own Gaussian Blur implementation.
Hey @AurelienGauffre, thank you very much for the contribution. Thanks a lot for investigating this so thoroughly and providing a solution. We will look into your PR + run all the benchmarks again :)