Thanks a lot for your interesting work!
If I see it correctly, the script `white_box_test.py` uses the data loader defined in the function `white_box_test.get_val_loader` for the ViT models obtained from timm. It uses the default ImageNet values `mean=[0.485, 0.456, 0.406]`, `std=[0.229, 0.224, 0.225]`. However, the timm documentation suggests a different set of transforms for data preprocessing: https://rwightman.github.io/pytorch-image-models/models/vision-transformer/
The difference seems to be that timm rescales the images to a size of 248 and uses `mean=[0.5, 0.5, 0.5]`, `std=[0.5, 0.5, 0.5]` for normalization. When I use their input transform, I obtain a higher clean accuracy for the transformer models than the one reported in the paper; it looks similar to the numbers on the paperswithcode page: https://paperswithcode.com/lib/timm/vision-transformer
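For reference, here is a minimal sketch of the two pipelines as I understand them. The resize/crop values in the first block are my assumption about what `get_val_loader` roughly does (standard ImageNet preprocessing); the second block follows the resize-248 / 0.5-normalization recipe from the timm ViT docs:

```python
import torchvision.transforms as T

# Preprocessing the repo's loader appears to use (standard ImageNet values) --
# the exact resize/crop sizes here are an assumption on my part.
repo_transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Preprocessing timm documents for its ViT models: bicubic resize to 248,
# center-crop to 224, normalize with mean/std of 0.5.
timm_vit_transform = T.Compose([
    T.Resize(248, interpolation=T.InterpolationMode.BICUBIC),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
```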
Perhaps adding the transforms from the timm package to the code could help in evaluation.
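One way to do this (just a sketch, not tested against this repo; the model name below is only an example) would be to let timm build the transform directly from the model's pretrained config instead of hard-coding the mean/std:

```python
import timm
from timm.data import resolve_data_config, create_transform

# Resolve the preprocessing config that matches the pretrained weights.
model = timm.create_model('vit_base_patch16_224', pretrained=True)
config = resolve_data_config({}, model=model)  # e.g. mean/std of 0.5, crop_pct, interpolation
transform = create_transform(**config)         # builds the matching eval transform
print(config)
```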
Thanks!