Thanks a lot for your interesting work!
If I see it correctly, the script `white_box_test.py` uses the data loader defined in the function `white_box_test.get_val_loader` for the ViT models obtained from timm. It uses the default ImageNet values `mean=[0.485, 0.456, 0.406]`, `std=[0.229, 0.224, 0.225]`. However, the timm documentation suggests a different set of transforms for data preprocessing: https://rwightman.github.io/pytorch-image-models/models/vision-transformer/
The difference seems to be that timm rescales the images to a size of 248 and uses `mean=[0.5, 0.5, 0.5]`, `std=[0.5, 0.5, 0.5]` for normalization. When I use their input transform, I obtain a higher clean accuracy for the transformer models than the one reported in the paper; it looks similar to the numbers on the paperswithcode page: https://paperswithcode.com/lib/timm/vision-transformer
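For reference, here is a minimal sketch of the two pipelines as I understand them. The resize/crop values in the first block are my assumption about what `get_val_loader` roughly does (standard ImageNet preprocessing); the second block follows the resize-248 / 0.5-normalization recipe from the timm ViT docs:

```python
import torchvision.transforms as T

# Preprocessing the repo's loader appears to use (standard ImageNet values) --
# the exact resize/crop sizes here are an assumption on my part.
repo_transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Preprocessing timm documents for its ViT models: bicubic resize to 248,
# center-crop to 224, normalize with mean/std of 0.5.
timm_vit_transform = T.Compose([
    T.Resize(248, interpolation=T.InterpolationMode.BICUBIC),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
```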
Perhaps adding the transforms from the timm package to the code could help in evaluation.
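One way to do this (just a sketch, not tested against this repo; the model name below is only an example) would be to let timm build the transform directly from the model's pretrained config instead of hard-coding the mean/std:

```python
import timm
from timm.data import resolve_data_config, create_transform

# Resolve the preprocessing config that matches the pretrained weights.
model = timm.create_model('vit_base_patch16_224', pretrained=True)
config = resolve_data_config({}, model=model)  # e.g. mean/std of 0.5, crop_pct, interpolation
transform = create_transform(**config)         # builds the matching eval transform
print(config)
```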
Thanks!