Algorithm | Config | Top-1 (%) | Top-5 (%) | GPU Memory (MB) | Inference Time (ms/img) | Download |
---|---|---|---|---|---|---|
resnet50(raw) | resnet50(raw) | 76.454 | 93.084 | 2412 | 8.59 | model |
resnet50(tfrecord) | resnet50(tfrecord) | 76.266 | 92.972 | 2412 | 8.59 | model |
resnet101 | resnet101 | 78.152 | 93.922 | 2484 | 16.77 | model |
resnet152 | resnet152 | 78.544 | 94.206 | 2544 | 24.69 | model |
resnext50-32x4d | resnext50-32x4d | 77.604 | 93.856 | 4718 | 12.88 | model |
resnext101-32x4d | resnext101-32x4d | 78.568 | 94.344 | 4792 | 26.84 | model |
resnext101-32x8d | resnext101-32x8d | 79.468 | 94.434 | 9582 | 27.52 | model |
resnext152-32x4d | resnext152-32x4d | 78.994 | 94.462 | 4852 | 41.08 | model |
hrnetw18 | hrnetw18 | 76.258 | 92.976 | 4701 | 54.55 | model |
hrnetw30 | hrnetw30 | 77.66 | 93.862 | 4766 | 54.30 | model |
hrnetw32 | hrnetw32 | 77.994 | 93.976 | 4780 | 53.48 | model |
hrnetw40 | hrnetw40 | 78.142 | 93.956 | 4843 | 54.31 | model |
hrnetw44 | hrnetw44 | 79.266 | 94.476 | 4884 | 54.83 | model |
hrnetw48 | hrnetw48 | 79.636 | 94.802 | 4916 | 54.14 | model |
hrnetw64 | hrnetw64 | 79.884 | 95.04 | 5120 | 54.74 | model |
vit-base-patch16 | vit-base-patch16 | 76.082 | 92.026 | 346 | 8.03 | model |
swin-tiny-patch4-window7 | swin-tiny-patch4-window7 | 80.528 | 94.822 | 132 | 12.94 | model |
deitiii-small-patch16-224 | deitiii-small-patch16-224 | 81.408 | 95.388 | 90 | 7.41 / 4.90 (A100 80G) | model |
deitiii-base-patch16-192 | deitiii-base-patch16-192 | 82.982 | 95.95 | 337 | 7.49 / 5.04 (A100 80G) | model |
deitiii-large-patch16-192 | deitiii-large-patch16-192 | 83.902 | 96.296 | 1170 | 14.35 / 9.91 (A100 80G) | model |
deit_base_patch16_224 (Hydra Attention [8 layers]) | deit_base_patch16_224 (Hydra Attention [8 layers]) | 79.444 | 94.468 | 340 | 6.78 / 4.47 (A100 80G) | model |
deit_base_patch16_224 (Hydra Attention [12 layers]) | deit_base_patch16_224 (Hydra Attention [12 layers]) | 76.67 | 92.872 | 338 | 6.65 / 4.34 (A100 80G) | model |
(Note: These results are from models trained with EasyCV. The default inference input size is 224 and the default machine is a V100 16G; for the DeiT III and Hydra Attention rows, a second inference time measured on an A100 80G is also given. GPU Memory records the GPU peak memory.)
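For reference, the latency and peak-memory columns can be reproduced with a simple PyTorch timing loop. The sketch below is a minimal illustration under the stated settings (224x224 input, single image, GPU peak memory via `torch.cuda.max_memory_allocated`); it is not the exact EasyCV benchmark script, and the resnet50 model, warm-up count, and iteration count are assumptions.

```python
import time
import torch
import torchvision

# Illustrative benchmark sketch (not the exact EasyCV script):
# measure per-image latency and GPU peak memory at the default 224x224 input.
device = torch.device("cuda")
model = torchvision.models.resnet50(pretrained=True).eval().to(device)  # any model from the table
x = torch.randn(1, 3, 224, 224, device=device)

torch.cuda.reset_peak_memory_stats(device)
with torch.no_grad():
    for _ in range(10):          # warm-up iterations (assumed count)
        model(x)
    torch.cuda.synchronize(device)
    start = time.perf_counter()
    for _ in range(100):         # timed iterations (assumed count)
        model(x)
    torch.cuda.synchronize(device)
    elapsed = time.perf_counter() - start

print(f"inference time: {elapsed / 100 * 1000:.2f} ms/img")
print(f"gpu peak memory: {torch.cuda.max_memory_allocated(device) / 1024**2:.0f} MB")
```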
Algorithm | Config | Top-1 (%) | Top-5 (%) | GPU Memory (MB) | Inference Time (ms/img) | Download |
---|---|---|---|---|---|---|
vit_base_patch16_224 | vit_base_patch16_224 | 78.096 | 94.324 | 346 | 8.03 | model |
vit_large_patch16_224 | vit_large_patch16_224 | 84.404 | 97.276 | 1171 | 16.30 | model |
deit_base_patch16_224 | deit_base_patch16_224 | 81.756 | 95.6 | 346 | 7.98 | model |
deit_base_distilled_patch16_224 | deit_base_distilled_patch16_224 | 83.232 | 96.476 | 349 | 8.07 | model |
xcit_medium_24_p8_224 | xcit_medium_24_p8_224 | 83.348 | 96.21 | 884 | 31.77 | model |
xcit_medium_24_p8_224_dist | xcit_medium_24_p8_224_dist | 84.876 | 97.164 | 884 | 32.08 | model |
xcit_large_24_p8_224 | xcit_large_24_p8_224 | 83.986 | 96.47 | 1962 | 37.44 | model |
xcit_large_24_p8_224_dist | xcit_large_24_p8_224_dist | 85.022 | 97.29 | 1962 | 37.44 | model |
tnt_s_patch16_224 | tnt_s_patch16_224 | 76.934 | 93.388 | 100 | 18.92 | model |
convit_tiny | convit_tiny | 72.954 | 91.68 | 31 | 10.79 | model |
convit_small | convit_small | 81.342 | 95.784 | 122 | 11.23 | model |
convit_base | convit_base | 82.27 | 95.916 | 358 | 11.26 | model |
cait_xxs24_224 | cait_xxs24_224 | 78.45 | 94.154 | 50 | 22.62 | model |
cait_xxs36_224 | cait_xxs36_224 | 79.788 | 94.87 | 71 | 33.25 | model |
cait_s24_224 | cait_s24_224 | 83.302 | 96.568 | 190 | 23.74 | model |
levit_128 | levit_128 | 78.468 | 93.874 | 76 | 15.33 | model |
levit_192 | levit_192 | 79.72 | 94.664 | 128 | 15.17 | model |
levit_256 | levit_256 | 81.432 | 95.38 | 222 | 15.27 | model |
convnext_tiny | convnext_tiny | 81.878 | 95.836 | 128 | 7.17 | model |
convnext_small | convnext_small | 82.836 | 96.458 | 213 | 12.89 | model |
convnext_base | convnext_base | 83.73 | 96.692 | 364 | 13.04 | model |
convnext_large | convnext_large | 84.164 | 96.844 | 781 | 13.78 | model |
resmlp_12_distilled_224 | resmlp_12_distilled_224 | 77.876 | 93.532 | 66 | 4.90 | model |
resmlp_24_distilled_224 | resmlp_24_distilled_224 | 80.548 | 95.204 | 124 | 9.07 | model |
resmlp_36_distilled_224 | resmlp_36_distilled_224 | 80.944 | 95.416 | 181 | 13.56 | model |
resmlp_big_24_distilled_224 | resmlp_big_24_distilled_224 | 83.45 | 96.65 | 534 | 20.48 | model |
coat_tiny | coat_tiny | 78.112 | 93.972 | 127 | 33.09 | model |
coat_mini | coat_mini | 80.912 | 95.378 | 247 | 33.29 | model |
convmixer_768_32 | convmixer_768_32 | 80.08 | 94.992 | 4995 | 10.23 | model |
convmixer_1024_20_ks9_p14 | convmixer_1024_20_ks9_p14 | 81.742 | 95.578 | 2407 | 6.29 | model |
convmixer_1536_20 | convmixer_1536_20 | 81.432 | 95.38 | 547 | 14.66 | model |
gmixer_24_224 | gmixer_24_224 | 78.088 | 93.6 | 104 | 11.65 | model |
gmlp_s16_224 | gmlp_s16_224 | 77.204 | 93.358 | 81 | 11.15 | model |
mixer_b16_224 | mixer_b16_224 | 72.558 | 90.068 | 241 | 5.37 | model |
mixer_l16_224 | mixer_l16_224 | 68.34 | 86.11 | 804 | 11.74 | model |
jx_nest_tiny | jx_nest_tiny | 81.278 | 95.618 | 90 | 9.05 | model |
jx_nest_small | jx_nest_small | 83.144 | 96.3 | 174 | 16.92 | model |
jx_nest_base | jx_nest_base | 83.474 | 96.442 | 300 | 16.88 | model |
pit_s_distilled_224 | pit_s_distilled_224 | 83.144 | 96.3 | 109 | 7.00 | model |
pit_b_distilled_224 | pit_b_distilled_224 | 83.474 | 96.442 | 330 | 7.66 | model |
twins_svt_small | twins_svt_small | 81.598 | 95.55 | 657 | 14.07 | model |
twins_svt_base | twins_svt_base | 82.882 | 96.234 | 1447 | 18.99 | model |
twins_svt_large | twins_svt_large | 83.428 | 96.506 | 2567 | 19.11 | model |
swin_base_patch4_window7_224 | swin_base_patch4_window7_224 | 84.714 | 97.444 | 375 | 23.47 | model |
swin_large_patch4_window7_224 | swin_large_patch4_window7_224 | 85.826 | 97.816 | 788 | 23.29 | model |
dynamic_swin_small_p4_w7_224 | dynamic_swin_small_p4_w7_224 | 82.896 | 96.234 | 220 | 28.55 | model |
dynamic_swin_tiny_p4_w7_224 | dynamic_swin_tiny_p4_w7_224 | 80.912 | 95.41 | 136 | 14.58 | model |
shuffletrans_tiny_p4_w7_224 | shuffletrans_tiny_p4_w7_224 | 82.176 | 96.05 | 5311 | 13.90 | model |
efficientformer_l1 | efficientformer_l1 | 80.102 | 94.934 | 1820 | 7.5 | model |
efficientformer_l3 | efficientformer_l3 | 82.272 | 96.028 | 2436 | 13.07 | model |
efficientformer_l7 | efficientformer_l7 | 83.076 | 96.44 | 1622 | 18.96 | model |
EdgeVit_xxs_b512_224 | EdgeVit_xxs_b512_224 | 75.18 | 92.188 | 206 | 8.67 | model |
EdgeVit_xs_b256_224 | EdgeVit_xs_b256_224 | 77.624 | 93.47 | 551 | 8.04 | model |
EdgeVit_s_b128_224 | EdgeVit_s_b128_224 | 80.3 | 95.302 | 576 | 13.49 | model |
(Note: These inference results are obtained by importing the official pretrained models, which requires torch >= 1.9.0. The default inference input size is 224 and the default machine is a V100 16G; GPU Memory records the GPU peak memory.)
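Most of the model names in this second table match timm identifiers. As a rough illustration of what "importing the official model" can look like, the sketch below loads a pretrained checkpoint by name and runs a single 224x224 forward pass; this is a generic timm-based sketch, not necessarily the conversion path used to produce the numbers above, and `convnext_tiny` is just an example name.

```python
import timm
import torch

# Illustrative sketch: load an official pretrained checkpoint by its timm name
# (e.g. convnext_tiny from the table) and run one 224x224 forward pass.
model = timm.create_model("convnext_tiny", pretrained=True).eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000]) for ImageNet-1k classes
```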