The results show that the choice of architecture and format affects model size and error rates to varying degrees. For every architecture, the Onnx (Int8) format is substantially smaller than the PyTorch (fp32) and Onnx (fp32) formats, with reductions of roughly 73% to 78%. This comes at the cost of a higher Top1 Error in every case, and usually a higher Top5 Error as well. The increase ranges from negligible (vgg16, where Top1 Error goes from 28.41 to 28.43) to drastic (mnasnet0_5, where it goes from 32.26 to 99.90).
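The size reduction is close to what the storage widths alone would predict. As a rough sketch, assuming the file size is dominated by weights and that the Int8 format stores each weight in 8 bits instead of 32 (the function name is illustrative, not part of any library):

```python
def expected_size_reduction(bits_before: int, bits_after: int) -> float:
    """Percentage by which storage shrinks when each value uses fewer bits."""
    return (1 - bits_after / bits_before) * 100


print(expected_size_reduction(32, 8))  # 75.0
```

The reductions reported in the table sit close to this value. Small deviations are plausible because a model file also holds metadata and quantization parameters, and some tensors may remain in fp32.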
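The table supports informed decisions about the trade-off between model size and accuracy, and it should be read with the end goal of the deployment in mind. All differences are relative to the PyTorch (fp32) baseline of the same architecture: a positive Size Difference means a smaller model, and a negative error difference means a higher error.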
If the objective is to reduce model size for memory efficiency and a slight increase in error rates is acceptable, the Onnx (Int8) format is a suitable choice. This is especially relevant for deployment on resource-constrained systems, or for applications where the higher error rate does not materially affect overall performance.
However, if the objective is to maintain the highest possible accuracy, consider the PyTorch (fp32) or Onnx (fp32) formats. Their error rates are identical, and they have the lowest Top1 Error for every architecture with reported results.
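The two cases above amount to choosing the smallest format that stays within an error budget. A minimal sketch of that rule, assuming the budget is expressed in percentage points of Top1 Error (`pick_format` and its arguments are hypothetical names introduced here for illustration):

```python
def pick_format(candidates, max_top1_increase):
    """Return the smallest format whose Top1 Error rise stays within budget.

    candidates: list of (name, size_mb, top1_error) tuples; the first entry
    is treated as the fp32 baseline.
    max_top1_increase: allowed rise in Top1 Error, in percentage points (>= 0).
    """
    baseline_error = candidates[0][2]
    acceptable = [c for c in candidates
                  if c[2] - baseline_error <= max_top1_increase]
    # The baseline itself always qualifies when the budget is non-negative.
    return min(acceptable, key=lambda c: c[1])[0]


# Values copied from the table
resnet50 = [("PyTorch (fp32)", 100.05, 23.87),
            ("Onnx (fp32)", 99.75, 23.87),
            ("Onnx (Int8)", 25.09, 24.47)]
mnasnet0_5 = [("PyTorch (fp32)", 8.75, 32.26),
              ("Onnx (fp32)", 8.64, 32.26),
              ("Onnx (Int8)", 2.26, 99.90)]

print(pick_format(resnet50, 1.0))    # Onnx (Int8)
print(pick_format(mnasnet0_5, 1.0))  # Onnx (fp32)
```

With a budget of one percentage point, resnet50 can use the Int8 model, while mnasnet0_5 falls back to an fp32 format because its Int8 variant fails almost completely.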
In all cases, it is essential to thoroughly test and validate the chosen model and framework on relevant datasets and use cases, to ensure that the trade-offs are well-understood and acceptable for the specific application.
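One way to carry out such a validation is to recompute the error metrics on your own labelled data. A minimal pure-Python sketch, assuming one list of class scores (logits) per sample and integer class labels; `topk_error` is a hypothetical helper name, and a real evaluation would more likely use tensor operations:

```python
def topk_error(logits, labels, k):
    """Percentage of samples whose true label is not among the k largest logits."""
    misses = 0
    for scores, label in zip(logits, labels):
        # Indices of the k highest-scoring classes for this sample
        topk = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        if label not in topk:
            misses += 1
    return 100.0 * misses / len(labels)


# Toy data: 4 samples, 3 classes (made-up numbers, not taken from the table)
logits = [[2.0, 1.0, 0.1],
          [0.2, 0.3, 3.0],
          [1.5, 2.5, 0.5],
          [0.1, 0.2, 0.3]]
labels = [0, 1, 0, 0]

print(topk_error(logits, labels, 1))  # 75.0
print(topk_error(logits, labels, 2))  # 25.0
```

Running the same function on the outputs of two formats of one model gives a direct comparison on the data that matters for the application.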

| Architecture | Framework | Model | Logits | Size | Size Difference | Top1 Error (%) | Top1 Error Difference | Top5 Error (%) | Top5 Error Difference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| squeezenet1_0 | PyTorch (fp32) | Link | Link | 4.88MB | -- | 41.91 | -- | 19.58 | -- |
| | Onnx (fp32) | Link | Link | 4.89MB | -0.19% | 41.91 | 0.00% | 19.58 | 0.00% |
| | Onnx (Int8) | Link | Link | 1.27MB | 74.04% | 42.26 | -0.84% | 19.77 | -0.96% |
| squeezenet1_1 | PyTorch (fp32) | Link | Link | 4.83MB | -- | 41.82 | -- | 19.38 | -- |
| | Onnx (fp32) | Link | Link | 4.84MB | -0.19% | 41.82 | 0.00% | 19.38 | 0.00% |
| | Onnx (Int8) | Link | Link | 1.26MB | 74.03% | 41.88 | -0.13% | 19.47 | -0.51% |
| densenet121 | PyTorch (fp32) | Link | Link | 31.5MB | -- | 25.57 | -- | 8.03 | -- |
| | Onnx (fp32) | Link | Link | 31.49MB | 0.02% | 25.57 | 0.00% | 8.03 | 0.00% |
| | Onnx (Int8) | Link | Link | 8.49MB | 73.06% | 27.53 | -7.67% | 9.18 | -14.30% |
| densenet161 | PyTorch (fp32) | Link | Link | 112.9MB | -- | 22.86 | -- | 6.44 | -- |
| | Onnx (fp32) | Link | Link | 112.84MB | 0.05% | 22.86 | 0.00% | 6.44 | 0.00% |
| | Onnx (Int8) | Link | Link | 29.62MB | 73.76% | 23.08 | -0.96% | 6.49 | -0.78% |
| densenet169 | PyTorch (fp32) | Link | Link | 55.9MB | -- | 24.40 | -- | 7.19 | -- |
| | Onnx (fp32) | Link | Link | 55.89MB | 0.00% | 24.40 | 0.00% | 7.19 | 0.00% |
| | Onnx (Int8) | Link | Link | 15.08MB | 73.03% | 24.75 | -1.43% | 7.34 | -2.09% |
| densenet201 | PyTorch (fp32) | Link | Link | 79.08MB | -- | 23.10 | -- | 6.63 | -- |
| | Onnx (fp32) | Link | Link | 79.09MB | -0.01% | 23.10 | 0.00% | 6.63 | 0.00% |
| | Onnx (Int8) | Link | Link | 21.33MB | 73.03% | 25.25 | -9.31% | 7.80 | -17.62% |
| inception_v3 | PyTorch (fp32) | Link | Link | 106.25MB | -- | 30.46 | -- | 11.35 | -- |
| | Onnx (fp32) | Link | Link | 93.07MB | 12.40% | 30.46 | 0.00% | 11.35 | 0.00% |
| | Onnx (Int8) | Link | Link | 23.44MB | 77.95% | 31.47 | -3.32% | 12.03 | -5.99% |
| googlenet | PyTorch (fp32) | Link | Link | 25.94MB | -- | 30.22 | -- | 10.47 | -- |
| | Onnx (fp32) | Link | Link | 25.87MB | 0.26% | 30.22 | 0.00% | 10.47 | 0.00% |
| | Onnx (Int8) | Link | Link | 6.56MB | 74.71% | 30.87 | -2.14% | 10.95 | -4.55% |
| shufflenet_v2_x0_5 | PyTorch (fp32) | Link | Link | 5.38MB | -- | 39.45 | -- | 18.25 | -- |
| | Onnx (fp32) | Link | Link | 5.39MB | -0.33% | 39.45 | 0.00% | 18.25 | 0.00% |
| | Onnx (Int8) | Link | Link | 1.46MB | 72.84% | 43.12 | -9.31% | 21.04 | -15.27% |
| shufflenet_v2_x1_0 | PyTorch (fp32) | Link | Link | 8.97MB | -- | 30.64 | -- | 11.68 | -- |
| | Onnx (fp32) | Link | Link | 8.94MB | 0.34% | 30.64 | 0.00% | 11.68 | 0.00% |
| | Onnx (Int8) | Link | Link | 2.36MB | 73.71% | 34.52 | -12.69% | 14.16 | -21.21% |
| shufflenet_v2_x1_5 | PyTorch (fp32) | Link | Link | 13.78MB | -- | 27.22 | -- | 8.95 | -- |
| | Onnx (fp32) | Link | Link | 13.71MB | 0.53% | 27.22 | 0.00% | 8.95 | 0.00% |
| | Onnx (Int8) | Link | Link | 3.57MB | 74.16% | 28.78 | -5.72% | 9.81 | -9.66% |
| shufflenet_v2_x2_0 | PyTorch (fp32) | Link | Link | 29.02MB | -- | 23.79 | -- | 7.12 | -- |
| | Onnx (fp32) | Link | Link | 28.89MB | 0.46% | 23.79 | 0.00% | 7.12 | 0.00% |
| | Onnx (Int8) | Link | Link | 7.37MB | 74.60% | 27.41 | -15.23% | 9.02 | -26.74% |
| mobilenet_v2 | PyTorch (fp32) | Link | Link | 13.83MB | -- | 28.12 | -- | 9.71 | -- |
| | Onnx (fp32) | Link | Link | 13.65MB | 1.31% | 28.12 | 0.00% | 9.71 | 0.00% |
| | Onnx (Int8) | Link | Link | 3.53MB | 74.48% | 29.46 | -4.75% | 10.41 | -7.19% |
| resnext50_32x4d | PyTorch (fp32) | Link | Link | 98.04MB | -- | 22.38 | -- | 6.30 | -- |
| | Onnx (fp32) | Link | Link | 97.66MB | 0.39% | 22.38 | 0.00% | 6.30 | 0.00% |
| | Onnx (Int8) | Link | Link | 24.59MB | 74.93% | 22.59 | -0.91% | 6.44 | -2.22% |
| wide_resnet50_2 | PyTorch (fp32) | Link | Link | 269.35MB | -- | 21.53 | -- | 5.91 | -- |
| | Onnx (fp32) | Link | Link | 268.96MB | 0.14% | 21.53 | 0.00% | 5.91 | 0.00% |
| | Onnx (Int8) | Link | Link | 67.41MB | 74.97% | 21.60 | -0.32% | 5.95 | -0.64% |
| wide_resnet101_2 | PyTorch (fp32) | Link | Link | 496.2MB | -- | 21.15 | -- | 5.72 | -- |
| | Onnx (fp32) | Link | Link | 495.42MB | 0.16% | 21.15 | 0.00% | 5.72 | 0.00% |
| | Onnx (Int8) | Link | Link | 124.19MB | 74.97% | 21.35 | -0.95% | 5.83 | -1.99% |
| mnasnet0_5 | PyTorch (fp32) | Link | Link | 8.75MB | -- | 32.26 | -- | 12.51 | -- |
| | Onnx (fp32) | Link | Link | 8.64MB | 1.23% | 32.26 | 0.00% | 12.51 | 0.00% |
| | Onnx (Int8) | Link | Link | 2.26MB | 74.22% | 99.90 | -209.65% | 99.50 | -695.51% |
| mnasnet0_75 | PyTorch (fp32) | Link | Link | 12.51MB | -- | 29.12 | -- | 9.85 | -- |
| | Onnx (fp32) | Link | Link | 12.34MB | 1.29% | 29.12 | 0.00% | 9.85 | 0.00% |
| | Onnx (Int8) | Link | Link | 3.2MB | 74.45% | 30.20 | -3.70% | 10.54 | -6.94% |
| mnasnet1_0 | PyTorch (fp32) | Link | Link | 17.28MB | -- | 26.54 | -- | 8.49 | -- |
| | Onnx (fp32) | Link | Link | 17.07MB | 1.21% | 26.54 | 0.00% | 8.49 | 0.00% |
| | Onnx (Int8) | Link | Link | 4.39MB | 74.60% | 28.32 | -6.69% | 9.45 | -11.33% |
| mnasnet1_3 | PyTorch (fp32) | Link | Link | 24.74MB | -- | 23.92 | -- | 6.70 | -- |
| | Onnx (fp32) | Link | Link | 24.46MB | 1.10% | 23.92 | 0.00% | 6.70 | 0.00% |
| | Onnx (Int8) | Link | Link | 6.26MB | 74.72% | 25.23 | -5.46% | 7.47 | -11.49% |
| alexnet | PyTorch (fp32) | Link | Link | 238.68MB | -- | 43.48 | -- | 20.93 | -- |
| | Onnx (fp32) | Link | Link | 238.68MB | 0.00% | 43.48 | 0.00% | 20.93 | 0.00% |
| | Onnx (Int8) | Link | Link | 59.71MB | 74.98% | 43.85 | -0.86% | 21.12 | -0.87% |
| vgg11 | PyTorch (fp32) | Link | Link | 519MB | -- | 30.98 | -- | 11.37 | -- |
| | Onnx (fp32) | Link | Link | 519.01MB | 0.00% | 30.98 | 0.00% | 11.37 | 0.00% |
| | Onnx (Int8) | Link | Link | 129.8MB | 74.99% | 31.02 | -0.14% | 11.36 | 0.12% |
| vgg11_bn | PyTorch (fp32) | Link | Link | 519.05MB | -- | 29.63 | -- | 10.19 | -- |
| | Onnx (fp32) | Link | Link | 519.01MB | 0.01% | 29.63 | 0.00% | 10.19 | 0.00% |
| | Onnx (Int8) | Link | Link | 129.8MB | 74.99% | 29.86 | -0.78% | 10.35 | -1.55% |
| vgg13 | PyTorch (fp32) | Link | Link | 519.72MB | -- | 30.07 | -- | 10.75 | -- |
| | Onnx (fp32) | Link | Link | 519.73MB | 0.00% | 30.07 | 0.00% | 10.75 | 0.00% |
| | Onnx (Int8) | Link | Link | 129.99MB | 74.99% | 30.11 | -0.12% | 10.69 | 0.56% |
| vgg13_bn | PyTorch (fp32) | Link | Link | 519.77MB | -- | 28.41 | -- | 9.63 | -- |
| | Onnx (fp32) | Link | Link | 519.73MB | 0.01% | 28.41 | 0.00% | 9.63 | 0.00% |
| | Onnx (Int8) | Link | Link | 129.99MB | 74.99% | 28.71 | -1.06% | 9.77 | -1.48% |
| vgg16 | PyTorch (fp32) | Link | Link | 540.46MB | -- | 28.41 | -- | 9.62 | -- |
| | Onnx (fp32) | Link | Link | 540.47MB | 0.00% | 28.41 | 0.00% | 9.62 | 0.00% |
| | Onnx (Int8) | Link | Link | 135.18MB | 74.99% | 28.43 | -0.09% | 9.61 | 0.10% |
| vgg16_bn | PyTorch (fp32) | Link | Link | 540.53MB | -- | 26.64 | -- | 8.48 | -- |
| | Onnx (fp32) | Link | Link | 540.47MB | 0.01% | 26.64 | 0.00% | 8.48 | 0.00% |
| | Onnx (Int8) | Link | Link | 135.18MB | 74.99% | 27.42 | -2.95% | 8.81 | -3.87% |
| resnet18 | PyTorch (fp32) | Link | Link | 45.7MB | -- | 30.24 | -- | 10.92 | -- |
| | Onnx (fp32) | Link | Link | 45.65MB | 0.11% | 30.24 | 0.00% | 10.92 | 0.00% |
| | Onnx (Int8) | Link | Link | 11.46MB | 74.93% | 30.58 | -1.11% | 11.06 | -1.23% |
| resnet34 | PyTorch (fp32) | Link | Link | 85.22MB | -- | 26.69 | -- | 8.58 | -- |
| | Onnx (fp32) | Link | Link | 85.13MB | 0.10% | 26.69 | 0.00% | 8.58 | 0.00% |
| | Onnx (Int8) | Link | Link | 21.36MB | 74.94% | 26.96 | -1.03% | 8.73 | -1.77% |
| resnet50 | PyTorch (fp32) | Link | Link | 100.05MB | -- | 23.87 | -- | 7.14 | -- |
| | Onnx (fp32) | Link | Link | 99.75MB | 0.30% | 23.87 | 0.00% | 7.14 | 0.00% |
| | Onnx (Int8) | Link | Link | 25.09MB | 74.93% | 24.47 | -2.53% | 7.39 | -3.59% |
| resnet152 | PyTorch (fp32) | Link | Link | 235.73MB | -- | 21.69 | -- | 5.95 | -- |
| | Onnx (fp32) | Link | Link | 234.88MB | 0.36% | 21.69 | 0.00% | 5.95 | 0.00% |
| | Onnx (Int8) | Link | Link | 59.14MB | 74.91% | 21.78 | -0.43% | 6.02 | -1.07% |
| resnet101 | PyTorch (fp32) | Link | Link | 174.44MB | -- | 22.63 | -- | 6.45 | -- |
| | Onnx (fp32) | Link | Link | 173.85MB | 0.34% | 22.63 | 0.00% | 6.45 | 0.00% |
| | Onnx (Int8) | Link | Link | 43.75MB | 74.92% | 22.90 | -1.19% | 6.51 | -0.93% |
| vgg19 | PyTorch (fp32) | Link | Link | --MB | -- | 28.41 | -- | 9.62 | -- |
| | Onnx (fp32) | Link | Link | --MB | -- | -- | -- | -- | -- |
| | Onnx (Int8) | Link | Link | --MB | -- | -- | -- | -- | -- |
| vgg19_bn | PyTorch (fp32) | Link | Link | --MB | -- | -- | -- | -- | -- |
| | Onnx (fp32) | Link | Link | --MB | -- | -- | -- | -- | -- |
| | Onnx (Int8) | Link | Link | --MB | -- | -- | -- | -- | -- |
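
The difference columns appear to follow a simple relative-change formula against the PyTorch (fp32) row of the same architecture. The sketch below is inferred from the table values rather than taken from the original evaluation code, and `relative_difference` is a hypothetical name:

```python
def relative_difference(baseline: float, value: float) -> float:
    """Percentage change relative to baseline; positive means value is smaller."""
    return (baseline - value) / baseline * 100


# Top1 Error of Onnx (Int8) vs. PyTorch (fp32), values from the table
print(round(relative_difference(41.91, 42.26), 2))  # squeezenet1_0: -0.84
print(round(relative_difference(25.57, 27.53), 2))  # densenet121:   -7.67
```

Recomputing other cells from the printed values can differ in the last digit, presumably because the listed sizes and errors are themselves rounded.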

import torchvision

# Path to the ImageNet (ILSVRC2012) validation images, one sub-folder per class
imagenet_directory = '/content/drive/MyDrive/Research/data/ILSVRC2012/val'

# Evaluation preprocessing: resize, center-crop, convert to tensor, normalize
imagenet_transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
])

imagenet_dataset = torchvision.datasets.ImageFolder(
    imagenet_directory,
    imagenet_transform)