LP ONNX Runtime on WoA #1827

Merged (2 commits) on Apr 29, 2025
35 changes: 35 additions & 0 deletions content/learning-paths/laptops-and-desktops/win_python/intro.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
---
# User change
title: "Introduction"

weight: 2

layout: "learningpathall"
---

## What is ONNX?
Open Neural Network Exchange (ONNX) provides a portable, interoperable standard for machine learning (ML). It is an open-source format that enables representation of ML models through a common set of operators and a standardized file format. These operators serve as fundamental building blocks in machine learning and deep learning models, facilitating compatibility and ease of integration across various platforms.

Machine learning practitioners frequently create ML models using diverse frameworks such as PyTorch, TensorFlow, scikit-learn, Core ML, and Azure AI Custom Vision. However, each framework has its own unique methods for creating, storing, and utilizing models. This diversity poses significant challenges when deploying trained models across different environments or executing inference tasks using hardware-accelerated tools or alternate programming languages. For example, deploying models on edge devices or Arm64-powered hardware often necessitates using programming languages other than Python to fully exploit specialized hardware features, such as dedicated neural network accelerators.

By defining a unified and standardized format, ONNX effectively addresses these interoperability challenges. With ONNX, you can easily develop models using your preferred framework and export these models to the ONNX format. Additionally, the ONNX ecosystem includes robust runtime environments (such as ONNX Runtime), enabling efficient inference across multiple hardware platforms and diverse programming languages, including C++, C#, and Java, beyond Python alone.

ONNX Runtime, in particular, provides optimized inference capabilities, supporting execution on CPUs, GPUs, and specialized accelerators, thus significantly improving performance and efficiency for deployment in production environments and edge devices.

Several major ML frameworks currently support exporting models directly to the ONNX format, including Azure AI Custom Vision, Core ML, PyTorch, TensorFlow, and scikit-learn, streamlining the workflow from model development to deployment.

## Objective
In this hands-on learning path, you will explore the practical aspects of running inference on an ONNX-formatted model for an image classification task. Specifically, the demonstration uses the widely used Modified National Institute of Standards and Technology (MNIST) dataset, illustrating how ONNX can be applied to accurately recognize handwritten digits, showcasing both the flexibility and simplicity offered by this standardized format.

## Before you begin
You need the following to complete the tutorial:
### Python
At the time of writing, Python 3.13.3 was available. Download the ARM64 installer [here](https://www.python.org/ftp/python/3.13.3/python-3.13.3-arm64.exe). After launching the installer, select the "Add python.exe to PATH" check box, and then click Install Now:

[img1]

### A virtual environment with all the required packages

### A code editor

The companion code is available for download [here]().
@@ -0,0 +1,45 @@
---
title: How to Use ONNX Runtime with Windows on Arm

minutes_to_complete: 30

who_is_this_for: This is an introductory topic for developers who are interested in ONNX.

learning_objectives:
- Use a pre-trained ONNX model for inference
- Leverage native Arm64 for Python applications

prerequisites:
- A Windows on Arm computer, such as the Lenovo ThinkPad X13s running Windows 11, or a Windows on Arm [virtual machine](/learning-paths/cross-platform/woa_azure/).
- Any code editor. [Visual Studio Code for Arm64](https://code.visualstudio.com/docs/?dv=win32arm64user) is recommended.

author: Dawid Borycki

### Tags
skilllevels: Introductory
subjects: Migration to Arm
armips:
- Cortex-A
operatingsystems:
- Windows
tools_software_languages:
- Python
- Visual Studio Code

further_reading:
    - resource:
        title: ONNX
        link: https://onnx.ai
        type: website
    - resource:
        title: ONNX Repository
        link: https://github.com/onnx/onnx
        type: blog


### FIXED, DO NOT MODIFY
# ================================================================================
weight: 1 # _index.md always has weight of 1 to order correctly
layout: "learningpathall" # All files under learning paths have this same wrapper
learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
---
@@ -0,0 +1,8 @@
---
# ================================================================================
# FIXED, DO NOT MODIFY THIS FILE
# ================================================================================
weight: 21 # Set to always be larger than the content in this path to be at the end of the navigation.
title: "Next Steps" # Always the same, html page title.
layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing.
---
@@ -0,0 +1,140 @@
---
# User change
title: "Using ONNX Runtime"

weight: 3

layout: "learningpathall"
---

## Objective
Next, you will implement Python code that accomplishes the following tasks:
* Downloads a pre-trained ONNX model specifically trained on the MNIST dataset, along with the MNIST dataset itself, which is widely used for benchmarking machine learning models.
* Executes predictions (inference) using the pre-trained ONNX model on test images containing handwritten digits from the MNIST dataset.
* Evaluates and measures the performance of the inference process, providing insights into the efficiency and speed of the neural network model on your specific system architecture.

This practical demonstration will illustrate the end-to-end workflow of deploying and evaluating ONNX-formatted machine learning models.

## Implementation

### Model
Create a file named main.py. At the beginning of this file, include the following import statements:

```Python
import onnxruntime as ort
import numpy as np
import matplotlib.pyplot as plt
import wget, time, os, urllib.parse
import torchvision
import torchvision.transforms as transforms
```

These statements import the necessary Python libraries:
* onnxruntime - enables running inference with ONNX models.
* numpy - facilitates numerical computations and handling of arrays.
* matplotlib - used for visualizing results such as classification outputs.
* wget, urllib, and os - provide utilities for downloading files and interacting with the file system.
* time - measures the elapsed time of the inference runs.
* torchvision - allows easy access to datasets like MNIST.

Next, add the following function immediately below the import statements in your main.py file:

```Python
def download_model(model_name):
    if not os.path.exists(model_name):
        base_url = 'https://github.com/dawidborycki/ONNX.WoA/raw/refs/heads/main/models/'
        url = urllib.parse.urljoin(base_url, model_name)
        wget.download(url)
```

This function, download_model, accepts one parameter, model_name. It first checks whether a file with this name already exists in your local directory. If the file does not exist, it downloads the specified ONNX model file from the given GitHub repository URL. This automated check ensures that you won't repeatedly download the model unnecessarily.
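
One detail of `download_model` worth checking is how `urllib.parse.urljoin` builds the final URL: the trailing slash on `base_url` matters. The sketch below uses only the standard library and performs no download. It simply compares the URL produced with and without the trailing slash.

```python
from urllib.parse import urljoin

base_url = 'https://github.com/dawidborycki/ONNX.WoA/raw/refs/heads/main/models/'
model_name = 'mnist-12.onnx'

# With the trailing slash, the file name is appended below "models/".
with_slash = urljoin(base_url, model_name)

# Without it, urljoin replaces the last path segment ("models") instead.
without_slash = urljoin(base_url.rstrip('/'), model_name)

print(with_slash)
print(without_slash)
```

If you change `base_url` to point at another location, keep the trailing slash so that the model name is appended rather than substituted.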

### Inference
Next, you will implement a Python function to perform neural inference. Add the following code to your main.py file below the previously defined download_model function:

```Python
def onnx_predict(onnx_session, input_name, output_name,
                 test_images, test_labels, image_index, show_results):

    test_image = np.expand_dims(test_images[image_index], [0,1])

    onnx_pred = onnx_session.run([output_name], {input_name: test_image.astype('float32')})

    predicted_label = np.argmax(np.array(onnx_pred))
    actual_label = test_labels[image_index]

    if show_results:
        plt.figure()
        plt.xticks([])
        plt.yticks([])
        plt.imshow(test_images[image_index], cmap=plt.cm.binary)

        plt.title('Actual: %s, predicted: %s'
            % (actual_label, predicted_label), fontsize=22)
        plt.show()

    return predicted_label, actual_label
```

The onnx_predict function prepares a single test image from the dataset by reshaping it to match the input shape expected by the ONNX model, which is (1, 1, 28, 28). This reshaping is achieved using NumPy's expand_dims function, which adds the batch and channel dimensions. Next, the function performs inference using ONNX Runtime (onnx_session.run). The inference result contains one score for each digit class, and the function uses np.argmax to select the class with the highest score, returning it as the predicted label. Optionally, the function displays the image along with its actual and predicted labels.
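
The reshaping and label selection can be tried in isolation, without a model. The following sketch uses a blank stand-in image and made-up output scores (both are assumptions for illustration, not real model data) to show the effect of `np.expand_dims` and `np.argmax`.

```python
import numpy as np

# Stand-in for one MNIST test image (28 x 28 pixels).
image = np.zeros((28, 28), dtype=np.float32)

# Add batch and channel axes in front: (28, 28) -> (1, 1, 28, 28).
batch = np.expand_dims(image, [0, 1])
print(batch.shape)

# Hypothetical model output: a list holding one (1, 10) array of class scores.
scores = [np.array([[0.1, -2.0, 0.3, 7.5, 0.0, 1.2, -0.4, 0.9, 2.1, 0.6]],
                   dtype=np.float32)]

# argmax works on the flattened array by default, so the result is the
# index of the highest score, which is the predicted digit.
predicted_label = int(np.argmax(np.array(scores)))
print(predicted_label)
```

Because the batch holds a single image, the flattened index equals the digit class. With larger batches you would pass an explicit `axis` to `np.argmax` instead.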

### Performance measurements
Next, add the following performance-measuring function below onnx_predict in your main.py file:

```Python
def measure_performance(onnx_session, input_name, output_name,
                        test_images, test_labels, execution_count):

    # randint excludes the upper bound, so every image index can be drawn
    image_indices = np.random.randint(0, test_images.shape[0], execution_count)

    start = time.time()

    # Run exactly execution_count inferences
    for i in range(execution_count):
        onnx_predict(onnx_session, input_name, output_name,
                     test_images, test_labels, image_indices[i], False)

    computation_time = time.time() - start

    print('Computation time: %.3f ms' % (computation_time*1000))
```

This measure_performance function assesses the inference speed by repeatedly invoking the onnx_predict function. It measures the total computation time (in milliseconds) required for the specified number of inference executions (execution_count) and outputs this measurement to the console.
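
If you want per-call figures rather than a single total, one possible variation is sketched below. The `benchmark` helper is hypothetical and is not part of the companion code. It runs a few untimed warm-up calls first and uses `time.perf_counter`, which is intended for measuring short intervals.

```python
import time

def benchmark(fn, runs=100, warmup=10):
    """Time `fn` over `runs` calls after `warmup` untimed calls (hypothetical helper)."""
    for _ in range(warmup):
        fn()

    start = time.perf_counter()
    for _ in range(runs):
        fn()
    total_s = time.perf_counter() - start

    return {'total_ms': total_s * 1000, 'mean_ms': total_s * 1000 / runs}

# Stand-in workload; in main.py this could wrap a call to onnx_predict.
stats = benchmark(lambda: sum(range(1000)), runs=50, warmup=5)
print('Total: %.3f ms, mean: %.3f ms per call' % (stats['total_ms'], stats['mean_ms']))
```

Warm-up calls keep one-time costs, such as lazy initialization, out of the measurement, which tends to make runs easier to compare.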

### Putting Everything Together
Finally, integrate all previously defined functions by adding these statements at the bottom of your main.py file:
```Python
if __name__ == "__main__":
    # Download and prepare the model
    model_name = 'mnist-12.onnx'
    download_model(model_name)

    # Set up ONNX inference session
    onnx_session = ort.InferenceSession(model_name)

    input_name = onnx_session.get_inputs()[0].name
    output_name = onnx_session.get_outputs()[0].name

    # Load the MNIST dataset using torchvision
    transform = transforms.Compose([transforms.ToTensor()])
    mnist_dataset = torchvision.datasets.MNIST(root='./data', train=False,
                                               download=True, transform=transform)

    test_images = mnist_dataset.data.numpy()
    test_labels = mnist_dataset.targets.numpy()

    # Normalize images
    test_images = test_images / 255.0

    # Perform a single prediction and display the result
    # (randint excludes the upper bound, so any image can be selected)
    image_index = np.random.randint(0, test_images.shape[0])
    onnx_predict(onnx_session, input_name, output_name,
                 test_images, test_labels, image_index, True)

    # Measure inference performance
    measure_performance(onnx_session, input_name, output_name,
                        test_images, test_labels, execution_count=1000)
```

This script first initializes an ONNX inference session with the downloaded model (mnist-12.onnx). It then retrieves the model's input and output details, loads the MNIST dataset for testing, runs a sample inference showing visual results, and finally measures the performance of the inference operation over multiple runs.
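
The script reports timing only. Because `onnx_predict` returns both the predicted and the actual label, you could also collect those pairs and compute accuracy. The sketch below shows one way to do that. The `accuracy` helper and the sample pairs are hypothetical and are not part of the companion code.

```python
def accuracy(results):
    """Return the fraction of (predicted, actual) pairs that match."""
    results = list(results)
    if not results:
        raise ValueError('results must not be empty')
    correct = sum(1 for predicted, actual in results if predicted == actual)
    return correct / len(results)

# Made-up (predicted_label, actual_label) pairs in the shape onnx_predict returns.
sample_results = [(7, 7), (2, 2), (1, 1), (0, 0), (4, 9)]
print('Accuracy: %.1f%%' % (100 * accuracy(sample_results)))
```

In main.py, the pairs could come from calling `onnx_predict` in a loop with `show_results` set to False.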

## Summary
In this section, you implemented Python code to download a pre-trained ONNX model and the MNIST dataset, perform inference to recognize handwritten digits, and measure inference performance. In the next step, you will install all required dependencies and run the code.
@@ -0,0 +1,57 @@
---
# User change
title: "Running Inference"

weight: 4

layout: "learningpathall"
---

## Objective
You will now use the implemented code to run inference.

## Packages
Start by activating the virtual environment:

```console
venv-x64\Scripts\activate.bat
```

Then install required packages:
```console
py -V:3.13 -m pip install onnxruntime numpy matplotlib wget torchvision torch
```
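
To confirm that the installation succeeded, a short check such as the following sketch can help. It uses the standard library's `importlib.metadata` to look up each package from the install command. The `installed_versions` helper is a hypothetical name used here for illustration.

```python
from importlib import metadata

# Packages the install command above is expected to provide.
REQUIRED = ['onnxruntime', 'numpy', 'matplotlib', 'wget', 'torchvision', 'torch']

def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(REQUIRED).items():
    print('%-12s %s' % (name, version or 'NOT INSTALLED'))
```

Run the check with the same interpreter you used for `pip`, because each Python installation keeps its own set of packages.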

## Running Inference
To perform inference, run the following command:

```console
py -V:3.13 .\main.py
```

The code will display a sample inference result similar to the image below:
![fig1](figures/01.png)
Upon closing the displayed image, the script will output the computation time:
```output
PS C:\Users\db\onnx> py -V:3.13 .\main.py
Computation time: 95.854 ms
PS C:\Users\db\onnx> py -V:3.13 .\main.py
Computation time: 111.230 ms
```


To compare results with Windows on Arm64, repeat the steps above using the Arm64 build of Python. Activate the Arm64 virtual environment and install the packages:
```console
venv-arm64\Scripts\activate.bat
py -V:3.13-arm64 -m pip install onnxruntime numpy matplotlib wget torchvision torch
```

Run inference using Arm64:
```console
py -V:3.13-arm64 main.py
```

Note: The Arm64 commands above will work once ONNX Runtime becomes available for Windows on Arm64.

## Summary
In this learning path, you’ve learned how to use ONNX Runtime to perform inference on the MNIST dataset. You prepared your environment, implemented the necessary Python code, and measured the performance of your inference tasks.
@@ -0,0 +1,67 @@
---
# User change
title: "Introduction"

weight: 2

layout: "learningpathall"
---

## What is ONNX?
Open Neural Network Exchange (ONNX) provides a portable, interoperable standard for machine learning (ML). It is an open-source format that enables representation of ML models through a common set of operators and a standardized file format. These operators serve as fundamental building blocks in machine learning and deep learning models, facilitating compatibility and ease of integration across various platforms.

Machine learning practitioners frequently create ML models using diverse frameworks such as PyTorch, TensorFlow, scikit-learn, Core ML, and Azure AI Custom Vision. However, each framework has its own unique methods for creating, storing, and utilizing models. This diversity poses significant challenges when deploying trained models across different environments or executing inference tasks using hardware-accelerated tools or alternate programming languages. For example, deploying models on edge devices or Arm64-powered hardware often necessitates using programming languages other than Python to fully exploit specialized hardware features, such as dedicated neural network accelerators.

By defining a unified and standardized format, ONNX effectively addresses these interoperability challenges. With ONNX, you can easily develop models using your preferred framework and export these models to the ONNX format. Additionally, the ONNX ecosystem includes robust runtime environments (such as ONNX Runtime), enabling efficient inference across multiple hardware platforms and diverse programming languages, including C++, C#, and Java, beyond Python alone.

ONNX Runtime, in particular, provides optimized inference capabilities, supporting execution on CPUs, GPUs, and specialized accelerators, thus significantly improving performance and efficiency for deployment in production environments and edge devices.

Several major ML frameworks currently support exporting models directly to the ONNX format, including Azure AI Custom Vision, Core ML, PyTorch, TensorFlow, and scikit-learn, streamlining the workflow from model development to deployment.

The companion code is available [here](https://github.com/dawidborycki/ONNX.WoA/tree/main).

## Objective
In this hands-on learning path, you will explore the practical aspects of running inference on an ONNX-formatted model for an image classification task. Specifically, the demonstration uses the widely used Modified National Institute of Standards and Technology (MNIST) dataset, illustrating how ONNX can be applied to accurately recognize handwritten digits, showcasing both the flexibility and simplicity offered by this standardized format.

## Before you Begin
Ensure you have the following prerequisites installed to complete this tutorial:

### Python
At the time of writing, Python 3.13.3 is available. You can download it using the links below:
1. [Windows x64 (64-bit)](https://www.python.org/ftp/python/3.13.3/python-3.13.3-amd64.exe)
2. [Windows ARM64](https://www.python.org/ftp/python/3.13.3/python-3.13.3-arm64.exe)

Install both Python versions as required. After installation, confirm that both are available by running the following command in your console:

```console
py --list
```

The output should look like this:

```output
py --list
-V:3.13 * Python 3.13 (64-bit)
-V:3.13-arm64 Python 3.13 (ARM64)
```
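
When two builds are installed side by side, it is easy to lose track of which one is running a script. The sketch below uses only the standard library to print a few facts about the current interpreter. It assumes that `sysconfig.get_platform()` reflects the interpreter build (typically `win-amd64` or `win-arm64` on Windows), whereas `platform.machine()` may describe the host hardware instead.

```python
import platform
import struct
import sys
import sysconfig

def interpreter_info():
    """Collect basic facts about the running interpreter."""
    return {
        'version': platform.python_version(),
        'build_platform': sysconfig.get_platform(),  # build target of this interpreter
        'machine': platform.machine(),               # may describe the host hardware
        'pointer_bits': struct.calcsize('P') * 8,
        'executable': sys.executable,
    }

for key, value in interpreter_info().items():
    print('%-15s %s' % (key, value))
```

For example, saving this as a script and running it once with `py -V:3.13` and once with `py -V:3.13-arm64` lets you compare the two installations.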

### Creating a Virtual Environment
Create a virtual environment tailored to your Python installation's architecture. For the 64-bit Python environment, use:
```console
py -V:3.13 -m venv venv-x64
```

For Arm64, use the following command:
```console
py -V:3.13-arm64 -m venv venv-arm64
```

By using different virtual environments, you can compare the performance of the same code across different architectures. However, at the time of writing, ONNX Runtime is unavailable as a Python wheel for Windows ARM64. Therefore, the subsequent instructions apply only to Windows x64.

### A Code Editor
This demonstration uses Visual Studio Code, which is available for download [here](https://code.visualstudio.com/download).

## Pre-trained Models
Many pre-trained models are available in the ONNX format ([ONNX Model Zoo](https://github.com/onnx/models)). There are models for image classification, image recognition, machine translation, language modeling, speech, audio processing, and more.

In this tutorial, you will use the pre-trained model for [handwritten digit recognition](https://github.com/onnx/models/tree/main/validated/vision/classification/mnist), a deep convolutional neural network trained on the popular MNIST dataset.