Section 6 - Larger Example Designs

A number of example designs are available here that further explain many of the unique features of AI Engines and the NPU array in Ryzen™ AI. This section contains more complex application designs for both vision and machine learning use cases. In particular, we describe a ResNet implementation for Ryzen™ AI.

Vision Kernels

| Design name | Data type | Description |
|---|---|---|
| Vision Passthrough | i8 | A simple pipeline with just one `passThrough` kernel. It mainly tests whether data movement correctly copies a greyscale image. |
| Color Detect | i32 | A multi-kernel, multi-core pipeline that detects colors in an RGBA image. |
| Edge Detect | i32 | A multi-kernel, multi-core pipeline that detects edges in an image and overlays the detection on the original image. |
| Color Threshold | i32 | A multi-core, data-parallel implementation of color thresholding of an RGBA image. |

Machine Learning Designs

| Design name | Data type | Description |
|---|---|---|
| bottleneck | ui8 | A Bottleneck Residual Block is a variant of the residual block that uses three convolutions with 1x1, 3x3, and 1x1 filter sizes, respectively. The implementation fuses multiple kernels and applies dataflow optimizations, highlighting the unique architectural capabilities of AI Engines. |
| resnet | ui8 | ResNet with offloaded conv2_x layers. The implementation features a depth-first implementation of multiple bottleneck blocks across multiple NPU columns. |
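
The 1x1, 3x3, 1x1 structure of the bottleneck block can be followed by tracing tensor shapes through the three convolutions. The snippet below is a minimal sketch in plain Python and is not taken from the designs themselves: the helper names `conv_out` and `bottleneck_shapes` are hypothetical, the 3x3 convolution is assumed to use a padding of 1 by default, and the final 1x1 convolution is assumed to expand back to the input channel count.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def bottleneck_shapes(h, w, c_in, c_mid, c_out, padding=1):
    """Trace (H, W, C) through the 1x1 -> 3x3 -> 1x1 convolutions.

    `padding` applies only to the 3x3 convolution (assumed value: 1).
    All convolutions use a stride of 1 in this sketch.
    """
    shapes = [(h, w, c_in)]
    stages = ((1, c_mid, 0), (3, c_mid, padding), (1, c_out, 0))
    for kernel, channels, pad in stages:
        h = conv_out(h, kernel, padding=pad)
        w = conv_out(w, kernel, padding=pad)
        shapes.append((h, w, channels))
    return shapes


if __name__ == "__main__":
    # Illustrative sizes only: 256 input channels, 64 in the middle, 256 out.
    for shape in bottleneck_shapes(32, 32, 256, 64, 256):
        print(shape)
```

Passing `padding=0` shows how an unpadded 3x3 convolution shrinks the spatial dimensions, which is a convenient way to check shape arithmetic by hand.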

Exercises

1. In the bottleneck design, how many different types of fused computations do you observe?
2. In the bottleneck design, which follows a dataflow approach, how many elements does the 3x3 convolution require from the 1x1 convolution core before it can proceed with its computation?
3. Suppose you have a bottleneck block with input dimensions of 32x32x256. After passing through the 1x1 convolutional layer, the output dimensions become 32x32x64. What would the output dimensions be after the subsequent 3x3 convolutional layer, assuming a stride of 1, no padding, and 64 output channels?

[Prev - Section 5] [Top]
