1.4: Get the Inference Server Docker Image (for Model Serving)
The ROCm™ userspace API is guaranteed to be compatible with specific older and newer ROCm base driver installations.
Note: The ROCm userspace is delivered using a Docker® container based on ROCm v5.6.1. Refer to the following matrix for details on the supported PyTorch and TensorFlow versions: https://rocm.docs.amd.com/en/latest/release/3rd_party_support_matrix.html.
For general information on ROCm installation, see Deploy ROCm on Linux.
Verify the ROCm installation using the rocminfo command:
$ /opt/rocm-<version>/bin/rocminfo
Install the Docker software. If Docker is not installed on your machine, see the official Docker documentation for installation instructions.
The UIF/TensorFlow Docker image provides a superset of the functionality in ROCm/TensorFlow, and the UIF/PyTorch Docker image provides a superset of the functionality in ROCm/PyTorch. When the UIF Docker images were created, no items were deleted from the underlying PyTorch or TensorFlow Docker images. The items added in the superset include the following (a quick import check follows this list):
- Quantizer and pruner tools as plugins to TensorFlow/PyTorch to enable the use of UIF Docker images to quantize models on a ROCm platform (for GPU or CPU). Note: To use the pruner, use the Vitis™ AI 3.5 ROCm Docker images. See the Host Installation Instructions in the Vitis AI documentation for details.
- MIGraphX to enable the use of UIF Docker images for GPU inference
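A quick way to confirm these additions is to import them from Python inside a running UIF container. This is a minimal sketch, assuming migraphx is the MIGraphX Python module and pytorch_nndct provides the quantizer entry point in the PyTorch image; adjust the imports to match the image you are using.
# Sanity check inside a UIF PyTorch container (module names are assumptions, not guaranteed)
import migraphx                                   # MIGraphX Python bindings for GPU inference
from pytorch_nndct.apis import torch_quantizer    # assumed quantizer plugin entry point
print("MIGraphX and the quantizer plugin are importable")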
A prebuilt Docker image is used to run UIF tools using PyTorch.
Follow these steps:
- Obtain the latest Docker image.
docker pull amdih/uif-pytorch:uif1.2_rocm5.6.1_vai3.5_py3.8_pytorch1.13
The previous command downloads the UIF container, including PyTorch and optimization tools.
- Start a Docker container using the image.
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G amdih/uif-pytorch:uif1.2_rocm5.6.1_vai3.5_py3.8_pytorch1.13
You can also pass the -v argument to mount any data directories from the host onto the container.
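Once inside the container, a short Python check can confirm that PyTorch sees the GPU. This is a minimal sketch; ROCm builds of PyTorch report GPUs through the torch.cuda API.
import torch
print(torch.__version__)                    # PyTorch version bundled in the image
print(torch.cuda.is_available())            # True when the ROCm GPU is visible in the container
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))    # name of the first detected GPU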
A prebuilt Docker image is used to run UIF tools using TensorFlow.
Follow these steps:
- Obtain the latest Docker image.
docker pull amdih/uif-tensorflow:uif1.2_rocm5.6.1_vai3.5_tensorflow2.12
The previous command downloads the UIF container, including TensorFlow and optimization tools.
- Start a Docker container using the image.
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G amdih/uif-tensorflow:uif1.2_rocm5.6.1_vai3.5_tensorflow2.12
You can also pass the -v argument to mount any data directories from the host onto the container.
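Once inside the container, a short Python check can confirm that TensorFlow sees the GPU. This is a minimal sketch using the standard TensorFlow device query.
import tensorflow as tf
print(tf.__version__)                            # TensorFlow version bundled in the image
print(tf.config.list_physical_devices("GPU"))    # non-empty list when the ROCm GPU is visible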
Install the Docker software. If Docker is not installed on your machine yet, see the official Docker documentation for installation instructions.
For instructions on how to pull a Docker image for the Vitis AI development environment, see the Vitis AI Docker Installation.
Perform the following steps to install TensorFlow, PyTorch, and ONNXRT built with ZenDNN:
To run inference on a TensorFlow model using ZenDNN, download and install the TensorFlow+ZenDNN package. Perform the following steps to complete the TensorFlow+ZenDNN installation:
- Download the TensorFlow+ZenDNN v4.0 release package from the AMD ZenDNN page.
- Unzip the package, for example, TF_v2.10_ZenDNN_v4.0_Python_v3.8.zip:
unzip TF_v2.10_ZenDNN_v4.0_Python_v3.8.zip
- Ensure that you have the conda environment installed, and execute the following commands:
cd TF_v2.10_ZenDNN_v4.0_Python_v*/
source scripts/TF_ZenDNN_setup_release.sh
This completes the TensorFlow+ZenDNN installation.
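To verify the result, you can run a small CPU inference from the activated environment. This is a minimal sketch; ZenDNN acceleration is applied transparently to supported TensorFlow CPU operators, so no extra code is needed to enable it.
import numpy as np
import tensorflow as tf

# Tiny Keras model run on the CPU; supported operators use ZenDNN-backed kernels.
model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation="softmax", input_shape=(32,))])
out = model.predict(np.random.rand(1, 32).astype("float32"))
print(tf.__version__, out.shape)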
To run inference on a PyTorch model using ZenDNN, download and install the PyTorch+ZenDNN package. Perform the following steps to complete the PyTorch+ZenDNN installation:
- Download the PTv1.13+ZenDNN_v4.0 release package from the AMD ZenDNN page.
- Unzip the package, for example, PT_v1.12.0_ZenDNN_v4.0_Python_v3.8.zip:
unzip PT_v1.12.0_ZenDNN_v4.0_Python_v3.8.zip
- Ensure that you have the conda environment installed, and execute the following commands:
cd PT_v1.12.0_ZenDNN_v4.0_Python_v*/
source scripts/PT_ZenDNN_setup_release.sh
This completes the PyTorch+ZenDNN installation.
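To verify the result, you can run a small CPU inference from the activated environment. This is a minimal sketch; the ZenDNN-enabled PyTorch build uses its optimized kernels for supported CPU operators without code changes.
import torch

# Tiny CPU inference; supported operators use ZenDNN-backed kernels in this build.
model = torch.nn.Linear(32, 10).eval()
with torch.no_grad():
    out = model(torch.randn(1, 32))
print(torch.__version__, out.shape)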
To run inference on an ONNXRT model using ZenDNN, download and install the ONNXRT+ZenDNN package. Perform the following steps to complete the ONNXRT+ZenDNN installation:
- Download the ONNXRTv1.12.1+ZenDNN_v4.0 release package from the AMD ZenDNN page.
- Unzip the package, for example, ONNXRT_v1.12.1_ZenDNN_v4.0_Python_v3.8.zip:
unzip ONNXRT_v1.12.1_ZenDNN_v4.0_Python_v3.8.zip
- Ensure that you have the conda environment installed, and execute the following commands:
cd ONNXRT_v1.12.1_ZenDNN_v4.0_Python_v*/
source scripts/ONNXRT_ZenDNN_setup_release.sh
This completes the ONNXRT+ZenDNN installation.
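To verify the result, you can list the execution providers registered in the installed ONNX Runtime. This is a minimal sketch; the exact name under which the ZenDNN provider appears may vary between releases.
import onnxruntime as ort

print(ort.__version__)                  # ONNX Runtime version from the installed package
print(ort.get_available_providers())    # the ZenDNN execution provider should appear in this list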
The AMD Inference Server is integrated with ZenDNN, MIGraphX, and Vitis AI, and it can be used for model serving. To use the inference server, you need a Docker image for it, which you can get either by pulling a prebuilt image or by building one from the inference server repository on GitHub.
The instructions provided here are an overview; for more complete information about the AMD Inference Server, see its documentation.
You can pull the appropriate deployment Docker image(s) from DockerHub using:
docker pull amdih/serve:uif1.2_migraphx_amdinfer_0.4.0
docker pull amdih/serve:uif1.2_vai_amdinfer_0.4.0
docker pull amdih/serve:uif1.2_zendnn_amdinfer_0.4.0
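After starting a container from one of these images, you can check that the server is reachable from the amdinfer Python client. This is a hedged sketch: the amdinfer package, the default HTTP port 8998, and the HttpClient and serverLive names follow the inference server examples and should be verified against the 0.4.0 documentation.
import amdinfer

# Assumes the server container is running with its HTTP port published on localhost:8998.
client = amdinfer.HttpClient("http://127.0.0.1:8998")
print(client.serverLive())    # True when the server is up and responding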
You need Docker (18.09+) to build the image.
- Clone the inference-server repo and build the image:
git clone https://github.com/Xilinx/inference-server
cd inference-server
# version 0.4.0 corresponds to this documentation
git checkout v0.4.0
python3 docker/generate.py
./amdinfer dockerize <flags>
- Use flags to control the image building, such as:
  - --production: Builds the deployment version of the image instead of the default development one.
  - --vitis: Enables FPGAs with Vitis AI in the image.
  - --migraphx: Enables GPUs with MIGraphX in the image.
  - --tfzendnn=<path to zip>: Enables CPUs with TF+ZenDNN in the image. You need to download TF_v2.12_ZenDNN_v4.0_C++_API.zip and pass the path to it.
  - --ptzendnn=<path to zip>: Enables CPUs with PT+ZenDNN in the image. You need to download PT_v1.13_ZenDNN_v4.0_C++_API.zip and pass the path to it.
Note: The downloaded ZenDNN package(s) must be inside the inference-server folder because the Docker build cannot access files outside the repository.
You can pass these flags in any combination. Use ./amdinfer dockerize --help for the full documentation on the available flags.
UIF is licensed under Apache License Version 2.0. Refer to the LICENSE file for the full license text and copyright notice.
Contact [email protected] for questions, issues, and feedback on UIF.
Submit your questions, feature requests, and bug reports on the GitHub issues page.