diff --git a/README.md b/README.md
old mode 100644
new mode 100755
diff --git a/tftrt/examples-cpp/image_classification/README.md b/tftrt/examples-cpp/image_classification/README.md
index 6b3e8ba4f..eec06e5d5 100755
--- a/tftrt/examples-cpp/image_classification/README.md
+++ b/tftrt/examples-cpp/image_classification/README.md
@@ -3,104 +3,11 @@
# TF-TRT C++ Image Recognition Demo
-This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as a frozen graph, and then finally load and serve the model with the TF C++ API. The process can be demonstrated with the below workflow diagram:
+This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as either a frozen graph or a SavedModel, and then load and serve the model with the TF C++ API. The workflow is illustrated in the diagram below:
-![](TF-TRT_CPP_inference.png)
+![](TF-TRT_CPP_inference_overview.png)
This example is built based upon the original Google's TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), on top of which we added the TF-TRT conversion part and adapted the C++ code for loading and inferencing with the TF-TRT model.
-## Docker environment
-Docker images provide a convinient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 docker container that comes with a TensorFlow 2.x build. Tools required for building this example, such as Bazel, NVIDIA CUDA, CUDNN, NCCL libraries are all readily setup.
-
-To replecate the below steps, start by pulling the NGC TF container:
-
-```
-docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
-```
-Then start the container with nvidia-docker:
-
-```
-nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3
-```
-
-You will land at `/workspace` within the docker container. Clone the TF-TRT example repository with:
-
-```
-git clone https://github.com/tensorflow/tensorrt
-cd tensorrt
-```
-
-Then copy the content of this C++ example directory to the TensorFlow example source directory:
-
-```
-cp -r ./tftrt/examples-cpp/image_classification/ /opt/tensorflow/tensorflow-source/tensorflow/examples/
-cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification
-```
-
-
-## Convert to TF-TRT Model
-
-Start Jupyter lab with:
-
-```
-jupyter lab -ip 0.0.0.0
-```
-
-A Jupyter notebook for downloading the Keras ResNet-50 model and TF-TRT conversion is provided in `tf-trt-conversion.ipynb` for your experimentation. By default, this notebook will produce a TF-TRT FP32 model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image-classification/frozen_models_trt_fp32/frozen_models_trt_fp32.pb`.
-
-As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics.
-
-
-
-
-## Build the C++ example
-The NVIDIA NGC container should have everything you need to run this example installed already.
-
-To build it, first, you need to copy the build scripts `tftrt_build.sh` to `/opt/tensorflow`:
-
-```
-cp tftrt-build.sh /opt/tensorflow
-```
-
-Then from `/opt/tensorflow`, run the build command:
-
-```bash
-cd /opt/tensorflow
-bash ./tftrt-build.sh
-```
-
-That should build a binary executable `tftrt_label_image` that you can then run like this:
-
-```bash
-tensorflow-source/bazel-bin/tensorflow/examples/image_classification/tftrt_label_image \
---graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \
---image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/img0.JPG
-```
-
-This uses the default image `img0.JPG` which was download as part of the conversion notebook, and should
-output something similar to this:
-
-```
-2022-02-23 13:53:56.076348: I tensorflow/examples/image-classification/main.cc:276] malamute (250): 0.575496
-2022-02-23 13:53:56.076384: I tensorflow/examples/image-classification/main.cc:276] Saint Bernard (248): 0.399285
-2022-02-23 13:53:56.076412: I tensorflow/examples/image-classification/main.cc:276] Eskimo dog (249): 0.0228338
-2022-02-23 13:53:56.076423: I tensorflow/examples/image-classification/main.cc:276] Ibizan hound (174): 0.00127912
-2022-02-23 13:53:56.076449: I tensorflow/examples/image-classification/main.cc:276] Mexican hairless (269): 0.000520922
-```
-
-The program will also benchmark and output the throughput. Observe the improved throughput offered by moving from Python to C++ serving.
-
-Next, try it out on your own images by supplying the --image= argument, e.g.
-
-```bash
-tensorflow-source/bazel-bin/tensorflow/examples/label_image/tftrt_label_image --image=my_image.png
-```
-
-## What's next
-
-Try to build TF-TRT FP16 and INT8 models and test on your own data, and serve them with C++.
-
-```bash
-
-```
+See the respective sub-folders for details on each approach.
\ No newline at end of file
diff --git a/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png b/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png
new file mode 100644
index 000000000..de35058bc
Binary files /dev/null and b/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png differ
diff --git a/tftrt/examples-cpp/image_classification/BUILD b/tftrt/examples-cpp/image_classification/frozen-graph/BUILD
similarity index 100%
rename from tftrt/examples-cpp/image_classification/BUILD
rename to tftrt/examples-cpp/image_classification/frozen-graph/BUILD
diff --git a/tftrt/examples-cpp/image_classification/frozen-graph/README.md b/tftrt/examples-cpp/image_classification/frozen-graph/README.md
new file mode 100755
index 000000000..9fd1ca305
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/frozen-graph/README.md
@@ -0,0 +1,108 @@
+
+
+
+# TF-TRT C++ Image Recognition Demo
+
+This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as a frozen graph, and then load and serve the model with the TF C++ API. The workflow is illustrated in the diagram below:
+
+![](TF-TRT_CPP_inference.png)
+
+
+This example builds upon Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), adding the TF-TRT conversion step and adapting the C++ code to load and run inference with the TF-TRT model.
+
+## Docker environment
+Docker images provide a convenient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 Docker container, which comes with a TensorFlow 2.x build. The tools required for building this example, such as Bazel and the NVIDIA CUDA, cuDNN, and NCCL libraries, are all readily set up.
+
+To replicate the steps below, start by pulling the NGC TF container:
+
+```
+docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+Then start the container with nvidia-docker:
+
+```
+nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+
+You will land at `/workspace` within the docker container. Clone the TF-TRT example repository with:
+
+```
+git clone https://github.com/tensorflow/tensorrt
+cd tensorrt
+```
+
+Then copy the content of this C++ example directory to the TensorFlow example source directory:
+
+```
+cp -r ./tftrt/examples-cpp/image_classification /opt/tensorflow/tensorflow-source/tensorflow/examples/
+cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph
+```
+
+
+## Convert to TF-TRT Model
+
+Start Jupyter lab with:
+
+```
+jupyter lab -ip 0.0.0.0
+```
+
+A Jupyter notebook for downloading the Keras ResNet-50 model and performing the TF-TRT conversion is provided in `tftrt-conversion.ipynb` for your experimentation. By default, this notebook will produce a TF-TRT FP32 model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb`.
+
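+For orientation, the core of what the notebook does can be sketched in a few lines of Python. This is a minimal sketch, assuming the pretrained Keras ResNet-50 is exported to `resnet50_saved_model` and mirroring the notebook's default output paths; the notebook remains the authoritative version:
+
+```python
+import tensorflow as tf
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+from tensorflow.python.framework.convert_to_constants import (
+    convert_variables_to_constants_v2,
+)
+
+# Export the pretrained Keras model as a SavedModel (TF-TRT consumes SavedModels).
+model = tf.keras.applications.ResNet50(weights="imagenet")
+model.save("resnet50_saved_model")
+
+# Convert to a TF-TRT FP32 model.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
+    precision_mode=trt.TrtPrecisionMode.FP32)
+converter = trt.TrtGraphConverterV2(
+    input_saved_model_dir="resnet50_saved_model", conversion_params=params)
+converter.convert()
+converter.save("resnet50_saved_model_TFTRT_FP32")
+
+# Freeze the converted model and write its GraphDef as a .pb frozen graph,
+# which the C++ program loads with ReadBinaryProto.
+loaded = tf.saved_model.load("resnet50_saved_model_TFTRT_FP32")
+frozen_func = convert_variables_to_constants_v2(
+    loaded.signatures["serving_default"])
+tf.io.write_graph(frozen_func.graph.as_graph_def(), "frozen_models_trt_fp32",
+                  "frozen_models_trt_fp32.pb", as_text=False)
+```
+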
+As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics.
+
+
+
+
+## Build the C++ example
+The NVIDIA NGC container should have everything you need to run this example installed already.
+
+To build it, first copy the build script `tftrt-build.sh` to `/opt/tensorflow`:
+
+```
+cp tftrt-build.sh /opt/tensorflow
+```
+
+Then from `/opt/tensorflow`, run the build command:
+
+```bash
+cd /opt/tensorflow
+bash ./tftrt-build.sh
+```
+
+That should build a binary executable `tftrt_label_image` that you can then run like this:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/frozen-graph/tftrt_label_image \
+--graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \
+--image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/img0.JPG
+```
+
+This uses the default image `img0.JPG`, which was downloaded as part of the conversion notebook, and should
+output something similar to this:
+
+```
+2022-04-29 04:20:24.377345: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] malamute (250): 0.575496
+2022-04-29 04:20:24.377370: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Saint Bernard (248): 0.399285
+2022-04-29 04:20:24.377380: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Eskimo dog (249): 0.0228338
+2022-04-29 04:20:24.377387: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Ibizan hound (174): 0.00127912
+2022-04-29 04:20:24.377394: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Mexican hairless (269): 0.000520922
+```
+
+The program will also run a benchmark and print the measured throughput. Observe the improved throughput offered by moving from Python to C++ serving.
+
+Next, try it out on your own images by supplying the `--image=` argument, e.g.:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/frozen-graph/tftrt_label_image \
+--graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \
+--image=my_image.png
+```
+
+## What's next
+
+Try building TF-TRT FP16 and INT8 models, test them on your own data, and serve them with C++.
+
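+Changing precision is a small change to the conversion parameters. Below is a minimal FP16 sketch, assuming the same `resnet50_saved_model` input as in the conversion notebook; INT8 additionally requires a calibration input function:
+
+```python
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+
+# Identical to the FP32 conversion, except for the precision mode.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
+    precision_mode=trt.TrtPrecisionMode.FP16)
+converter = trt.TrtGraphConverterV2(
+    input_saved_model_dir="resnet50_saved_model", conversion_params=params)
+converter.convert()
+converter.save("resnet50_saved_model_TFTRT_FP16")
+```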
diff --git a/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference.png b/tftrt/examples-cpp/image_classification/frozen-graph/TF-TRT_CPP_inference.png
old mode 100644
new mode 100755
similarity index 100%
rename from tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference.png
rename to tftrt/examples-cpp/image_classification/frozen-graph/TF-TRT_CPP_inference.png
diff --git a/tftrt/examples-cpp/image_classification/main.cc b/tftrt/examples-cpp/image_classification/frozen-graph/main.cc
similarity index 98%
rename from tftrt/examples-cpp/image_classification/main.cc
rename to tftrt/examples-cpp/image_classification/frozen-graph/main.cc
index 5dc34da18..5248d143a 100755
--- a/tftrt/examples-cpp/image_classification/main.cc
+++ b/tftrt/examples-cpp/image_classification/frozen-graph/main.cc
@@ -302,11 +302,11 @@ int main(int argc, char* argv[]) {
// These are the command-line flags the program can understand.
// They define where the graph and input data is located, and what kind of
// input the model expects.
- string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/img0.JPG";
+ string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/img0.JPG";
string graph =
- "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/resnet-50.pb";
+ "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/resnet-50.pb";
string labels =
- "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/imagenet_slim_labels.txt";
+ "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/imagenet_slim_labels.txt";
int32_t input_width = 224;
int32_t input_height = 224;
float input_mean = 127;
diff --git a/tftrt/examples-cpp/image_classification/tftrt-build.sh b/tftrt/examples-cpp/image_classification/frozen-graph/tftrt-build.sh
old mode 100644
new mode 100755
similarity index 100%
rename from tftrt/examples-cpp/image_classification/tftrt-build.sh
rename to tftrt/examples-cpp/image_classification/frozen-graph/tftrt-build.sh
diff --git a/tftrt/examples-cpp/image_classification/tftrt-conversion.ipynb b/tftrt/examples-cpp/image_classification/frozen-graph/tftrt-conversion.ipynb
similarity index 100%
rename from tftrt/examples-cpp/image_classification/tftrt-conversion.ipynb
rename to tftrt/examples-cpp/image_classification/frozen-graph/tftrt-conversion.ipynb
diff --git a/tftrt/examples-cpp/image_classification/saved-model/BUILD b/tftrt/examples-cpp/image_classification/saved-model/BUILD
new file mode 100755
index 000000000..2bc49d38d
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/BUILD
@@ -0,0 +1,50 @@
+# Description:
+# TensorFlow C++ inference example with TF-TRT model.
+
+load("//tensorflow:tensorflow.bzl", "tf_cc_binary")
+
+package(
+ default_visibility = ["//tensorflow:internal"],
+ licenses = ["notice"],
+)
+
+tf_cc_binary(
+ name = "tftrt_label_image",
+ srcs = [
+ "main.cc",
+ ],
+ linkopts = select({
+ "//tensorflow:android": [
+ "-pie",
+ "-landroid",
+ "-ljnigraphics",
+ "-llog",
+ "-lm",
+ "-z defs",
+ "-s",
+ "-Wl,--exclude-libs,ALL",
+ ],
+ "//conditions:default": ["-lm"],
+ }),
+ deps = select({
+ "//tensorflow:android": [
+ # cc:cc_ops is used to include image ops (for label_image)
+ # Jpg, gif, and png related code won't be included
+ "//tensorflow/cc:cc_ops",
+ "//tensorflow/core:portable_tensorflow_lib",
+ # cc:android_tensorflow_image_op is for including jpeg/gif/png
+ # decoder to enable real-image evaluation on Android
+ "//tensorflow/core/kernels/image:android_tensorflow_image_op",
+ ],
+ "//conditions:default": [
+ "//tensorflow/cc:cc_ops",
+ "//tensorflow/cc/saved_model:loader",
+ "//tensorflow/core:core_cpu",
+ "//tensorflow/core:framework",
+ "//tensorflow/core:framework_internal",
+ "//tensorflow/core:lib",
+ "//tensorflow/core:protos_all_cc",
+ "//tensorflow/core:tensorflow",
+ ],
+ }),
+)
\ No newline at end of file
diff --git a/tftrt/examples-cpp/image_classification/saved-model/README.md b/tftrt/examples-cpp/image_classification/saved-model/README.md
new file mode 100755
index 000000000..6c2322331
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/README.md
@@ -0,0 +1,109 @@
+
+
+
+
+# TF-TRT C++ Image Recognition Demo
+
+This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save it as a SavedModel, and then load and serve the model with the TF C++ API. The workflow is illustrated in the diagram below:
+
+![](TF-TRT_CPP_inference_saved_model.png)
+
+
+This example builds upon Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), adding the TF-TRT conversion step and adapting the C++ code to load and run inference with the TF-TRT model.
+
+## Docker environment
+Docker images provide a convenient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 Docker container, which comes with a TensorFlow 2.x build. The tools required for building this example, such as Bazel and the NVIDIA CUDA, cuDNN, and NCCL libraries, are all readily set up.
+
+To replicate the steps below, start by pulling the NGC TF container:
+
+```
+docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+Then start the container with nvidia-docker:
+
+```
+nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+
+You will land at `/workspace` within the docker container. Clone the TF-TRT example repository with:
+
+```
+git clone https://github.com/tensorflow/tensorrt
+cd tensorrt
+```
+
+Then copy the content of this C++ example directory to the TensorFlow example source directory:
+
+```
+cp -r ./tftrt/examples-cpp/image_classification/ /opt/tensorflow/tensorflow-source/tensorflow/examples/
+cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model
+```
+
+
+## Convert to TF-TRT Model
+
+Start Jupyter lab with:
+
+```
+jupyter lab -ip 0.0.0.0
+```
+
+A Jupyter notebook for downloading the Keras ResNet-50 model and performing the TF-TRT conversion is provided in `tftrt-conversion.ipynb` for your experimentation. By default, this notebook will produce a TF-TRT FP32 saved model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen`.
+
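+For orientation, the notebook's conversion and freezing step boils down to the following minimal sketch, mirroring the notebook's cells (the input `resnet50_saved_model` is exported from Keras earlier in the notebook):
+
+```python
+import tensorflow as tf
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+from tensorflow.python.framework import convert_to_constants
+from tensorflow.python.saved_model import signature_constants, tag_constants
+
+# Convert the native SavedModel to a TF-TRT FP32 model.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
+    precision_mode=trt.TrtPrecisionMode.FP32)
+converter = trt.TrtGraphConverterV2(
+    input_saved_model_dir="resnet50_saved_model", conversion_params=params)
+converter.convert()
+converter.save("resnet50_saved_model_TFTRT_FP32")
+
+# Freeze the variables and re-export, producing the self-contained "frozen"
+# SavedModel that the C++ program loads with LoadSavedModel.
+loaded = tf.saved_model.load(
+    "resnet50_saved_model_TFTRT_FP32", tags=[tag_constants.SERVING])
+func = loaded.signatures[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
+frozen_func = convert_to_constants.convert_variables_to_constants_v2(func)
+module = tf.Module()
+module.myfunc = frozen_func
+tf.saved_model.save(module, "resnet50_saved_model_TFTRT_FP32_frozen",
+                    signatures=frozen_func)
+```
+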
+As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics.
+
+
+
+
+## Build the C++ example
+The NVIDIA NGC container should have everything you need to run this example installed already.
+
+To build it, first copy the build script `tftrt-build.sh` to `/opt/tensorflow`:
+
+```
+cp tftrt-build.sh /opt/tensorflow
+```
+
+Then from `/opt/tensorflow`, run the build command:
+
+```bash
+cd /opt/tensorflow
+bash ./tftrt-build.sh
+```
+
+That should build a binary executable `tftrt_label_image` that you can then run like this:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/saved-model/tftrt_label_image \
+--export_dir=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen \
+--image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/img0.JPG
+```
+
+This uses the default image `img0.JPG`, which was downloaded as part of the conversion notebook, and should
+output something similar to this:
+
+```
+2022-04-29 04:19:28.397102: I tensorflow/examples/image_classification/saved-model/main.cc:331] malamute (250): 0.575497
+2022-04-29 04:19:28.397126: I tensorflow/examples/image_classification/saved-model/main.cc:331] Saint Bernard (248): 0.399284
+2022-04-29 04:19:28.397134: I tensorflow/examples/image_classification/saved-model/main.cc:331] Eskimo dog (249): 0.0228338
+2022-04-29 04:19:28.397141: I tensorflow/examples/image_classification/saved-model/main.cc:331] Ibizan hound (174): 0.00127912
+2022-04-29 04:19:28.397147: I tensorflow/examples/image_classification/saved-model/main.cc:331] Mexican hairless (269): 0.000520922
+```
+
+The program will also run a benchmark and print the measured throughput. Observe the improved throughput offered by moving from Python to C++ serving.
+
+Next, try it out on your own images by supplying the `--image=` argument, e.g.:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/saved-model/tftrt_label_image \
+--export_dir=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen \
+--image=my_image.png
+```
+
+## What's next
+
+Try building TF-TRT FP16 and INT8 models, test them on your own data, and serve them with C++.
+
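+For INT8, the converter additionally needs calibration data to compute dynamic ranges. Below is a minimal sketch with a hypothetical random-data `calibration_input_fn`; in practice you would yield real preprocessed images:
+
+```python
+import numpy as np
+import tensorflow as tf
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+
+def calibration_input_fn():
+    # Hypothetical calibration feed: yield a few representative input batches.
+    for _ in range(8):
+        yield (tf.constant(np.random.rand(1, 224, 224, 3).astype(np.float32)),)
+
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
+    precision_mode=trt.TrtPrecisionMode.INT8, use_calibration=True)
+converter = trt.TrtGraphConverterV2(
+    input_saved_model_dir="resnet50_saved_model", conversion_params=params)
+converter.convert(calibration_input_fn=calibration_input_fn)
+converter.save("resnet50_saved_model_TFTRT_INT8")
+```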
diff --git a/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png b/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png
new file mode 100755
index 000000000..153881ad3
Binary files /dev/null and b/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png differ
diff --git a/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh b/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh
new file mode 100755
index 000000000..38477247c
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh
@@ -0,0 +1,37 @@
+#!/bin/bash
+# Build the C++ TFTRT Example
+
+# Copyright 2019 NVIDIA Corporation. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+
+set -e
+if [[ ! -f /opt/tensorflow/nvbuild.sh || ! -f /opt/tensorflow/nvbuildopts ]]; then
+ echo "This TF-TRT example is intended to be executed in the NGC TensorFlow container environment. Get one with, e.g.: docker pull nvcr.io/nvidia/tensorflow:19.10-py3"
+ exit 1
+fi
+
+# TODO: programmatically determine the Python and TF API versions
+PYVER=3.8 #TODO get this by parsing `python --version`
+TFAPI=2 #TODO get this by parsing tf.__version__
+
+/opt/tensorflow/nvbuild.sh --configonly --python$PYVER --v$TFAPI
+
+BUILD_OPTS="$(cat /opt/tensorflow/nvbuildopts)"
+if [[ "$TFAPI" == "2" ]]; then
+ BUILD_OPTS="--config=v2 $BUILD_OPTS"
+fi
+
+cd /opt/tensorflow/tensorflow-source
+bazel build $BUILD_OPTS tensorflow/examples/image_classification/...
diff --git a/tftrt/examples-cpp/image_classification/saved-model/main.cc b/tftrt/examples-cpp/image_classification/saved-model/main.cc
new file mode 100755
index 000000000..a21e2f330
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/main.cc
@@ -0,0 +1,464 @@
+/* Copyright 2021 NVIDIA Corporation. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+/* Copyright 2015 The TensorFlow Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+// A minimal but useful C++ example showing how to load a TF-TRT ResNet-50 model,
+// prepare input images for it, run them through the graph, and interpret the results.
+//
+// It's designed to have as few dependencies and be as clear as possible, so
+// it's more verbose than it could be in production code. In particular, using
+// auto for the types of a lot of the returned values from TensorFlow calls can
+// remove a lot of boilerplate, but I find the explicit types useful in sample
+// code to make it simple to look up the classes involved.
+//
+// To use it, compile and then run it in a working directory with the
+// data/ folder below it, and you should see the top five labels for the
+// default example image printed. You can then customize it to use your own
+// models or images by changing the file names at the top of the main()
+// function.
+//
+// Note that, for GIF inputs, to reuse existing code, only single-frame ones
+// are supported.
+
+#include <algorithm>
+#include <cstdio>
+#include <fstream>
+#include <iostream>
+#include <memory>
+#include <string>
+#include <utility>
+#include <vector>
+
+#include "tensorflow/cc/ops/const_op.h"
+#include "tensorflow/cc/ops/array_ops.h"
+#include "tensorflow/cc/ops/image_ops.h"
+#include "tensorflow/cc/ops/standard_ops.h"
+#include "tensorflow/core/framework/graph.pb.h"
+#include "tensorflow/core/framework/tensor.h"
+#include "tensorflow/core/graph/default_device.h"
+#include "tensorflow/core/graph/graph_def_builder.h"
+#include "tensorflow/core/lib/core/errors.h"
+#include "tensorflow/core/lib/core/stringpiece.h"
+#include "tensorflow/core/lib/core/threadpool.h"
+#include "tensorflow/core/lib/io/path.h"
+#include "tensorflow/core/lib/strings/str_util.h"
+#include "tensorflow/core/lib/strings/stringprintf.h"
+#include "tensorflow/core/platform/env.h"
+#include "tensorflow/core/platform/init_main.h"
+#include "tensorflow/core/platform/logging.h"
+#include "tensorflow/core/platform/types.h"
+#include "tensorflow/core/public/session.h"
+#include "tensorflow/core/util/command_line_flags.h"
+#include "tensorflow/cc/saved_model/loader.h"
+#include "tensorflow/core/framework/tensor.pb.h"
+#include "tensorflow/core/lib/core/status.h"
+#include "tensorflow/core/platform/init_main.h"
+
+#include "absl/strings/string_view.h"
+
+// These are all common classes it's handy to reference with no namespace.
+using tensorflow::Flag;
+using tensorflow::int32;
+using tensorflow::Status;
+using tensorflow::string;
+using tensorflow::Tensor;
+using tensorflow::tstring;
+
+// Returns the names of the nodes listed in the signature definition.
+std::vector<std::string>
+GetNodeNames(const google::protobuf::Map<std::string, tensorflow::TensorInfo>
+ &signature) {
+ std::vector<std::string> names;
+ for (auto const &item : signature) {
+ absl::string_view name = item.second.name();
+ // Remove tensor suffix like ":0".
+ size_t last_colon = name.find_last_of(':');
+ if (last_colon != absl::string_view::npos) {
+ name.remove_suffix(name.size() - last_colon);
+ }
+ names.push_back(std::string(name));
+ }
+ return names;
+}
+
+// Loads a SavedModel from export_dir into the SavedModelBundle.
+tensorflow::Status LoadModel(const std::string &export_dir,
+ tensorflow::SavedModelBundle *bundle,
+ std::vector<std::string> *input_names,
+ std::vector<std::string> *output_names) {
+
+ tensorflow::RunOptions run_options;
+ TF_RETURN_IF_ERROR(tensorflow::LoadSavedModel(tensorflow::SessionOptions(),
+ run_options, export_dir,
+ {"serve"}, bundle));
+
+ // Print the signature defs.
+ auto signature_map = bundle->GetSignatures();
+ for (const auto &name_and_signature_def : signature_map) {
+ const auto &name = name_and_signature_def.first;
+ const auto &signature_def = name_and_signature_def.second;
+ std::cerr << "Name: " << name << std::endl;
+ std::cerr << "SignatureDef: " << signature_def.DebugString() << std::endl;
+ }
+
+ // Extract input and output tensor names from the signature def.
+ const tensorflow::SignatureDef &signature = signature_map["serving_default"];
+ *input_names = GetNodeNames(signature.inputs());
+ *output_names = GetNodeNames(signature.outputs());
+
+ return tensorflow::Status::OK();
+}
+
+// Takes a file name, and loads a list of labels from it, one per line, and
+// returns a vector of the strings. It pads with empty strings so the length
+// of the result is a multiple of 16, because our model expects that.
+Status ReadLabelsFile(const string& file_name, std::vector<string>* result,
+ size_t* found_label_count) {
+ std::ifstream file(file_name);
+ if (!file) {
+ return tensorflow::errors::NotFound("Labels file ", file_name,
+ " not found.");
+ }
+ result->clear();
+ string line;
+ while (std::getline(file, line)) {
+ result->push_back(line);
+ }
+ *found_label_count = result->size();
+ const int padding = 16;
+ while (result->size() % padding) {
+ result->emplace_back();
+ }
+ return Status::OK();
+}
+
+static Status ReadEntireFile(tensorflow::Env* env, const string& filename,
+ Tensor* output) {
+ tensorflow::uint64 file_size = 0;
+ TF_RETURN_IF_ERROR(env->GetFileSize(filename, &file_size));
+
+ string contents;
+ contents.resize(file_size);
+
+ std::unique_ptr<tensorflow::RandomAccessFile> file;
+ TF_RETURN_IF_ERROR(env->NewRandomAccessFile(filename, &file));
+
+ tensorflow::StringPiece data;
+ TF_RETURN_IF_ERROR(file->Read(0, file_size, &data, &(contents)[0]));
+ if (data.size() != file_size) {
+ return tensorflow::errors::DataLoss("Truncated read of '", filename,
+ "' expected ", file_size, " got ",
+ data.size());
+ }
+ output->scalar<tstring>()() = tstring(data);
+ return Status::OK();
+}
+
+// Given an image file name, read in the data, try to decode it as an image,
+// resize it to the requested size, and then scale the values as desired.
+Status ReadTensorFromImageFile(const string& file_name, const int input_height,
+ const int input_width, const float input_mean,
+ const float input_std,
+ std::vector<Tensor>* out_tensors) {
+ auto root = tensorflow::Scope::NewRootScope();
+ using namespace ::tensorflow::ops; // NOLINT(build/namespaces)
+
+ string input_name = "file_reader";
+ string output_name = "normalized";
+
+ // read file_name into a tensor named input
+ Tensor input(tensorflow::DT_STRING, tensorflow::TensorShape());
+ TF_RETURN_IF_ERROR(
+ ReadEntireFile(tensorflow::Env::Default(), file_name, &input));
+
+ // use a placeholder to read input data
+ auto file_reader =
+ Placeholder(root.WithOpName("input"), tensorflow::DataType::DT_STRING);
+
+ std::vector<std::pair<string, tensorflow::Tensor>> inputs = {
+ {"input", input},
+ };
+
+ // Now try to figure out what kind of file it is and decode it.
+ const int wanted_channels = 3;
+ tensorflow::Output image_reader;
+ if (tensorflow::str_util::EndsWith(file_name, ".png")) {
+ image_reader = DecodePng(root.WithOpName("png_reader"), file_reader,
+ DecodePng::Channels(wanted_channels));
+ } else if (tensorflow::str_util::EndsWith(file_name, ".gif")) {
+ // gif decoder returns 4-D tensor, remove the first dim
+ image_reader =
+ Squeeze(root.WithOpName("squeeze_first_dim"),
+ DecodeGif(root.WithOpName("gif_reader"), file_reader));
+ } else if (tensorflow::str_util::EndsWith(file_name, ".bmp")) {
+ image_reader = DecodeBmp(root.WithOpName("bmp_reader"), file_reader);
+ } else {
+ // Assume if it's neither a PNG nor a GIF then it must be a JPEG.
+ image_reader = DecodeJpeg(root.WithOpName("jpeg_reader"), file_reader,
+ DecodeJpeg::Channels(wanted_channels));
+ }
+ // Now cast the image data to float so we can do normal math on it.
+ auto float_caster =
+ Cast(root.WithOpName("float_caster"), image_reader, tensorflow::DT_FLOAT);
+ // The convention for image ops in TensorFlow is that all images are expected
+ // to be in batches, so that they're four-dimensional arrays with indices of
+ // [batch, height, width, channel]. Because we only have a single image, we
+ // have to add a batch dimension of 1 to the start with ExpandDims().
+ auto dims_expander = ExpandDims(root, float_caster, 0);
+ // Bilinearly resize the image to fit the required dimensions.
+ auto resized = ResizeBilinear(
+ root, dims_expander,
+ Const(root.WithOpName("size"), {input_height, input_width}));
+
+ // Preprocess image in "caffe" style: https://github.com/keras-team/keras/blob/d8fcb9d4d4dad45080ecfdd575483653028f8eda/keras/applications/imagenet_utils.py#L206
+ // Convert channel from RGB -> BGR
+ auto unstack_image_node = tensorflow::ops::Unstack(root.WithOpName("unstack_image"), resized, 3, tensorflow::ops::Unstack::Attrs().Axis(3));
+ auto stacked_image_node = tensorflow::ops::Stack(root.WithOpName("stacked_image"), {unstack_image_node[2],unstack_image_node[1],unstack_image_node[0]}, tensorflow::ops::Stack::Attrs().Axis(3));
+
+
+ // Subtract the channel means (BGR order)
+ std::vector<float> vec = {103.939, 116.779, 123.68};
+ Tensor img_mean(tensorflow::DT_FLOAT, {3});
+ std::copy_n(vec.begin(), vec.size(), img_mean.flat<float>().data());
+
+ Div(root.WithOpName(output_name), Sub(root, stacked_image_node, img_mean),
+ {input_std});
+
+ // This runs the GraphDef network definition that we've just constructed, and
+ // returns the results in the output tensor.
+ tensorflow::GraphDef graph;
+ TF_RETURN_IF_ERROR(root.ToGraphDef(&graph));
+
+ std::unique_ptr<tensorflow::Session> session(
+ tensorflow::NewSession(tensorflow::SessionOptions()));
+ TF_RETURN_IF_ERROR(session->Create(graph));
+ TF_RETURN_IF_ERROR(session->Run({inputs}, {output_name}, {}, out_tensors));
+ return Status::OK();
+}
+
+// Reads a model graph definition from disk, and creates a session object you
+// can use to run it.
+Status LoadGraph(const string& graph_file_name,
+ std::unique_ptr<tensorflow::Session>* session) {
+ tensorflow::GraphDef graph_def;
+ Status load_graph_status =
+ ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
+ if (!load_graph_status.ok()) {
+ return tensorflow::errors::NotFound("Failed to load compute graph at '",
+ graph_file_name, "'");
+ }
+ session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
+ Status session_create_status = (*session)->Create(graph_def);
+ if (!session_create_status.ok()) {
+ return session_create_status;
+ }
+ return Status::OK();
+}
+
+// Analyzes the output of the Inception graph to retrieve the highest scores and
+// their positions in the tensor, which correspond to categories.
+Status GetTopLabels(const std::vector<Tensor>& outputs, int how_many_labels,
+ Tensor* indices, Tensor* scores) {
+ auto root = tensorflow::Scope::NewRootScope();
+ using namespace ::tensorflow::ops; // NOLINT(build/namespaces)
+
+ string output_name = "top_k";
+ TopK(root.WithOpName(output_name), outputs[0], how_many_labels);
+ // This runs the GraphDef network definition that we've just constructed, and
+ // returns the results in the output tensors.
+ tensorflow::GraphDef graph;
+ TF_RETURN_IF_ERROR(root.ToGraphDef(&graph));
+
+ std::unique_ptr<tensorflow::Session> session(
+ tensorflow::NewSession(tensorflow::SessionOptions()));
+ TF_RETURN_IF_ERROR(session->Create(graph));
+ // The TopK node returns two outputs, the scores and their original indices,
+ // so we have to append :0 and :1 to specify them both.
+ std::vector<Tensor> out_tensors;
+ TF_RETURN_IF_ERROR(session->Run({}, {output_name + ":0", output_name + ":1"},
+ {}, &out_tensors));
+ *scores = out_tensors[0];
+ *indices = out_tensors[1];
+ return Status::OK();
+}
+
+// Given the output of a model run, and the name of a file containing the labels
+// this prints out the top five highest-scoring values.
+Status PrintTopLabels(const std::vector<Tensor>& outputs,
+ const string& labels_file_name) {
+ std::vector<string> labels;
+ size_t label_count;
+ Status read_labels_status =
+ ReadLabelsFile(labels_file_name, &labels, &label_count);
+ if (!read_labels_status.ok()) {
+ LOG(ERROR) << read_labels_status;
+ return read_labels_status;
+ }
+ const int how_many_labels = std::min(5, static_cast<int>(label_count));
+ Tensor indices;
+ Tensor scores;
+ TF_RETURN_IF_ERROR(GetTopLabels(outputs, how_many_labels, &indices, &scores));
+ tensorflow::TTypes<float>::Flat scores_flat = scores.flat<float>();
+ tensorflow::TTypes<int32>::Flat indices_flat = indices.flat<int32>();
+ for (int pos = 0; pos < how_many_labels; ++pos) {
+ const int label_index = indices_flat(pos);
+ const float score = scores_flat(pos);
+ LOG(INFO) << labels[label_index] << " (" << label_index << "): " << score;
+ }
+ return Status::OK();
+}
+
+// This is a testing function that returns whether the top label index is the
+// one that's expected.
+Status CheckTopLabel(const std::vector<Tensor>& outputs, int expected,
+ bool* is_expected) {
+ *is_expected = false;
+ Tensor indices;
+ Tensor scores;
+ const int how_many_labels = 1;
+ TF_RETURN_IF_ERROR(GetTopLabels(outputs, how_many_labels, &indices, &scores));
+ tensorflow::TTypes<int32>::Flat indices_flat = indices.flat<int32>();
+ if (indices_flat(0) != expected) {
+ LOG(ERROR) << "Expected label #" << expected << " but got #"
+ << indices_flat(0);
+ *is_expected = false;
+ } else {
+ *is_expected = true;
+ }
+ return Status::OK();
+}
+
+int main(int argc, char* argv[]) {
+ // These are the command-line flags the program can understand.
+ // They define where the graph and input data is located, and what kind of
+ // input the model expects.
+ string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/img0.JPG";
+ string export_dir =
+ "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen";
+ string labels =
+ "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/imagenet_slim_labels.txt";
+ int32_t input_width = 224;
+ int32_t input_height = 224;
+ float input_mean = 127;
+ float input_std = 1;
+ bool self_test = false;
+ string root_dir = "";
+
+ std::vector flag_list = {
+ Flag("image", &image, "image to be processed"),
+ Flag("export_dir", &export_dir, "frozen TF-TRT saved model to be executed"),
+ Flag("labels", &labels, "name of file containing labels"),
+ Flag("input_width", &input_width, "resize image to this width in pixels"),
+ Flag("input_height", &input_height,
+ "resize image to this height in pixels"),
+ Flag("input_mean", &input_mean, "scale pixel values to this mean"),
+ Flag("input_std", &input_std, "scale pixel values to this std deviation"),
+ Flag("self_test", &self_test, "run a self test"),
+ Flag("root_dir", &root_dir,
+ "interpret image and graph file names relative to this directory"),
+ };
+ string usage = tensorflow::Flags::Usage(argv[0], flag_list);
+ const bool parse_result = tensorflow::Flags::Parse(&argc, argv, flag_list);
+ if (!parse_result) {
+ LOG(ERROR) << usage;
+ return -1;
+ }
+
+ // We need to call this to set up global state for TensorFlow.
+ tensorflow::port::InitMain(argv[0], &argc, &argv);
+ if (argc > 1) {
+ LOG(ERROR) << "Unknown argument " << argv[1] << "\n" << usage;
+ return -1;
+ }
+
+ tensorflow::SavedModelBundle bundle;
+ std::vector<std::string> input_names;
+ std::vector<std::string> output_names;
+
+ // Load the saved model from the provided path.
+ Status load_graph_status = LoadModel(export_dir, &bundle, &input_names, &output_names);
+ if (!load_graph_status.ok()) {
+ LOG(ERROR) << load_graph_status;
+ return -1;
+ }
+
+ auto sig_map = bundle.GetSignatures();
+ auto model_def = sig_map.at("serving_default");
+
+ printf("Model Signature");
+ for (auto const& p : sig_map) {
+ printf("key: %s", p.first.c_str());
+ }
+
+ printf("Model Input Nodes");
+ for (auto const& p : model_def.inputs()) {
+ printf("key: %s value: %s", p.first.c_str(), p.second.name().c_str());
+ }
+
+ printf("Model Output Nodes");
+ for (auto const& p : model_def.outputs()) {
+ printf("key: %s value: %s", p.first.c_str(), p.second.name().c_str());
+ }
+
+ auto input_name = model_def.inputs().at("input_2").name();
+ auto output_name = model_def.outputs().at("output_0").name();
+
+ // Get the image from disk as a float array of numbers, resized and normalized
+ // to the specifications the main graph expects.
+ std::vector<Tensor> resized_tensors;
+ string image_path = tensorflow::io::JoinPath(root_dir, image);
+ Status read_tensor_status =
+ ReadTensorFromImageFile(image_path, input_height, input_width, input_mean,
+ input_std, &resized_tensors);
+ if (!read_tensor_status.ok()) {
+ LOG(ERROR) << read_tensor_status;
+ return -1;
+ }
+ const Tensor& resized_tensor = resized_tensors[0];
+
+ // Actually run the image through the model.
+ std::vector<Tensor> outputs;
+
+ // fill the input tensors with data
+ tensorflow::Status status;
+ status = bundle.session->Run({ {input_name, resized_tensor}},
+ {output_name}, {}, &outputs);
+ if (!status.ok()) {
+ std::cerr << "Inference failed: " << status;
+ return -1;
+ }
+
+ // Do something interesting with the results we've generated.
+ Status print_status = PrintTopLabels(outputs, labels);
+ if (!print_status.ok()) {
+ LOG(ERROR) << "Running print failed: " << print_status;
+ return -1;
+ }
+
+ return 0;
+}
diff --git a/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh b/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh
new file mode 100755
index 000000000..2d4604aa3
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh
@@ -0,0 +1,13 @@
+# TODO: programmatically determine the Python and TF API versions
+PYVER=3.8 #TODO get this by parsing `python --version`
+TFAPI=2 #TODO get this by parsing tf.__version__
+
+/opt/tensorflow/nvbuild.sh --configonly --python$PYVER --v$TFAPI
+
+BUILD_OPTS="$(cat /opt/tensorflow/nvbuildopts)"
+if [[ "$TFAPI" == "2" ]]; then
+ BUILD_OPTS="--config=v2 $BUILD_OPTS"
+fi
+
+cd tensorflow-source
+bazel build $BUILD_OPTS tensorflow/examples/image_classification/...
diff --git a/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb b/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb
new file mode 100755
index 000000000..4c3eb5f28
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb
@@ -0,0 +1,699 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "dR1W9kv7IPhE"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2021 NVIDIA Corporation. All Rights Reserved.\n",
+ "\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "\n",
+ "# http://www.apache.org/licenses/LICENSE-2.0\n",
+ "\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License.\n",
+ "# =============================================================================="
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "Yb3TdMZAkVNq"
+ },
+ "source": [
+ "
\n",
+ "\n",
+ "# TensorFlow C++ Inference with TF-TRT Models\n",
+ "\n",
+ "\n",
+ "## Introduction\n",
+ "In this notebook, we will download a pretrained Keras ResNet-50 model, optimize it with TF-TRT, convert it to a frozen graph, then load and do inference with the TensorFlow C++ API.\n",
+ "\n",
+ "First, we download the image net labels."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!mkdir data\n",
+ "!curl -L \"https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz\" | tar -C ./data -xz\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "8Fg4x4aomCY4",
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "!nvidia-smi"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "LG4IBNn-2PWY"
+ },
+ "source": [
+ "### Install Dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "!pip install pillow matplotlib"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ },
+ "colab_type": "code",
+ "id": "v0mfnfqg3ned",
+ "outputId": "11c043a0-b8e5-49e2-f907-5f1372c92a68",
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "import tensorflow as tf\n",
+ "print(\"Tensorflow version: \", tf.version.VERSION)\n",
+ "\n",
+ "# check TensorRT version\n",
+ "print(\"TensorRT version: \")\n",
+ "!dpkg -l | grep nvinfer"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "9U8b2394CZRu"
+ },
+ "source": [
+ "An available TensorRT installation looks like:\n",
+ "\n",
+ "```\n",
+ "TensorRT version: \n",
+ "ii libnvinfer8 8.2.2-1+cuda11.4 amd64 TensorRT runtime libraries\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "nWYufTjPCMgW"
+ },
+ "source": [
+ "### Importing required libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "Yyzwxjlm37jx"
+ },
+ "outputs": [],
+ "source": [
+ "from __future__ import absolute_import, division, print_function, unicode_literals\n",
+ "import os\n",
+ "import time\n",
+ "\n",
+ "import numpy as np\n",
+ "import matplotlib.pyplot as plt\n",
+ "\n",
+ "import tensorflow as tf\n",
+ "from tensorflow import keras\n",
+ "from tensorflow.python.compiler.tensorrt import trt_convert as trt\n",
+ "from tensorflow.python.saved_model import tag_constants\n",
+ "from tensorflow.keras.applications.resnet50 import ResNet50\n",
+ "from tensorflow.keras.preprocessing import image\n",
+ "from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "v-R2iN4akVOi"
+ },
+ "source": [
+ "## Data\n",
+ "We download several random images for testing from the Internet."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "tVJ2-8rokVOl",
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "!mkdir ./data\n",
+ "!wget -O ./data/img0.JPG \"https://d17fnq9dkz9hgj.cloudfront.net/breed-uploads/2018/08/siberian-husky-detail.jpg?bust=1535566590&width=630\"\n",
+ "!wget -O ./data/img1.JPG \"https://www.hakaimagazine.com/wp-content/uploads/header-gulf-birds.jpg\"\n",
+ "!wget -O ./data/img2.JPG \"https://www.artis.nl/media/filer_public_thumbnails/filer_public/00/f1/00f1b6db-fbed-4fef-9ab0-84e944ff11f8/chimpansee_amber_r_1920x1080.jpg__1920x1080_q85_subject_location-923%2C365_subsampling-2.jpg\"\n",
+ "!wget -O ./data/img3.JPG \"https://www.familyhandyman.com/wp-content/uploads/2018/09/How-to-Avoid-Snakes-Slithering-Up-Your-Toilet-shutterstock_780480850.jpg\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 269
+ },
+ "colab_type": "code",
+ "id": "F_9n-AR1kVOv",
+ "outputId": "e0ead6dc-e761-404e-a030-f6d3057a57da"
+ },
+ "outputs": [],
+ "source": [
+ "from tensorflow.keras.preprocessing import image\n",
+ "\n",
+ "fig, axes = plt.subplots(nrows=2, ncols=2)\n",
+ "\n",
+ "for i in range(4):\n",
+ " img_path = './data/img%d.JPG'%i\n",
+ " img = image.load_img(img_path, target_size=(224, 224), interpolation='bilinear')\n",
+ " plt.subplot(2,2,i+1)\n",
+ " plt.imshow(img);\n",
+ " plt.axis('off');"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "xeV4r2YTkVO1"
+ },
+ "source": [
+ "## Model\n",
+ "\n",
+ "We next download and test a ResNet-50 pre-trained model from the Keras model zoo."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 73
+ },
+ "colab_type": "code",
+ "id": "WwRBOikEkVO3",
+ "outputId": "2d63bc46-8bac-492f-b519-9ae5f19176bc"
+ },
+ "outputs": [],
+ "source": [
+ "model = ResNet50(weights='imagenet')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 410
+ },
+ "colab_type": "code",
+ "id": "lFKQPoLO_ikd",
+ "outputId": "c0b93de8-c94b-4977-992e-c780e12a3d52"
+ },
+ "outputs": [],
+ "source": [
+ "for i in range(4):\n",
+ " img_path = './data/img%d.JPG'%i\n",
+ " img = image.load_img(img_path, target_size=(224, 224),interpolation='bilinear')\n",
+ " x = image.img_to_array(img)\n",
+ " x = np.expand_dims(x, axis=0)\n",
+ " x = preprocess_input(x)\n",
+ "\n",
+ " preds = model.predict(x)\n",
+ " # decode the results into a list of tuples (class, description, probability)\n",
+ " # (one such list for each sample in the batch)\n",
+ " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n",
+ "\n",
+ " plt.subplot(2,2,i+1)\n",
+ " plt.imshow(img);\n",
+ " plt.axis('off');\n",
+ " plt.title(decode_predictions(preds, top=3)[0][0][1])\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "XrL3FEcdkVPA"
+ },
+ "source": [
+ "TF-TRT takes input as a TensorFlow saved model, therefore, we re-export the Keras model as a TF saved model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 110
+ },
+ "colab_type": "code",
+ "id": "WxlUF3rlkVPH",
+ "outputId": "9f3864e7-f211-4c06-d2d2-585c1a477e34"
+ },
+ "outputs": [],
+ "source": [
+ "# Save the entire model as a SavedModel.\n",
+ "model.save('resnet50_saved_model') "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 453
+ },
+ "colab_type": "code",
+ "id": "RBu2RKs6kVPP",
+ "outputId": "8e063261-7efb-47fd-fa6c-1bb5076d418c"
+ },
+ "outputs": [],
+ "source": [
+ "!saved_model_cli show --all --dir resnet50_saved_model"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "qBQwBvlNm-J8"
+ },
+ "source": [
+ "### Inference with native TF2.0 saved model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "8zLN0GMCkVPe"
+ },
+ "outputs": [],
+ "source": [
+ "model = tf.keras.models.load_model('resnet50_saved_model')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 219
+ },
+ "colab_type": "code",
+ "id": "Fbj-UEOxkVPs",
+ "outputId": "3a2b34f9-8034-48cb-b3fe-477f09966025"
+ },
+ "outputs": [],
+ "source": [
+ "img_path = './data/img0.JPG' # Siberian_husky\n",
+ "img = image.load_img(img_path, target_size=(224, 224))\n",
+ "x = image.img_to_array(img)\n",
+ "x = np.expand_dims(x, axis=0)\n",
+ "x = preprocess_input(x)\n",
+ "\n",
+ "preds = model.predict(x)\n",
+ "# decode the results into a list of tuples (class, description, probability)\n",
+ "# (one such list for each sample in the batch)\n",
+ "print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n",
+ "plt.subplot(2,2,1)\n",
+ "plt.imshow(img);\n",
+ "plt.axis('off');\n",
+ "plt.title(decode_predictions(preds, top=3)[0][0][1])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ },
+ "colab_type": "code",
+ "id": "CGc-dC6DvwRP",
+ "outputId": "e0a22e05-f4fe-47b6-93e8-2b806bf7098a"
+ },
+ "outputs": [],
+ "source": [
+ "batch_size = 1\n",
+ "batched_input = np.zeros((batch_size, 224, 224, 3), dtype=np.float32)\n",
+ "\n",
+ "for i in range(batch_size):\n",
+ " img_path = './data/img%d.JPG' % (i % 4)\n",
+ " img = image.load_img(img_path, target_size=(224, 224))\n",
+ " x = image.img_to_array(img)\n",
+ " x = np.expand_dims(x, axis=0)\n",
+ " x = preprocess_input(x)\n",
+ " batched_input[i, :] = x\n",
+ "batched_input = tf.constant(batched_input)\n",
+ "print('batched_input shape: ', batched_input.shape)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "rFBV6hQR7N3z"
+ },
+ "outputs": [],
+ "source": [
+ "# Benchmarking throughput\n",
+ "N_warmup_run = 50\n",
+ "N_run = 1000\n",
+ "elapsed_time = []\n",
+ "\n",
+ "for i in range(N_warmup_run):\n",
+ " preds = model.predict(batched_input)\n",
+ "\n",
+ "for i in range(N_run):\n",
+ " start_time = time.time()\n",
+ " preds = model.predict(batched_input)\n",
+ " end_time = time.time()\n",
+ " elapsed_time = np.append(elapsed_time, end_time - start_time)\n",
+ " if i % 50 == 0:\n",
+ " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n",
+ "\n",
+ "print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "vC_RN0BAkVPy"
+ },
+ "source": [
+ "### TF-TRT FP32 model\n",
+ "\n",
+ "We next convert the TF native FP32 model to a TF-TRT FP32 model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 126
+ },
+ "colab_type": "code",
+ "id": "0eLImSJ-kVPz",
+ "outputId": "e2c353c7-8e4b-49aa-ab97-f4d82797d4d8"
+ },
+ "outputs": [],
+ "source": [
+ "print('Converting to TF-TRT FP32...')\n",
+ "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=trt.TrtPrecisionMode.FP32,\n",
+ " max_workspace_size_bytes=8000000000)\n",
+ "\n",
+ "converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',\n",
+ " conversion_params=conversion_params)\n",
+ "converter.convert()\n",
+ "converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP32')\n",
+ "print('Done Converting to TF-TRT FP32')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 453
+ },
+ "colab_type": "code",
+ "id": "dlue_3npkVQC",
+ "outputId": "4dd6a366-fe9a-43c8-aad0-dd357bba41bb"
+ },
+ "outputs": [],
+ "source": [
+ "!saved_model_cli show --all --dir resnet50_saved_model_TFTRT_FP32"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "Vd2DoGUp8ivj"
+ },
+ "source": [
+ "Next, we load and test the TF-TRT FP32 model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2\n",
+ "import tensorflow as tf\n",
+ "\n",
+ "model = tf.saved_model.load(\"resnet50_saved_model_TFTRT_FP32\", tags=[tag_constants.SERVING]).signatures['serving_default']"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "rf97K_rxvwRm"
+ },
+ "outputs": [],
+ "source": [
+ "def predict_tftrt(input_saved_model):\n",
+ " \"\"\"Runs prediction on a single image and shows the result.\n",
+ " input_saved_model (string): Name of the input model stored in the current dir\n",
+ " \"\"\"\n",
+ " img_path = './data/img0.JPG' # Siberian_husky\n",
+ " img = image.load_img(img_path, target_size=(224, 224))\n",
+ " x = image.img_to_array(img)\n",
+ " x = np.expand_dims(x, axis=0)\n",
+ " x = preprocess_input(x)\n",
+ " x = tf.constant(x)\n",
+ " \n",
+ " saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n",
+ " signature_keys = list(saved_model_loaded.signatures.keys())\n",
+ " print(signature_keys)\n",
+ "\n",
+ " infer = saved_model_loaded.signatures['serving_default']\n",
+ " print(infer.structured_outputs)\n",
+ "\n",
+ " labeling = infer(x)\n",
+ " preds = labeling['predictions'].numpy()\n",
+ " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n",
+ " plt.subplot(2,2,1)\n",
+ " plt.imshow(img);\n",
+ " plt.axis('off');\n",
+ " plt.title(decode_predictions(preds, top=3)[0][0][1])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 238
+ },
+ "colab_type": "code",
+ "id": "pRK0pRE-snvb",
+ "outputId": "1f7ab6c1-dbfa-4e3e-a21d-df9975c70455"
+ },
+ "outputs": [],
+ "source": [
+ "predict_tftrt('resnet50_saved_model_TFTRT_FP32')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "z9b5j6jMvwRt"
+ },
+ "outputs": [],
+ "source": [
+ "def benchmark_tftrt(input_saved_model):\n",
+ " saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n",
+ " infer = saved_model_loaded.signatures['serving_default']\n",
+ "\n",
+ " N_warmup_run = 50\n",
+ " N_run = 1000\n",
+ " elapsed_time = []\n",
+ "\n",
+ " for i in range(N_warmup_run):\n",
+ " labeling = infer(batched_input)\n",
+ "\n",
+ " for i in range(N_run):\n",
+ " start_time = time.time()\n",
+ " labeling = infer(batched_input)\n",
+ " end_time = time.time()\n",
+ " elapsed_time = np.append(elapsed_time, end_time - start_time)\n",
+ " if i % 50 == 0:\n",
+ " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n",
+ "\n",
+ " print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "ai6bxNcNszHc"
+ },
+ "outputs": [],
+ "source": [
+ "benchmark_tftrt('resnet50_saved_model_TFTRT_FP32')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Prepare model for C++ inference\n",
+ "\n",
+ "We can see that the TF-TRT FP32 model provide great speedup over the native Keras model. Now let's prepare this model for C++ inference. We will freeze this graph and write it as a frozen saved model to disk."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "2mM9D3BTEzQS"
+ },
+ "outputs": [],
+ "source": [
+ "from tensorflow.python.saved_model import signature_constants\n",
+ "from tensorflow.python.saved_model import tag_constants\n",
+ "from tensorflow.python.framework import convert_to_constants\n",
+ "\n",
+ "def get_func_from_saved_model(saved_model_dir):\n",
+ " saved_model_loaded = tf.saved_model.load(\n",
+ " saved_model_dir, tags=[tag_constants.SERVING])\n",
+ " graph_func = saved_model_loaded.signatures[\n",
+ " signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]\n",
+ " return graph_func, saved_model_loaded\n",
+ "\n",
+ "func, loaded_model = get_func_from_saved_model('resnet50_saved_model_TFTRT_FP32')\n",
+ "\n",
+ "# Create frozen func\n",
+ "frozen_func = convert_to_constants.convert_variables_to_constants_v2(func)\n",
+ "module = tf.Module()\n",
+ "module.myfunc = frozen_func\n",
+ "tf.saved_model.save(module,'resnet50_saved_model_TFTRT_FP32_frozen', signatures=frozen_func)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "I13snJ9VkVQh"
+ },
+ "source": [
+ "### What's next\n",
+ "Refer back to the [Readme](README.md) to load the TF-TRT frozen saved model for inference with the TensorFlow C++ API."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "include_colab_link": true,
+ "machine_shape": "hm",
+ "name": "Colab-TF20-TF-TRT-inference-from-Keras-saved-model.ipynb",
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.10"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}