diff --git a/README.md b/README.md
old mode 100644
new mode 100755
diff --git a/tftrt/examples-cpp/image_classification/README.md b/tftrt/examples-cpp/image_classification/README.md
index 6b3e8ba4f..eec06e5d5 100755
--- a/tftrt/examples-cpp/image_classification/README.md
+++ b/tftrt/examples-cpp/image_classification/README.md
@@ -3,104 +3,11 @@
# TF-TRT C++ Image Recognition Demo
-This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as a frozen graph, and then finally load and serve the model with the TF C++ API. The process can be demonstrated with the below workflow diagram:
+This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as either a frozen graph or a SavedModel, and then load and serve the model with the TF C++ API. The workflow is illustrated in the diagram below:
-![](TF-TRT_CPP_inference.png)
+![](TF-TRT_CPP_inference_overview.png)
This example is built based upon the original Google's TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), on top of which we added the TF-TRT conversion part and adapted the C++ code for loading and inferencing with the TF-TRT model.
-## Docker environment
-Docker images provide a convinient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 docker container that comes with a TensorFlow 2.x build. Tools required for building this example, such as Bazel, NVIDIA CUDA, CUDNN, NCCL libraries are all readily setup.
-
-To replecate the below steps, start by pulling the NGC TF container:
-
-```
-docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
-```
-Then start the container with nvidia-docker:
-
-```
-nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3
-```
-
-You will land at `/workspace` within the docker container. Clone the TF-TRT example repository with:
-
-```
-git clone https://github.com/tensorflow/tensorrt
-cd tensorrt
-```
-
-Then copy the content of this C++ example directory to the TensorFlow example source directory:
-
-```
-cp -r ./tftrt/examples-cpp/image_classification/ /opt/tensorflow/tensorflow-source/tensorflow/examples/
-cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification
-```
-
-
-## Convert to TF-TRT Model
-
-Start Jupyter lab with:
-
-```
-jupyter lab -ip 0.0.0.0
-```
-
-A Jupyter notebook for downloading the Keras ResNet-50 model and TF-TRT conversion is provided in `tf-trt-conversion.ipynb` for your experimentation. By default, this notebook will produce a TF-TRT FP32 model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image-classification/frozen_models_trt_fp32/frozen_models_trt_fp32.pb`.
-
-As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics.
-
-
-
-
-## Build the C++ example
-The NVIDIA NGC container should have everything you need to run this example installed already.
-
-To build it, first, you need to copy the build scripts `tftrt_build.sh` to `/opt/tensorflow`:
-
-```
-cp tftrt-build.sh /opt/tensorflow
-```
-
-Then from `/opt/tensorflow`, run the build command:
-
-```bash
-cd /opt/tensorflow
-bash ./tftrt-build.sh
-```
-
-That should build a binary executable `tftrt_label_image` that you can then run like this:
-
-```bash
-tensorflow-source/bazel-bin/tensorflow/examples/image_classification/tftrt_label_image \
---graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \
---image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/img0.JPG
-```
-
-This uses the default image `img0.JPG` which was download as part of the conversion notebook, and should
-output something similar to this:
-
-```
-2022-02-23 13:53:56.076348: I tensorflow/examples/image-classification/main.cc:276] malamute (250): 0.575496
-2022-02-23 13:53:56.076384: I tensorflow/examples/image-classification/main.cc:276] Saint Bernard (248): 0.399285
-2022-02-23 13:53:56.076412: I tensorflow/examples/image-classification/main.cc:276] Eskimo dog (249): 0.0228338
-2022-02-23 13:53:56.076423: I tensorflow/examples/image-classification/main.cc:276] Ibizan hound (174): 0.00127912
-2022-02-23 13:53:56.076449: I tensorflow/examples/image-classification/main.cc:276] Mexican hairless (269): 0.000520922
-```
-
-The program will also benchmark and output the throughput. Observe the improved throughput offered by moving from Python to C++ serving.
-
-Next, try it out on your own images by supplying the --image= argument, e.g.
-
-```bash
-tensorflow-source/bazel-bin/tensorflow/examples/label_image/tftrt_label_image --image=my_image.png
-```
-
-## What's next
-
-Try to build TF-TRT FP16 and INT8 models and test on your own data, and serve them with C++.
-
-```bash
-
-```
+See the respective sub-folders for details on each approach.
\ No newline at end of file
diff --git a/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png b/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png
new file mode 100644
index 000000000..de35058bc
Binary files /dev/null and b/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png differ
diff --git a/tftrt/examples-cpp/image_classification/BUILD b/tftrt/examples-cpp/image_classification/frozen-graph/BUILD
similarity index 100%
rename from tftrt/examples-cpp/image_classification/BUILD
rename to tftrt/examples-cpp/image_classification/frozen-graph/BUILD
diff --git a/tftrt/examples-cpp/image_classification/frozen-graph/README.md b/tftrt/examples-cpp/image_classification/frozen-graph/README.md
new file mode 100755
index 000000000..9fd1ca305
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/frozen-graph/README.md
@@ -0,0 +1,108 @@
+
+
+
+# TF-TRT C++ Image Recognition Demo
+
+This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as a frozen graph, and then load and serve the model with the TF C++ API. The workflow is illustrated in the diagram below:
+
+![](TF-TRT_CPP_inference.png)
+
+
+This example builds upon Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), adding the TF-TRT conversion step and adapting the C++ code to load and run inference with the TF-TRT model.
+
+## Docker environment
+Docker images provide a convenient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 Docker container, which comes with a TensorFlow 2.x build. The tools required for building this example, such as Bazel and the NVIDIA CUDA, cuDNN, and NCCL libraries, are all readily set up.
+
+To replicate the steps below, start by pulling the NGC TF container:
+
+```
+docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+Then start the container with nvidia-docker:
+
+```
+nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+
+You will land at `/workspace` within the docker container. Clone the TF-TRT example repository with:
+
+```
+git clone https://github.com/tensorflow/tensorrt
+cd tensorrt
+```
+
+Then copy the content of this C++ example directory to the TensorFlow example source directory:
+
+```
+cp -r ./tftrt/examples-cpp/image_classification /opt/tensorflow/tensorflow-source/tensorflow/examples/
+cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph
+```
+
+
+## Convert to TF-TRT Model
+
+Start Jupyter lab with:
+
+```
+jupyter lab -ip 0.0.0.0
+```
+
+A Jupyter notebook for downloading the Keras ResNet-50 model and performing the TF-TRT conversion is provided in `tftrt-conversion.ipynb` for your experimentation. By default, this notebook will produce a TF-TRT FP32 model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb`.
+
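+For orientation, the core of what the notebook does can be sketched in a few lines of Python. This is a minimal sketch, assuming the pretrained Keras ResNet-50 is exported to `resnet50_saved_model` and mirroring the notebook's default output paths; the notebook remains the authoritative version:
+
+```python
+import tensorflow as tf
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+from tensorflow.python.framework.convert_to_constants import (
+    convert_variables_to_constants_v2,
+)
+
+# Export the pretrained Keras model as a SavedModel (TF-TRT consumes SavedModels).
+model = tf.keras.applications.ResNet50(weights="imagenet")
+model.save("resnet50_saved_model")
+
+# Convert to a TF-TRT FP32 model.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
+    precision_mode=trt.TrtPrecisionMode.FP32)
+converter = trt.TrtGraphConverterV2(
+    input_saved_model_dir="resnet50_saved_model", conversion_params=params)
+converter.convert()
+converter.save("resnet50_saved_model_TFTRT_FP32")
+
+# Freeze the converted model and write its GraphDef as a .pb frozen graph,
+# which the C++ program loads with ReadBinaryProto.
+loaded = tf.saved_model.load("resnet50_saved_model_TFTRT_FP32")
+frozen_func = convert_variables_to_constants_v2(
+    loaded.signatures["serving_default"])
+tf.io.write_graph(frozen_func.graph.as_graph_def(), "frozen_models_trt_fp32",
+                  "frozen_models_trt_fp32.pb", as_text=False)
+```
+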
+As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics.
+
+
+
+
+## Build the C++ example
+The NVIDIA NGC container should have everything you need to run this example installed already.
+
+To build it, first copy the build script `tftrt-build.sh` to `/opt/tensorflow`:
+
+```
+cp tftrt-build.sh /opt/tensorflow
+```
+
+Then from `/opt/tensorflow`, run the build command:
+
+```bash
+cd /opt/tensorflow
+bash ./tftrt-build.sh
+```
+
+That should build a binary executable `tftrt_label_image` that you can then run like this:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/frozen-graph/tftrt_label_image \
+--graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \
+--image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/img0.JPG
+```
+
+This uses the default image `img0.JPG`, which was downloaded as part of the conversion notebook, and should
+output something similar to this:
+
+```
+2022-04-29 04:20:24.377345: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] malamute (250): 0.575496
+2022-04-29 04:20:24.377370: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Saint Bernard (248): 0.399285
+2022-04-29 04:20:24.377380: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Eskimo dog (249): 0.0228338
+2022-04-29 04:20:24.377387: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Ibizan hound (174): 0.00127912
+2022-04-29 04:20:24.377394: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Mexican hairless (269): 0.000520922
+```
+
+The program will also run a benchmark and print the measured throughput. Observe the improved throughput offered by moving from Python to C++ serving.
+
+Next, try it out on your own images by supplying the `--image=` argument, e.g.:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/frozen-graph/tftrt_label_image \
+--graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \
+--image=my_image.png
+```
+
+## What's next
+
+Try building TF-TRT FP16 and INT8 models, test them on your own data, and serve them with C++.
+
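+Changing precision is a small change to the conversion parameters. Below is a minimal FP16 sketch, assuming the same `resnet50_saved_model` input as in the conversion notebook; INT8 additionally requires a calibration input function:
+
+```python
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+
+# Identical to the FP32 conversion, except for the precision mode.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
+    precision_mode=trt.TrtPrecisionMode.FP16)
+converter = trt.TrtGraphConverterV2(
+    input_saved_model_dir="resnet50_saved_model", conversion_params=params)
+converter.convert()
+converter.save("resnet50_saved_model_TFTRT_FP16")
+```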
diff --git a/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference.png b/tftrt/examples-cpp/image_classification/frozen-graph/TF-TRT_CPP_inference.png
old mode 100644
new mode 100755
similarity index 100%
rename from tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference.png
rename to tftrt/examples-cpp/image_classification/frozen-graph/TF-TRT_CPP_inference.png
diff --git a/tftrt/examples-cpp/image_classification/main.cc b/tftrt/examples-cpp/image_classification/frozen-graph/main.cc
similarity index 98%
rename from tftrt/examples-cpp/image_classification/main.cc
rename to tftrt/examples-cpp/image_classification/frozen-graph/main.cc
index 5dc34da18..5248d143a 100755
--- a/tftrt/examples-cpp/image_classification/main.cc
+++ b/tftrt/examples-cpp/image_classification/frozen-graph/main.cc
@@ -302,11 +302,11 @@ int main(int argc, char* argv[]) {
// These are the command-line flags the program can understand.
// They define where the graph and input data is located, and what kind of
// input the model expects.
- string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/img0.JPG";
+ string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/img0.JPG";
string graph =
- "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/resnet-50.pb";
+ "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/resnet-50.pb";
string labels =
- "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/imagenet_slim_labels.txt";
+ "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/imagenet_slim_labels.txt";
int32_t input_width = 224;
int32_t input_height = 224;
float input_mean = 127;
diff --git a/tftrt/examples-cpp/image_classification/tftrt-build.sh b/tftrt/examples-cpp/image_classification/frozen-graph/tftrt-build.sh
old mode 100644
new mode 100755
similarity index 100%
rename from tftrt/examples-cpp/image_classification/tftrt-build.sh
rename to tftrt/examples-cpp/image_classification/frozen-graph/tftrt-build.sh
diff --git a/tftrt/examples-cpp/image_classification/tftrt-conversion.ipynb b/tftrt/examples-cpp/image_classification/frozen-graph/tftrt-conversion.ipynb
similarity index 100%
rename from tftrt/examples-cpp/image_classification/tftrt-conversion.ipynb
rename to tftrt/examples-cpp/image_classification/frozen-graph/tftrt-conversion.ipynb
diff --git a/tftrt/examples-cpp/image_classification/saved-model/BUILD b/tftrt/examples-cpp/image_classification/saved-model/BUILD
new file mode 100755
index 000000000..2bc49d38d
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/BUILD
@@ -0,0 +1,50 @@
+# Description:
+# TensorFlow C++ inference example with TF-TRT model.
+
+load("//tensorflow:tensorflow.bzl", "tf_cc_binary")
+
+package(
+ default_visibility = ["//tensorflow:internal"],
+ licenses = ["notice"],
+)
+
+tf_cc_binary(
+ name = "tftrt_label_image",
+ srcs = [
+ "main.cc",
+ ],
+ linkopts = select({
+ "//tensorflow:android": [
+ "-pie",
+ "-landroid",
+ "-ljnigraphics",
+ "-llog",
+ "-lm",
+ "-z defs",
+ "-s",
+ "-Wl,--exclude-libs,ALL",
+ ],
+ "//conditions:default": ["-lm"],
+ }),
+ deps = select({
+ "//tensorflow:android": [
+ # cc:cc_ops is used to include image ops (for label_image)
+ # Jpg, gif, and png related code won't be included
+ "//tensorflow/cc:cc_ops",
+ "//tensorflow/core:portable_tensorflow_lib",
+ # cc:android_tensorflow_image_op is for including jpeg/gif/png
+ # decoder to enable real-image evaluation on Android
+ "//tensorflow/core/kernels/image:android_tensorflow_image_op",
+ ],
+ "//conditions:default": [
+ "//tensorflow/cc:cc_ops",
+ "//tensorflow/cc/saved_model:loader",
+ "//tensorflow/core:core_cpu",
+ "//tensorflow/core:framework",
+ "//tensorflow/core:framework_internal",
+ "//tensorflow/core:lib",
+ "//tensorflow/core:protos_all_cc",
+ "//tensorflow/core:tensorflow",
+ ],
+ }),
+)
\ No newline at end of file
diff --git a/tftrt/examples-cpp/image_classification/saved-model/README.md b/tftrt/examples-cpp/image_classification/saved-model/README.md
new file mode 100755
index 000000000..6c2322331
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/README.md
@@ -0,0 +1,109 @@
+
+
+
+
+# TF-TRT C++ Image Recognition Demo
+
+This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save it as a SavedModel, and then load and serve the model with the TF C++ API. The workflow is illustrated in the diagram below:
+
+![](TF-TRT_CPP_inference_saved_model.png)
+
+
+This example builds upon Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), adding the TF-TRT conversion step and adapting the C++ code to load and run inference with the TF-TRT model.
+
+## Docker environment
+Docker images provide a convenient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 Docker container, which comes with a TensorFlow 2.x build. The tools required for building this example, such as Bazel and the NVIDIA CUDA, cuDNN, and NCCL libraries, are all readily set up.
+
+To replicate the steps below, start by pulling the NGC TF container:
+
+```
+docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+Then start the container with nvidia-docker:
+
+```
+nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+
+You will land at `/workspace` within the docker container. Clone the TF-TRT example repository with:
+
+```
+git clone https://github.com/tensorflow/tensorrt
+cd tensorrt
+```
+
+Then copy the content of this C++ example directory to the TensorFlow example source directory:
+
+```
+cp -r ./tftrt/examples-cpp/image_classification/ /opt/tensorflow/tensorflow-source/tensorflow/examples/
+cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model
+```
+
+
+## Convert to TF-TRT Model
+
+Start Jupyter lab with:
+
+```
+jupyter lab -ip 0.0.0.0
+```
+
+A Jupyter notebook for downloading the Keras ResNet-50 model and performing the TF-TRT conversion is provided in `tftrt-conversion.ipynb` for your experimentation. By default, this notebook will produce a TF-TRT FP32 saved model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen`.
+
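+For orientation, the notebook's conversion and freezing step boils down to the following minimal sketch, mirroring the notebook's cells (the input `resnet50_saved_model` is exported from Keras earlier in the notebook):
+
+```python
+import tensorflow as tf
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+from tensorflow.python.framework import convert_to_constants
+from tensorflow.python.saved_model import signature_constants, tag_constants
+
+# Convert the native SavedModel to a TF-TRT FP32 model.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
+    precision_mode=trt.TrtPrecisionMode.FP32)
+converter = trt.TrtGraphConverterV2(
+    input_saved_model_dir="resnet50_saved_model", conversion_params=params)
+converter.convert()
+converter.save("resnet50_saved_model_TFTRT_FP32")
+
+# Freeze the variables and re-export, producing the self-contained "frozen"
+# SavedModel that the C++ program loads with LoadSavedModel.
+loaded = tf.saved_model.load(
+    "resnet50_saved_model_TFTRT_FP32", tags=[tag_constants.SERVING])
+func = loaded.signatures[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
+frozen_func = convert_to_constants.convert_variables_to_constants_v2(func)
+module = tf.Module()
+module.myfunc = frozen_func
+tf.saved_model.save(module, "resnet50_saved_model_TFTRT_FP32_frozen",
+                    signatures=frozen_func)
+```
+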
+As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics.
+
+
+
+
+## Build the C++ example
+The NVIDIA NGC container should have everything you need to run this example installed already.
+
+To build it, first copy the build script `tftrt-build.sh` to `/opt/tensorflow`:
+
+```
+cp tftrt-build.sh /opt/tensorflow
+```
+
+Then from `/opt/tensorflow`, run the build command:
+
+```bash
+cd /opt/tensorflow
+bash ./tftrt-build.sh
+```
+
+That should build a binary executable `tftrt_label_image` that you can then run like this:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/saved-model/tftrt_label_image \
+--export_dir=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen \
+--image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/img0.JPG
+```
+
+This uses the default image `img0.JPG`, which was downloaded as part of the conversion notebook, and should
+output something similar to this:
+
+```
+2022-04-29 04:19:28.397102: I tensorflow/examples/image_classification/saved-model/main.cc:331] malamute (250): 0.575497
+2022-04-29 04:19:28.397126: I tensorflow/examples/image_classification/saved-model/main.cc:331] Saint Bernard (248): 0.399284
+2022-04-29 04:19:28.397134: I tensorflow/examples/image_classification/saved-model/main.cc:331] Eskimo dog (249): 0.0228338
+2022-04-29 04:19:28.397141: I tensorflow/examples/image_classification/saved-model/main.cc:331] Ibizan hound (174): 0.00127912
+2022-04-29 04:19:28.397147: I tensorflow/examples/image_classification/saved-model/main.cc:331] Mexican hairless (269): 0.000520922
+```
+
+The program will also run a benchmark and print the measured throughput. Observe the improved throughput offered by moving from Python to C++ serving.
+
+Next, try it out on your own images by supplying the `--image=` argument, e.g.:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/saved-model/tftrt_label_image \
+--export_dir=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen \
+--image=my_image.png
+```
+
+## What's next
+
+Try building TF-TRT FP16 and INT8 models, test them on your own data, and serve them with C++.
+
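+For INT8, the converter additionally needs calibration data to compute dynamic ranges. Below is a minimal sketch with a hypothetical random-data `calibration_input_fn`; in practice you would yield real preprocessed images:
+
+```python
+import numpy as np
+import tensorflow as tf
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+
+def calibration_input_fn():
+    # Hypothetical calibration feed: yield a few representative input batches.
+    for _ in range(8):
+        yield (tf.constant(np.random.rand(1, 224, 224, 3).astype(np.float32)),)
+
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
+    precision_mode=trt.TrtPrecisionMode.INT8, use_calibration=True)
+converter = trt.TrtGraphConverterV2(
+    input_saved_model_dir="resnet50_saved_model", conversion_params=params)
+converter.convert(calibration_input_fn=calibration_input_fn)
+converter.save("resnet50_saved_model_TFTRT_INT8")
+```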
diff --git a/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png b/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png
new file mode 100755
index 000000000..153881ad3
Binary files /dev/null and b/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png differ
diff --git a/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh b/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh
new file mode 100755
index 000000000..38477247c
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh
@@ -0,0 +1,37 @@
+#!/bin/bash
+# Build the C++ TFTRT Example
+
+# Copyright 2019 NVIDIA Corporation. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+
+set -e
+if [[ ! -f /opt/tensorflow/nvbuild.sh || ! -f /opt/tensorflow/nvbuildopts ]]; then
+ echo "This TF-TRT example is intended to be executed in the NGC TensorFlow container environment. Get one with, e.g.: docker pull nvcr.io/nvidia/tensorflow:19.10-py3"
+ exit 1
+fi
+
+# TODO: programmatically determine the Python and TF API versions
+PYVER=3.8 #TODO get this by parsing `python --version`
+TFAPI=2 #TODO get this by parsing tf.__version__
+
+/opt/tensorflow/nvbuild.sh --configonly --python$PYVER --v$TFAPI
+
+BUILD_OPTS="$(cat /opt/tensorflow/nvbuildopts)"
+if [[ "$TFAPI" == "2" ]]; then
+ BUILD_OPTS="--config=v2 $BUILD_OPTS"
+fi
+
+cd /opt/tensorflow/tensorflow-source
+bazel build $BUILD_OPTS tensorflow/examples/image_classification/...
diff --git a/tftrt/examples-cpp/image_classification/saved-model/main.cc b/tftrt/examples-cpp/image_classification/saved-model/main.cc
new file mode 100755
index 000000000..a21e2f330
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/main.cc
@@ -0,0 +1,464 @@
+/* Copyright 2021 NVIDIA Corporation. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+/* Copyright 2015 The TensorFlow Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+// A minimal but useful C++ example showing how to load a TF-TRT ResNet-50 model,
+// prepare input images for it, run them through the graph, and interpret the results.
+//
+// It's designed to have as few dependencies and be as clear as possible, so
+// it's more verbose than it could be in production code. In particular, using
+// auto for the types of a lot of the returned values from TensorFlow calls can
+// remove a lot of boilerplate, but I find the explicit types useful in sample
+// code to make it simple to look up the classes involved.
+//
+// To use it, compile and then run it in a working directory with the
+// data/ folder below it, and you should see the top five labels for the
+// default example image printed. You can then customize it to use your own
+// models or images by changing the file names at the top of the main()
+// function.
+//
+// Note that, for GIF inputs, to reuse existing code, only single-frame ones
+// are supported.
+
+#include <algorithm>
+#include <cstdio>
+#include <fstream>
+#include <iostream>
+#include <memory>
+#include <string>
+#include <utility>
+#include <vector>
+
+#include "tensorflow/cc/ops/const_op.h"
+#include "tensorflow/cc/ops/array_ops.h"
+#include "tensorflow/cc/ops/image_ops.h"
+#include "tensorflow/cc/ops/standard_ops.h"
+#include "tensorflow/core/framework/graph.pb.h"
+#include "tensorflow/core/framework/tensor.h"
+#include "tensorflow/core/graph/default_device.h"
+#include "tensorflow/core/graph/graph_def_builder.h"
+#include "tensorflow/core/lib/core/errors.h"
+#include "tensorflow/core/lib/core/stringpiece.h"
+#include "tensorflow/core/lib/core/threadpool.h"
+#include "tensorflow/core/lib/io/path.h"
+#include "tensorflow/core/lib/strings/str_util.h"
+#include "tensorflow/core/lib/strings/stringprintf.h"
+#include "tensorflow/core/platform/env.h"
+#include "tensorflow/core/platform/init_main.h"
+#include "tensorflow/core/platform/logging.h"
+#include "tensorflow/core/platform/types.h"
+#include "tensorflow/core/public/session.h"
+#include "tensorflow/core/util/command_line_flags.h"
+#include "tensorflow/cc/saved_model/loader.h"
+#include "tensorflow/core/framework/tensor.pb.h"
+#include "tensorflow/core/lib/core/status.h"
+#include "tensorflow/core/platform/init_main.h"
+
+#include "absl/strings/string_view.h"
+
+// These are all common classes it's handy to reference with no namespace.
+using tensorflow::Flag;
+using tensorflow::int32;
+using tensorflow::Status;
+using tensorflow::string;
+using tensorflow::Tensor;
+using tensorflow::tstring;
+
+// Returns the names of the nodes listed in the signature definition.
+std::vector<std::string>
+GetNodeNames(const google::protobuf::Map<std::string, tensorflow::TensorInfo>
+ &signature) {
+ std::vector<std::string> names;
+ for (auto const &item : signature) {
+ absl::string_view name = item.second.name();
+ // Remove tensor suffix like ":0".
+ size_t last_colon = name.find_last_of(':');
+ if (last_colon != absl::string_view::npos) {
+ name.remove_suffix(name.size() - last_colon);
+ }
+ names.push_back(std::string(name));
+ }
+ return names;
+}
+
+// Loads a SavedModel from export_dir into the SavedModelBundle.
+tensorflow::Status LoadModel(const std::string &export_dir,
+ tensorflow::SavedModelBundle *bundle,
+ std::vector<std::string> *input_names,
+ std::vector<std::string> *output_names) {
+
+ tensorflow::RunOptions run_options;
+ TF_RETURN_IF_ERROR(tensorflow::LoadSavedModel(tensorflow::SessionOptions(),
+ run_options, export_dir,
+ {"serve"}, bundle));
+
+ // Print the signature defs.
+ auto signature_map = bundle->GetSignatures();
+ for (const auto &name_and_signature_def : signature_map) {
+ const auto &name = name_and_signature_def.first;
+ const auto &signature_def = name_and_signature_def.second;
+ std::cerr << "Name: " << name << std::endl;
+ std::cerr << "SignatureDef: " << signature_def.DebugString() << std::endl;
+ }
+
+ // Extract input and output tensor names from the signature def.
+ const tensorflow::SignatureDef &signature = signature_map["serving_default"];
+ *input_names = GetNodeNames(signature.inputs());
+ *output_names = GetNodeNames(signature.outputs());
+
+ return tensorflow::Status::OK();
+}
+
+// Takes a file name, and loads a list of labels from it, one per line, and
+// returns a vector of the strings. It pads with empty strings so the length
+// of the result is a multiple of 16, because our model expects that.
+Status ReadLabelsFile(const string& file_name, std::vector<string>* result,
+ size_t* found_label_count) {
+ std::ifstream file(file_name);
+ if (!file) {
+ return tensorflow::errors::NotFound("Labels file ", file_name,
+ " not found.");
+ }
+ result->clear();
+ string line;
+ while (std::getline(file, line)) {
+ result->push_back(line);
+ }
+ *found_label_count = result->size();
+ const int padding = 16;
+ while (result->size() % padding) {
+ result->emplace_back();
+ }
+ return Status::OK();
+}
+
+static Status ReadEntireFile(tensorflow::Env* env, const string& filename,
+ Tensor* output) {
+ tensorflow::uint64 file_size = 0;
+ TF_RETURN_IF_ERROR(env->GetFileSize(filename, &file_size));
+
+ string contents;
+ contents.resize(file_size);
+
+ std::unique_ptr<tensorflow::RandomAccessFile> file;
+ TF_RETURN_IF_ERROR(env->NewRandomAccessFile(filename, &file));
+
+ tensorflow::StringPiece data;
+ TF_RETURN_IF_ERROR(file->Read(0, file_size, &data, &(contents)[0]));
+ if (data.size() != file_size) {
+ return tensorflow::errors::DataLoss("Truncated read of '", filename,
+ "' expected ", file_size, " got ",
+ data.size());
+ }
+ output->scalar<tstring>()() = tstring(data);
+ return Status::OK();
+}
+
+// Given an image file name, read in the data, try to decode it as an image,
+// resize it to the requested size, and then scale the values as desired.
+Status ReadTensorFromImageFile(const string& file_name, const int input_height,
+ const int input_width, const float input_mean,
+ const float input_std,
+ std::vector<Tensor>* out_tensors) {
+ auto root = tensorflow::Scope::NewRootScope();
+ using namespace ::tensorflow::ops; // NOLINT(build/namespaces)
+
+ string input_name = "file_reader";
+ string output_name = "normalized";
+
+ // read file_name into a tensor named input
+ Tensor input(tensorflow::DT_STRING, tensorflow::TensorShape());
+ TF_RETURN_IF_ERROR(
+ ReadEntireFile(tensorflow::Env::Default(), file_name, &input));
+
+ // use a placeholder to read input data
+ auto file_reader =
+ Placeholder(root.WithOpName("input"), tensorflow::DataType::DT_STRING);
+
+ std::vector<std::pair<string, tensorflow::Tensor>> inputs = {
+ {"input", input},
+ };
+
+ // Now try to figure out what kind of file it is and decode it.
+ const int wanted_channels = 3;
+ tensorflow::Output image_reader;
+ if (tensorflow::str_util::EndsWith(file_name, ".png")) {
+ image_reader = DecodePng(root.WithOpName("png_reader"), file_reader,
+ DecodePng::Channels(wanted_channels));
+ } else if (tensorflow::str_util::EndsWith(file_name, ".gif")) {
+ // gif decoder returns 4-D tensor, remove the first dim
+ image_reader =
+ Squeeze(root.WithOpName("squeeze_first_dim"),
+ DecodeGif(root.WithOpName("gif_reader"), file_reader));
+ } else if (tensorflow::str_util::EndsWith(file_name, ".bmp")) {
+ image_reader = DecodeBmp(root.WithOpName("bmp_reader"), file_reader);
+ } else {
+ // Assume if it's neither a PNG nor a GIF then it must be a JPEG.
+ image_reader = DecodeJpeg(root.WithOpName("jpeg_reader"), file_reader,
+ DecodeJpeg::Channels(wanted_channels));
+ }
+ // Now cast the image data to float so we can do normal math on it.
+ auto float_caster =
+ Cast(root.WithOpName("float_caster"), image_reader, tensorflow::DT_FLOAT);
+ // The convention for image ops in TensorFlow is that all images are expected
+ // to be in batches, so that they're four-dimensional arrays with indices of
+ // [batch, height, width, channel]. Because we only have a single image, we
+ // have to add a batch dimension of 1 to the start with ExpandDims().
+ auto dims_expander = ExpandDims(root, float_caster, 0);
+ // Bilinearly resize the image to fit the required dimensions.
+ auto resized = ResizeBilinear(
+ root, dims_expander,
+ Const(root.WithOpName("size"), {input_height, input_width}));
+
+ // Preprocess image in "caffe" style: https://github.com/keras-team/keras/blob/d8fcb9d4d4dad45080ecfdd575483653028f8eda/keras/applications/imagenet_utils.py#L206
+ // Convert channel from RGB -> BGR
+ auto unstack_image_node = tensorflow::ops::Unstack(root.WithOpName("unstack_image"), resized, 3, tensorflow::ops::Unstack::Attrs().Axis(3));
+ auto stacked_image_node = tensorflow::ops::Stack(root.WithOpName("stacked_image"), {unstack_image_node[2],unstack_image_node[1],unstack_image_node[0]}, tensorflow::ops::Stack::Attrs().Axis(3));
+
+
+ // Subtract the channel means (BGR order)
+ std::vector<float> vec = {103.939, 116.779, 123.68};
+ Tensor img_mean(tensorflow::DT_FLOAT, {3});
+ std::copy_n(vec.begin(), vec.size(), img_mean.flat<float>().data());
+
+ Div(root.WithOpName(output_name), Sub(root, stacked_image_node, img_mean),
+ {input_std});
+
+ // This runs the GraphDef network definition that we've just constructed, and
+ // returns the results in the output tensor.
+ tensorflow::GraphDef graph;
+ TF_RETURN_IF_ERROR(root.ToGraphDef(&graph));
+
+ std::unique_ptr<tensorflow::Session> session(
+ tensorflow::NewSession(tensorflow::SessionOptions()));
+ TF_RETURN_IF_ERROR(session->Create(graph));
+ TF_RETURN_IF_ERROR(session->Run({inputs}, {output_name}, {}, out_tensors));
+ return Status::OK();
+}
+
+// Reads a model graph definition from disk, and creates a session object you
+// can use to run it.
+Status LoadGraph(const string& graph_file_name,
+ std::unique_ptr<tensorflow::Session>* session) {
+ tensorflow::GraphDef graph_def;
+ Status load_graph_status =
+ ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
+ if (!load_graph_status.ok()) {
+ return tensorflow::errors::NotFound("Failed to load compute graph at '",
+ graph_file_name, "'");
+ }
+ session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
+ Status session_create_status = (*session)->Create(graph_def);
+ if (!session_create_status.ok()) {
+ return session_create_status;
+ }
+ return Status::OK();
+}
+
+// Analyzes the output of the Inception graph to retrieve the highest scores and
+// their positions in the tensor, which correspond to categories.
+Status GetTopLabels(const std::vector<Tensor>& outputs, int how_many_labels,
+ Tensor* indices, Tensor* scores) {
+ auto root = tensorflow::Scope::NewRootScope();
+ using namespace ::tensorflow::ops; // NOLINT(build/namespaces)
+
+ string output_name = "top_k";
+ TopK(root.WithOpName(output_name), outputs[0], how_many_labels);
+ // This runs the GraphDef network definition that we've just constructed, and
+ // returns the results in the output tensors.
+ tensorflow::GraphDef graph;
+ TF_RETURN_IF_ERROR(root.ToGraphDef(&graph));
+
+ std::unique_ptr<tensorflow::Session> session(
+ tensorflow::NewSession(tensorflow::SessionOptions()));
+ TF_RETURN_IF_ERROR(session->Create(graph));
+ // The TopK node returns two outputs, the scores and their original indices,
+ // so we have to append :0 and :1 to specify them both.
+ std::vector<Tensor> out_tensors;
+ TF_RETURN_IF_ERROR(session->Run({}, {output_name + ":0", output_name + ":1"},
+ {}, &out_tensors));
+ *scores = out_tensors[0];
+ *indices = out_tensors[1];
+ return Status::OK();
+}
+
+// Given the output of a model run, and the name of a file containing the labels
+// this prints out the top five highest-scoring values.
+Status PrintTopLabels(const std::vector<Tensor>& outputs,
+ const string& labels_file_name) {
+ std::vector<string> labels;
+ size_t label_count;
+ Status read_labels_status =
+ ReadLabelsFile(labels_file_name, &labels, &label_count);
+ if (!read_labels_status.ok()) {
+ LOG(ERROR) << read_labels_status;
+ return read_labels_status;
+ }
+ const int how_many_labels = std::min(5, static_cast<int>(label_count));
+ Tensor indices;
+ Tensor scores;
+ TF_RETURN_IF_ERROR(GetTopLabels(outputs, how_many_labels, &indices, &scores));
+ tensorflow::TTypes<float>::Flat scores_flat = scores.flat<float>();
+ tensorflow::TTypes<int32>::Flat indices_flat = indices.flat<int32>();
+ for (int pos = 0; pos < how_many_labels; ++pos) {
+ const int label_index = indices_flat(pos);
+ const float score = scores_flat(pos);
+ LOG(INFO) << labels[label_index] << " (" << label_index << "): " << score;
+ }
+ return Status::OK();
+}
+
+// This is a testing function that returns whether the top label index is the
+// one that's expected.
+Status CheckTopLabel(const std::vector<Tensor>& outputs, int expected,
+ bool* is_expected) {
+ *is_expected = false;
+ Tensor indices;
+ Tensor scores;
+ const int how_many_labels = 1;
+ TF_RETURN_IF_ERROR(GetTopLabels(outputs, how_many_labels, &indices, &scores));
+ tensorflow::TTypes<int32>::Flat indices_flat = indices.flat<int32>();
+ if (indices_flat(0) != expected) {
+ LOG(ERROR) << "Expected label #" << expected << " but got #"
+ << indices_flat(0);
+ *is_expected = false;
+ } else {
+ *is_expected = true;
+ }
+ return Status::OK();
+}
+
+int main(int argc, char* argv[]) {
+ // These are the command-line flags the program can understand.
+ // They define where the graph and input data is located, and what kind of
+ // input the model expects.
+ string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/img0.JPG";
+ string export_dir =
+ "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen";
+ string labels =
+ "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/imagenet_slim_labels.txt";
+ int32_t input_width = 224;
+ int32_t input_height = 224;
+ float input_mean = 127;
+ float input_std = 1;
+ bool self_test = false;
+ string root_dir = "";
+
+ std::vector flag_list = {
+ Flag("image", &image, "image to be processed"),
+ Flag("export_dir", &export_dir, "frozen TF-TRT saved model to be executed"),
+ Flag("labels", &labels, "name of file containing labels"),
+ Flag("input_width", &input_width, "resize image to this width in pixels"),
+ Flag("input_height", &input_height,
+ "resize image to this height in pixels"),
+ Flag("input_mean", &input_mean, "scale pixel values to this mean"),
+ Flag("input_std", &input_std, "scale pixel values to this std deviation"),
+ Flag("self_test", &self_test, "run a self test"),
+ Flag("root_dir", &root_dir,
+ "interpret image and graph file names relative to this directory"),
+ };
+ string usage = tensorflow::Flags::Usage(argv[0], flag_list);
+ const bool parse_result = tensorflow::Flags::Parse(&argc, argv, flag_list);
+ if (!parse_result) {
+ LOG(ERROR) << usage;
+ return -1;
+ }
+
+ // We need to call this to set up global state for TensorFlow.
+ tensorflow::port::InitMain(argv[0], &argc, &argv);
+ if (argc > 1) {
+ LOG(ERROR) << "Unknown argument " << argv[1] << "\n" << usage;
+ return -1;
+ }
+
+ tensorflow::SavedModelBundle bundle;
+ std::vector<std::string> input_names;
+ std::vector<std::string> output_names;
+
+ // Load the saved model from the provided path.
+ Status load_graph_status = LoadModel(export_dir, &bundle, &input_names, &output_names);
+ if (!load_graph_status.ok()) {
+ LOG(ERROR) << load_graph_status;
+ return -1;
+ }
+
+ auto sig_map = bundle.GetSignatures();
+ auto model_def = sig_map.at("serving_default");
+
+ printf("Model Signature");
+ for (auto const& p : sig_map) {
+ printf("key: %s", p.first.c_str());
+ }
+
+ printf("Model Input Nodes");
+ for (auto const& p : model_def.inputs()) {
+ printf("key: %s value: %s", p.first.c_str(), p.second.name().c_str());
+ }
+
+ printf("Model Output Nodes");
+ for (auto const& p : model_def.outputs()) {
+ printf("key: %s value: %s", p.first.c_str(), p.second.name().c_str());
+ }
+
+ auto input_name = model_def.inputs().at("input_2").name();
+ auto output_name = model_def.outputs().at("output_0").name();
+
+ // Get the image from disk as a float array of numbers, resized and normalized
+ // to the specifications the main graph expects.
+ std::vector<Tensor> resized_tensors;
+ string image_path = tensorflow::io::JoinPath(root_dir, image);
+ Status read_tensor_status =
+ ReadTensorFromImageFile(image_path, input_height, input_width, input_mean,
+ input_std, &resized_tensors);
+ if (!read_tensor_status.ok()) {
+ LOG(ERROR) << read_tensor_status;
+ return -1;
+ }
+ const Tensor& resized_tensor = resized_tensors[0];
+
+ // Actually run the image through the model.
+ std::vector<Tensor> outputs;
+
+ // fill the input tensors with data
+ tensorflow::Status status;
+ status = bundle.session->Run({ {input_name, resized_tensor}},
+ {output_name}, {}, &outputs);
+ if (!status.ok()) {
+ std::cerr << "Inference failed: " << status;
+ return -1;
+ }
+
+ // Do something interesting with the results we've generated.
+ Status print_status = PrintTopLabels(outputs, labels);
+ if (!print_status.ok()) {
+ LOG(ERROR) << "Running print failed: " << print_status;
+ return -1;
+ }
+
+ return 0;
+}
diff --git a/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh b/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh
new file mode 100755
index 000000000..2d4604aa3
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh
@@ -0,0 +1,13 @@
+# TODO: programmatically determine the Python and TF API versions
+PYVER=3.8 #TODO get this by parsing `python --version`
+TFAPI=2 #TODO get this by parsing tf.__version__
+
+/opt/tensorflow/nvbuild.sh --configonly --python$PYVER --v$TFAPI
+
+BUILD_OPTS="$(cat /opt/tensorflow/nvbuildopts)"
+if [[ "$TFAPI" == "2" ]]; then
+ BUILD_OPTS="--config=v2 $BUILD_OPTS"
+fi
+
+cd tensorflow-source
+bazel build $BUILD_OPTS tensorflow/examples/image_classification/...
diff --git a/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb b/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb
new file mode 100755
index 000000000..4c3eb5f28
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb
@@ -0,0 +1,699 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "dR1W9kv7IPhE"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2021 NVIDIA Corporation. All Rights Reserved.\n",
+ "\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "\n",
+ "# http://www.apache.org/licenses/LICENSE-2.0\n",
+ "\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License.\n",
+ "# =============================================================================="
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "Yb3TdMZAkVNq"
+ },
+ "source": [
+ "
\n",
+ "\n",
+ "# TensorFlow C++ Inference with TF-TRT Models\n",
+ "\n",
+ "\n",
+ "## Introduction\n",
+ "In this notebook, we will download a pretrained Keras ResNet-50 model, optimize it with TF-TRT, convert it to a frozen graph, then load and do inference with the TensorFlow C++ API.\n",
+ "\n",
+ "First, we download the image net labels."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!mkdir data\n",
+ "!curl -L \"https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz\" | tar -C ./data -xz\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "8Fg4x4aomCY4",
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "!nvidia-smi"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "LG4IBNn-2PWY"
+ },
+ "source": [
+ "### Install Dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "!pip install pillow matplotlib"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ },
+ "colab_type": "code",
+ "id": "v0mfnfqg3ned",
+ "outputId": "11c043a0-b8e5-49e2-f907-5f1372c92a68",
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "import tensorflow as tf\n",
+ "print(\"Tensorflow version: \", tf.version.VERSION)\n",
+ "\n",
+ "# check TensorRT version\n",
+ "print(\"TensorRT version: \")\n",
+ "!dpkg -l | grep nvinfer"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "9U8b2394CZRu"
+ },
+ "source": [
+ "An available TensorRT installation looks like:\n",
+ "\n",
+ "```\n",
+ "TensorRT version: \n",
+ "ii libnvinfer8 8.2.2-1+cuda11.4 amd64 TensorRT runtime libraries\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "nWYufTjPCMgW"
+ },
+ "source": [
+ "### Importing required libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "Yyzwxjlm37jx"
+ },
+ "outputs": [],
+ "source": [
+ "from __future__ import absolute_import, division, print_function, unicode_literals\n",
+ "import os\n",
+ "import time\n",
+ "\n",
+ "import numpy as np\n",
+ "import matplotlib.pyplot as plt\n",
+ "\n",
+ "import tensorflow as tf\n",
+ "from tensorflow import keras\n",
+ "from tensorflow.python.compiler.tensorrt import trt_convert as trt\n",
+ "from tensorflow.python.saved_model import tag_constants\n",
+ "from tensorflow.keras.applications.resnet50 import ResNet50\n",
+ "from tensorflow.keras.preprocessing import image\n",
+ "from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "v-R2iN4akVOi"
+ },
+ "source": [
+ "## Data\n",
+ "We download several random images for testing from the Internet."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "tVJ2-8rokVOl",
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "!mkdir ./data\n",
+ "!wget -O ./data/img0.JPG \"https://d17fnq9dkz9hgj.cloudfront.net/breed-uploads/2018/08/siberian-husky-detail.jpg?bust=1535566590&width=630\"\n",
+ "!wget -O ./data/img1.JPG \"https://www.hakaimagazine.com/wp-content/uploads/header-gulf-birds.jpg\"\n",
+ "!wget -O ./data/img2.JPG \"https://www.artis.nl/media/filer_public_thumbnails/filer_public/00/f1/00f1b6db-fbed-4fef-9ab0-84e944ff11f8/chimpansee_amber_r_1920x1080.jpg__1920x1080_q85_subject_location-923%2C365_subsampling-2.jpg\"\n",
+ "!wget -O ./data/img3.JPG \"https://www.familyhandyman.com/wp-content/uploads/2018/09/How-to-Avoid-Snakes-Slithering-Up-Your-Toilet-shutterstock_780480850.jpg\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 269
+ },
+ "colab_type": "code",
+ "id": "F_9n-AR1kVOv",
+ "outputId": "e0ead6dc-e761-404e-a030-f6d3057a57da"
+ },
+ "outputs": [],
+ "source": [
+ "from tensorflow.keras.preprocessing import image\n",
+ "\n",
+ "fig, axes = plt.subplots(nrows=2, ncols=2)\n",
+ "\n",
+ "for i in range(4):\n",
+ " img_path = './data/img%d.JPG'%i\n",
+ " img = image.load_img(img_path, target_size=(224, 224), interpolation='bilinear')\n",
+ " plt.subplot(2,2,i+1)\n",
+ " plt.imshow(img);\n",
+ " plt.axis('off');"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "xeV4r2YTkVO1"
+ },
+ "source": [
+ "## Model\n",
+ "\n",
+ "We next download and test a ResNet-50 pre-trained model from the Keras model zoo."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 73
+ },
+ "colab_type": "code",
+ "id": "WwRBOikEkVO3",
+ "outputId": "2d63bc46-8bac-492f-b519-9ae5f19176bc"
+ },
+ "outputs": [],
+ "source": [
+ "model = ResNet50(weights='imagenet')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 410
+ },
+ "colab_type": "code",
+ "id": "lFKQPoLO_ikd",
+ "outputId": "c0b93de8-c94b-4977-992e-c780e12a3d52"
+ },
+ "outputs": [],
+ "source": [
+ "for i in range(4):\n",
+ " img_path = './data/img%d.JPG'%i\n",
+ " img = image.load_img(img_path, target_size=(224, 224),interpolation='bilinear')\n",
+ " x = image.img_to_array(img)\n",
+ " x = np.expand_dims(x, axis=0)\n",
+ " x = preprocess_input(x)\n",
+ "\n",
+ " preds = model.predict(x)\n",
+ " # decode the results into a list of tuples (class, description, probability)\n",
+ " # (one such list for each sample in the batch)\n",
+ " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n",
+ "\n",
+ " plt.subplot(2,2,i+1)\n",
+ " plt.imshow(img);\n",
+ " plt.axis('off');\n",
+ " plt.title(decode_predictions(preds, top=3)[0][0][1])\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "XrL3FEcdkVPA"
+ },
+ "source": [
+ "TF-TRT takes input as a TensorFlow saved model, therefore, we re-export the Keras model as a TF saved model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 110
+ },
+ "colab_type": "code",
+ "id": "WxlUF3rlkVPH",
+ "outputId": "9f3864e7-f211-4c06-d2d2-585c1a477e34"
+ },
+ "outputs": [],
+ "source": [
+ "# Save the entire model as a SavedModel.\n",
+ "model.save('resnet50_saved_model') "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 453
+ },
+ "colab_type": "code",
+ "id": "RBu2RKs6kVPP",
+ "outputId": "8e063261-7efb-47fd-fa6c-1bb5076d418c"
+ },
+ "outputs": [],
+ "source": [
+ "!saved_model_cli show --all --dir resnet50_saved_model"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "qBQwBvlNm-J8"
+ },
+ "source": [
+ "### Inference with native TF2.0 saved model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "8zLN0GMCkVPe"
+ },
+ "outputs": [],
+ "source": [
+ "model = tf.keras.models.load_model('resnet50_saved_model')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 219
+ },
+ "colab_type": "code",
+ "id": "Fbj-UEOxkVPs",
+ "outputId": "3a2b34f9-8034-48cb-b3fe-477f09966025"
+ },
+ "outputs": [],
+ "source": [
+ "img_path = './data/img0.JPG' # Siberian_husky\n",
+ "img = image.load_img(img_path, target_size=(224, 224))\n",
+ "x = image.img_to_array(img)\n",
+ "x = np.expand_dims(x, axis=0)\n",
+ "x = preprocess_input(x)\n",
+ "\n",
+ "preds = model.predict(x)\n",
+ "# decode the results into a list of tuples (class, description, probability)\n",
+ "# (one such list for each sample in the batch)\n",
+ "print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n",
+ "plt.subplot(2,2,1)\n",
+ "plt.imshow(img);\n",
+ "plt.axis('off');\n",
+ "plt.title(decode_predictions(preds, top=3)[0][0][1])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ },
+ "colab_type": "code",
+ "id": "CGc-dC6DvwRP",
+ "outputId": "e0a22e05-f4fe-47b6-93e8-2b806bf7098a"
+ },
+ "outputs": [],
+ "source": [
+ "batch_size = 1\n",
+ "batched_input = np.zeros((batch_size, 224, 224, 3), dtype=np.float32)\n",
+ "\n",
+ "for i in range(batch_size):\n",
+ " img_path = './data/img%d.JPG' % (i % 4)\n",
+ " img = image.load_img(img_path, target_size=(224, 224))\n",
+ " x = image.img_to_array(img)\n",
+ " x = np.expand_dims(x, axis=0)\n",
+ " x = preprocess_input(x)\n",
+ " batched_input[i, :] = x\n",
+ "batched_input = tf.constant(batched_input)\n",
+ "print('batched_input shape: ', batched_input.shape)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "rFBV6hQR7N3z"
+ },
+ "outputs": [],
+ "source": [
+ "# Benchmarking throughput\n",
+ "N_warmup_run = 50\n",
+ "N_run = 1000\n",
+ "elapsed_time = []\n",
+ "\n",
+ "for i in range(N_warmup_run):\n",
+ " preds = model.predict(batched_input)\n",
+ "\n",
+ "for i in range(N_run):\n",
+ " start_time = time.time()\n",
+ " preds = model.predict(batched_input)\n",
+ " end_time = time.time()\n",
+ " elapsed_time = np.append(elapsed_time, end_time - start_time)\n",
+ " if i % 50 == 0:\n",
+ " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n",
+ "\n",
+ "print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "vC_RN0BAkVPy"
+ },
+ "source": [
+ "### TF-TRT FP32 model\n",
+ "\n",
+ "We next convert the TF native FP32 model to a TF-TRT FP32 model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 126
+ },
+ "colab_type": "code",
+ "id": "0eLImSJ-kVPz",
+ "outputId": "e2c353c7-8e4b-49aa-ab97-f4d82797d4d8"
+ },
+ "outputs": [],
+ "source": [
+ "print('Converting to TF-TRT FP32...')\n",
+ "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=trt.TrtPrecisionMode.FP32,\n",
+ " max_workspace_size_bytes=8000000000)\n",
+ "\n",
+ "converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',\n",
+ " conversion_params=conversion_params)\n",
+ "converter.convert()\n",
+ "converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP32')\n",
+ "print('Done Converting to TF-TRT FP32')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 453
+ },
+ "colab_type": "code",
+ "id": "dlue_3npkVQC",
+ "outputId": "4dd6a366-fe9a-43c8-aad0-dd357bba41bb"
+ },
+ "outputs": [],
+ "source": [
+ "!saved_model_cli show --all --dir resnet50_saved_model_TFTRT_FP32"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "Vd2DoGUp8ivj"
+ },
+ "source": [
+ "Next, we load and test the TF-TRT FP32 model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2\n",
+ "import tensorflow as tf\n",
+ "\n",
+ "model = tf.saved_model.load(\"resnet50_saved_model_TFTRT_FP32\", tags=[tag_constants.SERVING]).signatures['serving_default']"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "rf97K_rxvwRm"
+ },
+ "outputs": [],
+ "source": [
+ "def predict_tftrt(input_saved_model):\n",
+ " \"\"\"Runs prediction on a single image and shows the result.\n",
+ " input_saved_model (string): Name of the input model stored in the current dir\n",
+ " \"\"\"\n",
+ " img_path = './data/img0.JPG' # Siberian_husky\n",
+ " img = image.load_img(img_path, target_size=(224, 224))\n",
+ " x = image.img_to_array(img)\n",
+ " x = np.expand_dims(x, axis=0)\n",
+ " x = preprocess_input(x)\n",
+ " x = tf.constant(x)\n",
+ " \n",
+ " saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n",
+ " signature_keys = list(saved_model_loaded.signatures.keys())\n",
+ " print(signature_keys)\n",
+ "\n",
+ " infer = saved_model_loaded.signatures['serving_default']\n",
+ " print(infer.structured_outputs)\n",
+ "\n",
+ " labeling = infer(x)\n",
+ " preds = labeling['predictions'].numpy()\n",
+ " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n",
+ " plt.subplot(2,2,1)\n",
+ " plt.imshow(img);\n",
+ " plt.axis('off');\n",
+ " plt.title(decode_predictions(preds, top=3)[0][0][1])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 238
+ },
+ "colab_type": "code",
+ "id": "pRK0pRE-snvb",
+ "outputId": "1f7ab6c1-dbfa-4e3e-a21d-df9975c70455"
+ },
+ "outputs": [],
+ "source": [
+ "predict_tftrt('resnet50_saved_model_TFTRT_FP32')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "z9b5j6jMvwRt"
+ },
+ "outputs": [],
+ "source": [
+ "def benchmark_tftrt(input_saved_model):\n",
+ " saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n",
+ " infer = saved_model_loaded.signatures['serving_default']\n",
+ "\n",
+ " N_warmup_run = 50\n",
+ " N_run = 1000\n",
+ " elapsed_time = []\n",
+ "\n",
+ " for i in range(N_warmup_run):\n",
+ " labeling = infer(batched_input)\n",
+ "\n",
+ " for i in range(N_run):\n",
+ " start_time = time.time()\n",
+ " labeling = infer(batched_input)\n",
+ " end_time = time.time()\n",
+ " elapsed_time = np.append(elapsed_time, end_time - start_time)\n",
+ " if i % 50 == 0:\n",
+ " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n",
+ "\n",
+ " print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "ai6bxNcNszHc"
+ },
+ "outputs": [],
+ "source": [
+ "benchmark_tftrt('resnet50_saved_model_TFTRT_FP32')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Prepare model for C++ inference\n",
+ "\n",
+ "We can see that the TF-TRT FP32 model provide great speedup over the native Keras model. Now let's prepare this model for C++ inference. We will freeze this graph and write it as a frozen saved model to disk."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "2mM9D3BTEzQS"
+ },
+ "outputs": [],
+ "source": [
+ "from tensorflow.python.saved_model import signature_constants\n",
+ "from tensorflow.python.saved_model import tag_constants\n",
+ "from tensorflow.python.framework import convert_to_constants\n",
+ "\n",
+ "def get_func_from_saved_model(saved_model_dir):\n",
+ " saved_model_loaded = tf.saved_model.load(\n",
+ " saved_model_dir, tags=[tag_constants.SERVING])\n",
+ " graph_func = saved_model_loaded.signatures[\n",
+ " signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]\n",
+ " return graph_func, saved_model_loaded\n",
+ "\n",
+ "func, loaded_model = get_func_from_saved_model('resnet50_saved_model_TFTRT_FP32')\n",
+ "\n",
+ "# Create frozen func\n",
+ "frozen_func = convert_to_constants.convert_variables_to_constants_v2(func)\n",
+ "module = tf.Module()\n",
+ "module.myfunc = frozen_func\n",
+ "tf.saved_model.save(module,'resnet50_saved_model_TFTRT_FP32_frozen', signatures=frozen_func)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "I13snJ9VkVQh"
+ },
+ "source": [
+ "### What's next\n",
+ "Refer back to the [Readme](README.md) to load the TF-TRT frozen saved model for inference with the TensorFlow C++ API."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "include_colab_link": true,
+ "machine_shape": "hm",
+ "name": "Colab-TF20-TF-TRT-inference-from-Keras-saved-model.ipynb",
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.10"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}