diff --git a/README.md b/README.md old mode 100644 new mode 100755 diff --git a/tftrt/examples-cpp/image_classification/README.md b/tftrt/examples-cpp/image_classification/README.md index 6b3e8ba4f..eec06e5d5 100755 --- a/tftrt/examples-cpp/image_classification/README.md +++ b/tftrt/examples-cpp/image_classification/README.md @@ -3,104 +3,11 @@ # TF-TRT C++ Image Recognition Demo -This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as a frozen graph, and then finally load and serve the model with the TF C++ API. The process can be demonstrated with the below workflow diagram: +This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as either a frozen graph or a saved model, and then finally load and serve the model with the TF C++ API. The process can be demonstrated with the below workflow diagram: -![TF-TRT C++ Inference workflow](TF-TRT_CPP_inference.png "TF-TRT C++ Inference") +![TF-TRT C++ Inference workflow](TF-TRT_CPP_inference_overview.png "TF-TRT C++ Inference") This example is built based upon the original Google's TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), on top of which we added the TF-TRT conversion part and adapted the C++ code for loading and inferencing with the TF-TRT model. -## Docker environment -Docker images provide a convinient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 docker container that comes with a TensorFlow 2.x build. Tools required for building this example, such as Bazel, NVIDIA CUDA, CUDNN, NCCL libraries are all readily setup. - -To replecate the below steps, start by pulling the NGC TF container: - -``` -docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3 -``` -Then start the container with nvidia-docker: - -``` -nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3 -``` - -You will land at `/workspace` within the docker container. Clone the TF-TRT example repository with: - -``` -git clone https://github.com/tensorflow/tensorrt -cd tensorrt -``` - -Then copy the content of this C++ example directory to the TensorFlow example source directory: - -``` -cp -r ./tftrt/examples-cpp/image_classification/ /opt/tensorflow/tensorflow-source/tensorflow/examples/ -cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification -``` - - -## Convert to TF-TRT Model - -Start Jupyter lab with: - -``` -jupyter lab -ip 0.0.0.0 -``` - -A Jupyter notebook for downloading the Keras ResNet-50 model and TF-TRT conversion is provided in `tf-trt-conversion.ipynb` for your experimentation. By default, this notebook will produce a TF-TRT FP32 model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image-classification/frozen_models_trt_fp32/frozen_models_trt_fp32.pb`. - -As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics. - - - - -## Build the C++ example -The NVIDIA NGC container should have everything you need to run this example installed already. 
- -To build it, first, you need to copy the build scripts `tftrt_build.sh` to `/opt/tensorflow`: - -``` -cp tftrt-build.sh /opt/tensorflow -``` - -Then from `/opt/tensorflow`, run the build command: - -```bash -cd /opt/tensorflow -bash ./tftrt-build.sh -``` - -That should build a binary executable `tftrt_label_image` that you can then run like this: - -```bash -tensorflow-source/bazel-bin/tensorflow/examples/image_classification/tftrt_label_image \ ---graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \ ---image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/img0.JPG -``` - -This uses the default image `img0.JPG` which was download as part of the conversion notebook, and should -output something similar to this: - -``` -2022-02-23 13:53:56.076348: I tensorflow/examples/image-classification/main.cc:276] malamute (250): 0.575496 -2022-02-23 13:53:56.076384: I tensorflow/examples/image-classification/main.cc:276] Saint Bernard (248): 0.399285 -2022-02-23 13:53:56.076412: I tensorflow/examples/image-classification/main.cc:276] Eskimo dog (249): 0.0228338 -2022-02-23 13:53:56.076423: I tensorflow/examples/image-classification/main.cc:276] Ibizan hound (174): 0.00127912 -2022-02-23 13:53:56.076449: I tensorflow/examples/image-classification/main.cc:276] Mexican hairless (269): 0.000520922 -``` - -The program will also benchmark and output the throughput. Observe the improved throughput offered by moving from Python to C++ serving. - -Next, try it out on your own images by supplying the --image= argument, e.g. - -```bash -tensorflow-source/bazel-bin/tensorflow/examples/label_image/tftrt_label_image --image=my_image.png -``` - -## What's next - -Try to build TF-TRT FP16 and INT8 models and test on your own data, and serve them with C++. - -```bash - -``` +See the respective sub-folder for details on either approach. \ No newline at end of file diff --git a/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png b/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png new file mode 100644 index 000000000..de35058bc Binary files /dev/null and b/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference_overview.png differ diff --git a/tftrt/examples-cpp/image_classification/BUILD b/tftrt/examples-cpp/image_classification/frozen-graph/BUILD similarity index 100% rename from tftrt/examples-cpp/image_classification/BUILD rename to tftrt/examples-cpp/image_classification/frozen-graph/BUILD diff --git a/tftrt/examples-cpp/image_classification/frozen-graph/README.md b/tftrt/examples-cpp/image_classification/frozen-graph/README.md new file mode 100755 index 000000000..9fd1ca305 --- /dev/null +++ b/tftrt/examples-cpp/image_classification/frozen-graph/README.md @@ -0,0 +1,108 @@ + + + +# TF-TRT C++ Image Recognition Demo + +This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save the model as a frozen graph, and then finally load and serve the model with the TF C++ API. 
The process can be demonstrated with the below workflow diagram:
+
+
+![TF-TRT C++ Inference workflow](TF-TRT_CPP_inference.png "TF-TRT C++ Inference")
+
+This example builds upon Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), on top of which we added the TF-TRT conversion step and adapted the C++ code to load and run inference with the TF-TRT model.
+
+## Docker environment
+Docker images provide a convenient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 Docker container, which comes with a TensorFlow 2.x build. Tools required for building this example, such as Bazel and the NVIDIA CUDA, cuDNN, and NCCL libraries, are already set up there.
+
+To replicate the steps below, start by pulling the NGC TF container:
+
+```
+docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+Then start the container with nvidia-docker:
+
+```
+nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+
+You will land at `/workspace` within the Docker container. Clone the TF-TRT example repository with:
+
+```
+git clone https://github.com/tensorflow/tensorrt
+cd tensorrt
+```
+
+Then copy the content of this C++ example directory to the TensorFlow example source directory:
+
+```
+cp -r ./tftrt/examples-cpp/image_classification /opt/tensorflow/tensorflow-source/tensorflow/examples/
+cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph
+```
+
+
+## Convert to TF-TRT Model
+
+Start JupyterLab with:
+
+```
+jupyter lab --ip=0.0.0.0
+```
+
+A Jupyter notebook for downloading the Keras ResNet-50 model and running the TF-TRT conversion is provided in `tftrt-conversion.ipynb` for your experimentation. By default, this notebook produces a TF-TRT FP32 model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb`.
+
+As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics.
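+
+For reference, the conversion and freezing steps the notebook performs boil down to roughly the following sketch (paths and names mirror the notebook's defaults; treat it as an outline, not a drop-in script):
+
+```python
+import tensorflow as tf
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
+
+# Convert the native Keras SavedModel to a TF-TRT FP32 model.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=trt.TrtPrecisionMode.FP32)
+converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',
+                                    conversion_params=params)
+converter.convert()
+converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP32')
+
+# Freeze the converted model into a single GraphDef that the C++ example
+# loads with ReadBinaryProto().
+func = tf.saved_model.load('resnet50_saved_model_TFTRT_FP32').signatures['serving_default']
+frozen_func = convert_variables_to_constants_v2(func)
+tf.io.write_graph(frozen_func.graph.as_graph_def(), 'frozen_models_trt_fp32',
+                  'frozen_models_trt_fp32.pb', as_text=False)
+```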
+
+
+## Build the C++ example
+The NVIDIA NGC container should have everything you need to run this example installed already.
+
+To build it, first copy the build script `tftrt-build.sh` to `/opt/tensorflow`:
+
+```
+cp tftrt-build.sh /opt/tensorflow
+```
+
+Then, from `/opt/tensorflow`, run the build command:
+
+```bash
+cd /opt/tensorflow
+bash ./tftrt-build.sh
+```
+
+That should build a binary executable `tftrt_label_image` that you can then run like this:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/frozen-graph/tftrt_label_image \
+--graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \
+--image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/img0.JPG
+```
+
+This uses the default image `img0.JPG`, which was downloaded as part of the conversion notebook, and should
+output something similar to this:
+
+```
+2022-04-29 04:20:24.377345: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] malamute (250): 0.575496
+2022-04-29 04:20:24.377370: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Saint Bernard (248): 0.399285
+2022-04-29 04:20:24.377380: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Eskimo dog (249): 0.0228338
+2022-04-29 04:20:24.377387: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Ibizan hound (174): 0.00127912
+2022-04-29 04:20:24.377394: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Mexican hairless (269): 0.000520922
+```
+
+The program will also benchmark and output the throughput, so you can observe the improved throughput offered by moving from Python to C++ serving.
+
+Next, try it out on your own images by supplying the `--image=` argument, e.g.
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/frozen-graph/tftrt_label_image \
+--graph=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/frozen_models_trt_fp32/frozen_models_trt_fp32.pb \
+--image=my_image.png
+```
+
+## What's next
+
+Try building TF-TRT FP16 and INT8 models, test them on your own data, and serve them with C++.
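+
+For instance, switching to FP16 is mostly a one-line change to the converter setup shown in the conversion sketch above (again a sketch under the same assumptions; INT8 additionally requires a calibration input function):
+
+```python
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+
+# Same converter setup as for FP32, with only the precision mode changed.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=trt.TrtPrecisionMode.FP16)
+converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',
+                                    conversion_params=params)
+converter.convert()
+converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP16')
+```
+
+The freezing and C++ serving steps stay the same; only the precision mode (and, for INT8, calibration) changes.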
diff --git a/tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference.png b/tftrt/examples-cpp/image_classification/frozen-graph/TF-TRT_CPP_inference.png
old mode 100644
new mode 100755
similarity index 100%
rename from tftrt/examples-cpp/image_classification/TF-TRT_CPP_inference.png
rename to tftrt/examples-cpp/image_classification/frozen-graph/TF-TRT_CPP_inference.png
diff --git a/tftrt/examples-cpp/image_classification/main.cc b/tftrt/examples-cpp/image_classification/frozen-graph/main.cc
similarity index 98%
rename from tftrt/examples-cpp/image_classification/main.cc
rename to tftrt/examples-cpp/image_classification/frozen-graph/main.cc
index 5dc34da18..5248d143a 100755
--- a/tftrt/examples-cpp/image_classification/main.cc
+++ b/tftrt/examples-cpp/image_classification/frozen-graph/main.cc
@@ -302,11 +302,11 @@ int main(int argc, char* argv[]) {
   // These are the command-line flags the program can understand.
   // They define where the graph and input data is located, and what kind of
   // input the model expects.
-  string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/img0.JPG";
+  string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/img0.JPG";
   string graph =
-      "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/resnet-50.pb";
+      "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/resnet-50.pb";
   string labels =
-      "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/data/imagenet_slim_labels.txt";
+      "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/frozen-graph/data/imagenet_slim_labels.txt";
   int32_t input_width = 224;
   int32_t input_height = 224;
   float input_mean = 127;
diff --git a/tftrt/examples-cpp/image_classification/tftrt-build.sh b/tftrt/examples-cpp/image_classification/frozen-graph/tftrt-build.sh
old mode 100644
new mode 100755
similarity index 100%
rename from tftrt/examples-cpp/image_classification/tftrt-build.sh
rename to tftrt/examples-cpp/image_classification/frozen-graph/tftrt-build.sh
diff --git a/tftrt/examples-cpp/image_classification/tftrt-conversion.ipynb b/tftrt/examples-cpp/image_classification/frozen-graph/tftrt-conversion.ipynb
similarity index 100%
rename from tftrt/examples-cpp/image_classification/tftrt-conversion.ipynb
rename to tftrt/examples-cpp/image_classification/frozen-graph/tftrt-conversion.ipynb
diff --git a/tftrt/examples-cpp/image_classification/saved-model/BUILD b/tftrt/examples-cpp/image_classification/saved-model/BUILD
new file mode 100755
index 000000000..2bc49d38d
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/BUILD
@@ -0,0 +1,50 @@
+# Description:
+#   TensorFlow C++ inference example with TF-TRT model.
+
+load("//tensorflow:tensorflow.bzl", "tf_cc_binary")
+
+package(
+    default_visibility = ["//tensorflow:internal"],
+    licenses = ["notice"],
+)
+
+tf_cc_binary(
+    name = "tftrt_label_image",
+    srcs = [
+        "main.cc",
+    ],
+    linkopts = select({
+        "//tensorflow:android": [
+            "-pie",
+            "-landroid",
+            "-ljnigraphics",
+            "-llog",
+            "-lm",
+            "-z defs",
+            "-s",
+            "-Wl,--exclude-libs,ALL",
+        ],
+        "//conditions:default": ["-lm"],
+    }),
+    deps = select({
+        "//tensorflow:android": [
+            # cc:cc_ops is used to include image ops (for label_image)
+            # Jpg, gif, and png related code won't be included
+            "//tensorflow/cc:cc_ops",
+            "//tensorflow/core:portable_tensorflow_lib",
+            # cc:android_tensorflow_image_op is for including jpeg/gif/png
+            # decoder to enable real-image evaluation on Android
+            "//tensorflow/core/kernels/image:android_tensorflow_image_op",
+        ],
+        "//conditions:default": [
+            "//tensorflow/cc:cc_ops",
+            "//tensorflow/cc/saved_model:loader",
+            "//tensorflow/core:core_cpu",
+            "//tensorflow/core:framework",
+            "//tensorflow/core:framework_internal",
+            "//tensorflow/core:lib",
+            "//tensorflow/core:protos_all_cc",
+            "//tensorflow/core:tensorflow",
+        ],
+    }),
+)
\ No newline at end of file
diff --git a/tftrt/examples-cpp/image_classification/saved-model/README.md b/tftrt/examples-cpp/image_classification/saved-model/README.md
new file mode 100755
index 000000000..6c2322331
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/README.md
@@ -0,0 +1,109 @@
+
+
+
+
+# TF-TRT C++ Image Recognition Demo
+
+This example shows how you can load a native TF Keras ResNet-50 model, convert it to a TF-TRT optimized model (via the TF-TRT Python API), save it as a SavedModel, and then load and serve the model with the TF C++ API. The process can be demonstrated with the below workflow diagram:
+
+
+![TF-TRT C++ Inference workflow](TF-TRT_CPP_inference_saved_model.png "TF-TRT C++ Inference")
+
+This example builds upon Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), on top of which we added the TF-TRT conversion step and adapted the C++ code to load and run inference with the TF-TRT model.
+
+## Docker environment
+Docker images provide a convenient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 Docker container, which comes with a TensorFlow 2.x build. Tools required for building this example, such as Bazel and the NVIDIA CUDA, cuDNN, and NCCL libraries, are already set up there.
+
+To replicate the steps below, start by pulling the NGC TF container:
+
+```
+docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+Then start the container with nvidia-docker:
+
+```
+nvidia-docker run --rm -it -p 8888:8888 --name TFTRT_CPP nvcr.io/nvidia/tensorflow:22.01-tf2-py3
+```
+
+You will land at `/workspace` within the Docker container. Clone the TF-TRT example repository with:
+
+```
+git clone https://github.com/tensorflow/tensorrt
+cd tensorrt
+```
+
+Then copy the content of this C++ example directory to the TensorFlow example source directory:
+
+```
+cp -r ./tftrt/examples-cpp/image_classification/ /opt/tensorflow/tensorflow-source/tensorflow/examples/
+cd /opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model
+```
+
+
+## Convert to TF-TRT Model
+
+Start JupyterLab with:
+
+```
+jupyter lab --ip=0.0.0.0
+```
+
+A Jupyter notebook for downloading the Keras ResNet-50 model and running the TF-TRT conversion is provided in `tftrt-conversion.ipynb` for your experimentation. By default, this notebook produces a TF-TRT FP32 saved model at `/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen`.
+
+As part of the conversion, the notebook will also carry out benchmarking and print out the throughput statistics.
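+
+In script form, the conversion and freezing steps the notebook performs look roughly like this (an outline using the notebook's default names, not a drop-in script):
+
+```python
+import tensorflow as tf
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
+
+# Convert the native Keras SavedModel to a TF-TRT FP32 model.
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=trt.TrtPrecisionMode.FP32)
+converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',
+                                    conversion_params=params)
+converter.convert()
+converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP32')
+
+# Re-export with variables folded into constants, so the C++ example can
+# load a self-contained ("frozen") SavedModel.
+func = tf.saved_model.load('resnet50_saved_model_TFTRT_FP32').signatures['serving_default']
+frozen_func = convert_variables_to_constants_v2(func)
+module = tf.Module()
+module.myfunc = frozen_func
+tf.saved_model.save(module, 'resnet50_saved_model_TFTRT_FP32_frozen', signatures=frozen_func)
+```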
+
+
+## Build the C++ example
+The NVIDIA NGC container should have everything you need to run this example installed already.
+
+To build it, first copy the build script `tftrt-build.sh` to `/opt/tensorflow`:
+
+```
+cp tftrt-build.sh /opt/tensorflow
+```
+
+Then, from `/opt/tensorflow`, run the build command:
+
+```bash
+cd /opt/tensorflow
+bash ./tftrt-build.sh
+```
+
+That should build a binary executable `tftrt_label_image` that you can then run like this:
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/saved-model/tftrt_label_image \
+--export_dir=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen \
+--image=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/img0.JPG
+```
+
+This uses the default image `img0.JPG`, which was downloaded as part of the conversion notebook, and should
+output something similar to this:
+
+```
+2022-04-29 04:19:28.397102: I tensorflow/examples/image_classification/saved-model/main.cc:331] malamute (250): 0.575497
+2022-04-29 04:19:28.397126: I tensorflow/examples/image_classification/saved-model/main.cc:331] Saint Bernard (248): 0.399284
+2022-04-29 04:19:28.397134: I tensorflow/examples/image_classification/saved-model/main.cc:331] Eskimo dog (249): 0.0228338
+2022-04-29 04:19:28.397141: I tensorflow/examples/image_classification/saved-model/main.cc:331] Ibizan hound (174): 0.00127912
+2022-04-29 04:19:28.397147: I tensorflow/examples/image_classification/saved-model/main.cc:331] Mexican hairless (269): 0.000520922
+```
+
+The program will also benchmark and output the throughput, so you can observe the improved throughput offered by moving from Python to C++ serving.
+
+Next, try it out on your own images by supplying the `--image=` argument, e.g.
+
+```bash
+tensorflow-source/bazel-bin/tensorflow/examples/image_classification/saved-model/tftrt_label_image \
+--export_dir=/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen \
+--image=my_image.png
+```
+
+## What's next
+
+Try building TF-TRT FP16 and INT8 models, test them on your own data, and serve them with C++.
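+
+As with the conversion sketch above, an FP16 model is mostly a change of precision mode (again a sketch, not a drop-in script; INT8 additionally needs representative calibration data):
+
+```python
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
+
+params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=trt.TrtPrecisionMode.FP16)
+converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',
+                                    conversion_params=params)
+converter.convert()  # for INT8, pass calibration_input_fn= with representative inputs
+converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP16')
+```
+
+The freezing and C++ serving steps are unchanged.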
diff --git a/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png b/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png
new file mode 100755
index 000000000..153881ad3
Binary files /dev/null and b/tftrt/examples-cpp/image_classification/saved-model/TF-TRT_CPP_inference_saved_model.png differ
diff --git a/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh b/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh
new file mode 100755
index 000000000..38477247c
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/image_classification_build.sh
@@ -0,0 +1,37 @@
+#!/bin/bash
+# Build the C++ TFTRT Example
+
+# Copyright 2019 NVIDIA Corporation. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+
+set -e
+if [[ ! -f /opt/tensorflow/nvbuild.sh || ! -f /opt/tensorflow/nvbuildopts ]]; then
+  echo "This TF-TRT example is intended to be executed in the NGC TensorFlow container environment. Get one with, e.g., 'docker pull nvcr.io/nvidia/tensorflow:19.10-py3'."
+  exit 1
+fi
+
+# TODO: programmatically determine the Python and TF API versions
+PYVER=3.6 # TODO: get this by parsing `python --version`
+TFAPI=1   # TODO: get this by parsing tf.__version__
+
+/opt/tensorflow/nvbuild.sh --configonly --python$PYVER --v$TFAPI
+
+BUILD_OPTS="$(cat /opt/tensorflow/nvbuildopts)"
+if [[ "$TFAPI" == "2" ]]; then
+  BUILD_OPTS="--config=v2 $BUILD_OPTS"
+fi
+
+cd /opt/tensorflow/tensorflow-source
+bazel build $BUILD_OPTS tensorflow/examples/image_classification/...
diff --git a/tftrt/examples-cpp/image_classification/saved-model/main.cc b/tftrt/examples-cpp/image_classification/saved-model/main.cc
new file mode 100755
index 000000000..a21e2f330
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/main.cc
@@ -0,0 +1,464 @@
+/* Copyright 2021 NVIDIA Corporation. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+/* Copyright 2015 The TensorFlow Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+// A minimal but useful C++ example showing how to load a TF-TRT ResNet-50 model,
+// prepare input images for it, run them through the graph, and interpret the results.
+//
+// It's designed to have as few dependencies and be as clear as possible, so
+// it's more verbose than it could be in production code. In particular, using
+// auto for the types of a lot of the returned values from TensorFlow calls can
+// remove a lot of boilerplate, but I find the explicit types useful in sample
+// code to make it simple to look up the classes involved.
+//
+// To use it, compile and then run it; you should see the top five labels for
+// the default test image printed. You can customize it to use your own models
+// or images by changing the default file paths at the top of main(), or by
+// passing the corresponding command-line flags.
+//
+// Note that, for GIF inputs, to reuse existing code, only single-frame ones
+// are supported.
+
+#include <cstdio>
+#include <fstream>
+#include <iostream>
+#include <memory>
+#include <string>
+#include <utility>
+#include <vector>
+
+#include "tensorflow/cc/ops/const_op.h"
+#include "tensorflow/cc/ops/array_ops.h"
+#include "tensorflow/cc/ops/image_ops.h"
+#include "tensorflow/cc/ops/standard_ops.h"
+#include "tensorflow/core/framework/graph.pb.h"
+#include "tensorflow/core/framework/tensor.h"
+#include "tensorflow/core/graph/default_device.h"
+#include "tensorflow/core/graph/graph_def_builder.h"
+#include "tensorflow/core/lib/core/errors.h"
+#include "tensorflow/core/lib/core/stringpiece.h"
+#include "tensorflow/core/lib/core/threadpool.h"
+#include "tensorflow/core/lib/io/path.h"
+#include "tensorflow/core/lib/strings/str_util.h"
+#include "tensorflow/core/lib/strings/stringprintf.h"
+#include "tensorflow/core/platform/env.h"
+#include "tensorflow/core/platform/init_main.h"
+#include "tensorflow/core/platform/logging.h"
+#include "tensorflow/core/platform/types.h"
+#include "tensorflow/core/public/session.h"
+#include "tensorflow/core/util/command_line_flags.h"
+#include "tensorflow/cc/saved_model/loader.h"
+#include "tensorflow/core/framework/tensor.pb.h"
+#include "tensorflow/core/lib/core/status.h"
+
+#include "absl/strings/string_view.h"
+
+// These are all common classes it's handy to reference with no namespace.
+using tensorflow::Flag;
+using tensorflow::int32;
+using tensorflow::Status;
+using tensorflow::string;
+using tensorflow::Tensor;
+using tensorflow::tstring;
+
+// Returns the names of the nodes listed in a signature definition.
+std::vector<std::string>
+GetNodeNames(const google::protobuf::Map<std::string, tensorflow::TensorInfo>
+                 &signature) {
+  std::vector<std::string> names;
+  for (auto const &item : signature) {
+    absl::string_view name = item.second.name();
+    // Remove a tensor suffix like ":0".
+    size_t last_colon = name.find_last_of(':');
+    if (last_colon != absl::string_view::npos) {
+      name.remove_suffix(name.size() - last_colon);
+    }
+    names.push_back(std::string(name));
+  }
+  return names;
+}
+
+// Loads a SavedModel from export_dir into the SavedModelBundle.
+tensorflow::Status LoadModel(const std::string &export_dir,
+                             tensorflow::SavedModelBundle *bundle,
+                             std::vector<std::string> *input_names,
+                             std::vector<std::string> *output_names) {
+  tensorflow::RunOptions run_options;
+  TF_RETURN_IF_ERROR(tensorflow::LoadSavedModel(tensorflow::SessionOptions(),
+                                                run_options, export_dir,
+                                                {"serve"}, bundle));
+
+  // Print the signature defs.
+  auto signature_map = bundle->GetSignatures();
+  for (const auto &name_and_signature_def : signature_map) {
+    const auto &name = name_and_signature_def.first;
+    const auto &signature_def = name_and_signature_def.second;
+    std::cerr << "Name: " << name << std::endl;
+    std::cerr << "SignatureDef: " << signature_def.DebugString() << std::endl;
+  }
+
+  // Extract input and output tensor names from the signature def.
+  const tensorflow::SignatureDef &signature = signature_map["serving_default"];
+  *input_names = GetNodeNames(signature.inputs());
+  *output_names = GetNodeNames(signature.outputs());
+
+  return tensorflow::Status::OK();
+}
+
+// Takes a file name, and loads a list of labels from it, one per line, and
+// returns a vector of the strings. It pads with empty strings so the length
+// of the result is a multiple of 16, because our model expects that.
+Status ReadLabelsFile(const string& file_name, std::vector<string>* result,
+                      size_t* found_label_count) {
+  std::ifstream file(file_name);
+  if (!file) {
+    return tensorflow::errors::NotFound("Labels file ", file_name,
+                                        " not found.");
+  }
+  result->clear();
+  string line;
+  while (std::getline(file, line)) {
+    result->push_back(line);
+  }
+  *found_label_count = result->size();
+  const int padding = 16;
+  while (result->size() % padding) {
+    result->emplace_back();
+  }
+  return Status::OK();
+}
+
+// Reads the entire contents of a file into a string scalar tensor.
+static Status ReadEntireFile(tensorflow::Env* env, const string& filename,
+                             Tensor* output) {
+  tensorflow::uint64 file_size = 0;
+  TF_RETURN_IF_ERROR(env->GetFileSize(filename, &file_size));
+
+  string contents;
+  contents.resize(file_size);
+
+  std::unique_ptr<tensorflow::RandomAccessFile> file;
+  TF_RETURN_IF_ERROR(env->NewRandomAccessFile(filename, &file));
+
+  tensorflow::StringPiece data;
+  TF_RETURN_IF_ERROR(file->Read(0, file_size, &data, &(contents)[0]));
+  if (data.size() != file_size) {
+    return tensorflow::errors::DataLoss("Truncated read of '", filename,
+                                        "' expected ", file_size, " got ",
+                                        data.size());
+  }
+  output->scalar<tstring>()() = tstring(data);
+  return Status::OK();
+}
+
+// Given an image file name, read in the data, try to decode it as an image,
+// resize it to the requested size, and then scale the values as desired.
+Status ReadTensorFromImageFile(const string& file_name, const int input_height,
+                               const int input_width, const float input_mean,
+                               const float input_std,
+                               std::vector<Tensor>* out_tensors) {
+  auto root = tensorflow::Scope::NewRootScope();
+  using namespace ::tensorflow::ops;  // NOLINT(build/namespaces)
+
+  string input_name = "file_reader";
+  string output_name = "normalized";
+
+  // Read file_name into a tensor named input.
+  Tensor input(tensorflow::DT_STRING, tensorflow::TensorShape());
+  TF_RETURN_IF_ERROR(
+      ReadEntireFile(tensorflow::Env::Default(), file_name, &input));
+
+  // Use a placeholder to read input data.
+  auto file_reader =
+      Placeholder(root.WithOpName("input"), tensorflow::DataType::DT_STRING);
+
+  std::vector<std::pair<string, Tensor>> inputs = {
+      {"input", input},
+  };
+
+  // Now try to figure out what kind of file it is and decode it.
+  const int wanted_channels = 3;
+  tensorflow::Output image_reader;
+  if (tensorflow::str_util::EndsWith(file_name, ".png")) {
+    image_reader = DecodePng(root.WithOpName("png_reader"), file_reader,
+                             DecodePng::Channels(wanted_channels));
+  } else if (tensorflow::str_util::EndsWith(file_name, ".gif")) {
+    // The gif decoder returns a 4-D tensor, so remove the first dim.
+    image_reader =
+        Squeeze(root.WithOpName("squeeze_first_dim"),
+                DecodeGif(root.WithOpName("gif_reader"), file_reader));
+  } else if (tensorflow::str_util::EndsWith(file_name, ".bmp")) {
+    image_reader = DecodeBmp(root.WithOpName("bmp_reader"), file_reader);
+  } else {
+    // Assume if it's neither a PNG nor a GIF then it must be a JPEG.
+    image_reader = DecodeJpeg(root.WithOpName("jpeg_reader"), file_reader,
+                              DecodeJpeg::Channels(wanted_channels));
+  }
+  // Now cast the image data to float so we can do normal math on it.
+  auto float_caster =
+      Cast(root.WithOpName("float_caster"), image_reader, tensorflow::DT_FLOAT);
+  // The convention for image ops in TensorFlow is that all images are expected
+  // to be in batches, so that they're four-dimensional arrays with indices of
+  // [batch, height, width, channel]. Because we only have a single image, we
+  // have to add a batch dimension of 1 to the start with ExpandDims().
+  auto dims_expander = ExpandDims(root, float_caster, 0);
+  // Bilinearly resize the image to fit the required dimensions.
+  auto resized = ResizeBilinear(
+      root, dims_expander,
+      Const(root.WithOpName("size"), {input_height, input_width}));
+
+  // Preprocess the image in "caffe" style: https://github.com/keras-team/keras/blob/d8fcb9d4d4dad45080ecfdd575483653028f8eda/keras/applications/imagenet_utils.py#L206
+  // Convert the channel order from RGB to BGR.
+  auto unstack_image_node = tensorflow::ops::Unstack(
+      root.WithOpName("unstack_image"), resized, 3,
+      tensorflow::ops::Unstack::Attrs().Axis(3));
+  auto stacked_image_node = tensorflow::ops::Stack(
+      root.WithOpName("stacked_image"),
+      {unstack_image_node[2], unstack_image_node[1], unstack_image_node[0]},
+      tensorflow::ops::Stack::Attrs().Axis(3));
+
+  // Subtract the per-channel (BGR) ImageNet mean.
+  std::vector<float> vec = {103.939, 116.779, 123.68};
+  Tensor img_mean(tensorflow::DT_FLOAT, {3});
+  std::copy_n(vec.begin(), vec.size(), img_mean.flat<float>().data());
+
+  Div(root.WithOpName(output_name), Sub(root, stacked_image_node, img_mean),
+      {input_std});
+
+  // This runs the GraphDef network definition that we've just constructed, and
+  // returns the results in the output tensor.
+  tensorflow::GraphDef graph;
+  TF_RETURN_IF_ERROR(root.ToGraphDef(&graph));
+
+  std::unique_ptr<tensorflow::Session> session(
+      tensorflow::NewSession(tensorflow::SessionOptions()));
+  TF_RETURN_IF_ERROR(session->Create(graph));
+  TF_RETURN_IF_ERROR(session->Run({inputs}, {output_name}, {}, out_tensors));
+  return Status::OK();
+}
+
+// Reads a model graph definition from disk, and creates a session object you
+// can use to run it.
+Status LoadGraph(const string& graph_file_name,
+                 std::unique_ptr<tensorflow::Session>* session) {
+  tensorflow::GraphDef graph_def;
+  Status load_graph_status =
+      ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
+  if (!load_graph_status.ok()) {
+    return tensorflow::errors::NotFound("Failed to load compute graph at '",
+                                        graph_file_name, "'");
+  }
+  session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
+  Status session_create_status = (*session)->Create(graph_def);
+  if (!session_create_status.ok()) {
+    return session_create_status;
+  }
+  return Status::OK();
+}
+
+// Analyzes the model's output to retrieve the highest scores and
+// their positions in the tensor, which correspond to categories.
+Status GetTopLabels(const std::vector<Tensor>& outputs, int how_many_labels,
+                    Tensor* indices, Tensor* scores) {
+  auto root = tensorflow::Scope::NewRootScope();
+  using namespace ::tensorflow::ops;  // NOLINT(build/namespaces)
+
+  string output_name = "top_k";
+  TopK(root.WithOpName(output_name), outputs[0], how_many_labels);
+  // This runs the GraphDef network definition that we've just constructed, and
+  // returns the results in the output tensors.
+  tensorflow::GraphDef graph;
+  TF_RETURN_IF_ERROR(root.ToGraphDef(&graph));
+
+  std::unique_ptr<tensorflow::Session> session(
+      tensorflow::NewSession(tensorflow::SessionOptions()));
+  TF_RETURN_IF_ERROR(session->Create(graph));
+  // The TopK node returns two outputs, the scores and their original indices,
+  // so we have to append :0 and :1 to specify them both.
+  std::vector<Tensor> out_tensors;
+  TF_RETURN_IF_ERROR(session->Run({}, {output_name + ":0", output_name + ":1"},
+                                  {}, &out_tensors));
+  *scores = out_tensors[0];
+  *indices = out_tensors[1];
+  return Status::OK();
+}
+
+// Given the output of a model run, and the name of a file containing the
+// labels, this prints out the top five highest-scoring values.
+Status PrintTopLabels(const std::vector<Tensor>& outputs,
+                      const string& labels_file_name) {
+  std::vector<string> labels;
+  size_t label_count;
+  Status read_labels_status =
+      ReadLabelsFile(labels_file_name, &labels, &label_count);
+  if (!read_labels_status.ok()) {
+    LOG(ERROR) << read_labels_status;
+    return read_labels_status;
+  }
+  const int how_many_labels = std::min(5, static_cast<int>(label_count));
+  Tensor indices;
+  Tensor scores;
+  TF_RETURN_IF_ERROR(GetTopLabels(outputs, how_many_labels, &indices, &scores));
+  tensorflow::TTypes<float>::Flat scores_flat = scores.flat<float>();
+  tensorflow::TTypes<int32>::Flat indices_flat = indices.flat<int32>();
+  for (int pos = 0; pos < how_many_labels; ++pos) {
+    const int label_index = indices_flat(pos);
+    const float score = scores_flat(pos);
+    LOG(INFO) << labels[label_index] << " (" << label_index << "): " << score;
+  }
+  return Status::OK();
+}
+
+// This is a testing function that returns whether the top label index is the
+// one that's expected.
+Status CheckTopLabel(const std::vector<Tensor>& outputs, int expected,
+                     bool* is_expected) {
+  *is_expected = false;
+  Tensor indices;
+  Tensor scores;
+  const int how_many_labels = 1;
+  TF_RETURN_IF_ERROR(GetTopLabels(outputs, how_many_labels, &indices, &scores));
+  tensorflow::TTypes<int32>::Flat indices_flat = indices.flat<int32>();
+  if (indices_flat(0) != expected) {
+    LOG(ERROR) << "Expected label #" << expected << " but got #"
+               << indices_flat(0);
+    *is_expected = false;
+  } else {
+    *is_expected = true;
+  }
+  return Status::OK();
+}
+
+int main(int argc, char* argv[]) {
+  // These are the command-line flags the program can understand.
+  // They define where the graph and input data is located, and what kind of
+  // input the model expects.
+  string image = "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/img0.JPG";
+  string export_dir =
+      "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/resnet50_saved_model_TFTRT_FP32_frozen";
+  string labels =
+      "/opt/tensorflow/tensorflow-source/tensorflow/examples/image_classification/saved-model/data/imagenet_slim_labels.txt";
+  int32_t input_width = 224;
+  int32_t input_height = 224;
+  float input_mean = 127;
+  float input_std = 1;
+  bool self_test = false;
+  string root_dir = "";
+
+  std::vector<Flag> flag_list = {
+      Flag("image", &image, "image to be processed"),
+      Flag("export_dir", &export_dir, "frozen TF-TRT saved model to be executed"),
+      Flag("labels", &labels, "name of file containing labels"),
+      Flag("input_width", &input_width, "resize image to this width in pixels"),
+      Flag("input_height", &input_height,
+           "resize image to this height in pixels"),
+      Flag("input_mean", &input_mean, "scale pixel values to this mean"),
+      Flag("input_std", &input_std, "scale pixel values to this std deviation"),
+      Flag("self_test", &self_test, "run a self test"),
+      Flag("root_dir", &root_dir,
+           "interpret image and graph file names relative to this directory"),
+  };
+  string usage = tensorflow::Flags::Usage(argv[0], flag_list);
+  const bool parse_result = tensorflow::Flags::Parse(&argc, argv, flag_list);
+  if (!parse_result) {
+    LOG(ERROR) << usage;
+    return -1;
+  }
+
+  // We need to call this to set up global state for TensorFlow.
+  tensorflow::port::InitMain(argv[0], &argc, &argv);
+  if (argc > 1) {
+    LOG(ERROR) << "Unknown argument " << argv[1] << "\n" << usage;
+    return -1;
+  }
+
+  tensorflow::SavedModelBundle bundle;
+  std::vector<std::string> input_names;
+  std::vector<std::string> output_names;
+
+  // Load the saved model from the provided path.
+  Status load_graph_status = LoadModel(export_dir, &bundle, &input_names, &output_names);
+  if (!load_graph_status.ok()) {
+    LOG(ERROR) << load_graph_status;
+    return -1;
+  }
+
+  auto sig_map = bundle.GetSignatures();
+  auto model_def = sig_map.at("serving_default");
+
+  printf("Model Signature\n");
+  for (auto const& p : sig_map) {
+    printf("key: %s\n", p.first.c_str());
+  }
+
+  printf("Model Input Nodes\n");
+  for (auto const& p : model_def.inputs()) {
+    printf("key: %s value: %s\n", p.first.c_str(), p.second.name().c_str());
+  }
+
+  printf("Model Output Nodes\n");
+  for (auto const& p : model_def.outputs()) {
+    printf("key: %s value: %s\n", p.first.c_str(), p.second.name().c_str());
+  }
+
+  // Note: the "input_2" key is specific to this exported ResNet-50 signature;
+  // check the signature printed above if your model's input key differs.
+  auto input_name = model_def.inputs().at("input_2").name();
+  auto output_name = model_def.outputs().at("output_0").name();
+
+  // Get the image from disk as a float array of numbers, resized and
+  // normalized to the specifications the main graph expects.
+  std::vector<Tensor> resized_tensors;
+  string image_path = tensorflow::io::JoinPath(root_dir, image);
+  Status read_tensor_status =
+      ReadTensorFromImageFile(image_path, input_height, input_width, input_mean,
+                              input_std, &resized_tensors);
+  if (!read_tensor_status.ok()) {
+    LOG(ERROR) << read_tensor_status;
+    return -1;
+  }
+  const Tensor& resized_tensor = resized_tensors[0];
+
+  // Actually run the image through the model.
+  std::vector<Tensor> outputs;
+
+  // Feed the input tensor and fetch the output tensor by name.
+  tensorflow::Status status;
+  status = bundle.session->Run({{input_name, resized_tensor}},
+                               {output_name}, {}, &outputs);
+  if (!status.ok()) {
+    std::cerr << "Inference failed: " << status;
+    return -1;
+  }
+
+  // Do something interesting with the results we've generated.
+  Status print_status = PrintTopLabels(outputs, labels);
+  if (!print_status.ok()) {
+    LOG(ERROR) << "Running print failed: " << print_status;
+    return -1;
+  }
+
+  return 0;
+}
diff --git a/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh b/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh
new file mode 100755
index 000000000..2d4604aa3
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/tftrt-build.sh
@@ -0,0 +1,13 @@
+# TODO: programmatically determine the Python and TF API versions
+PYVER=3.8 # TODO: get this by parsing `python --version`
+TFAPI=2   # TODO: get this by parsing tf.__version__
+
+/opt/tensorflow/nvbuild.sh --configonly --python$PYVER --v$TFAPI
+
+BUILD_OPTS="$(cat /opt/tensorflow/nvbuildopts)"
+if [[ "$TFAPI" == "2" ]]; then
+  BUILD_OPTS="--config=v2 $BUILD_OPTS"
+fi
+
+cd tensorflow-source
+bazel build $BUILD_OPTS tensorflow/examples/image_classification/...
diff --git a/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb b/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb
new file mode 100755
index 000000000..4c3eb5f28
--- /dev/null
+++ b/tftrt/examples-cpp/image_classification/saved-model/tftrt-conversion.ipynb
@@ -0,0 +1,699 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {},
+    "colab_type": "code",
+    "id": "dR1W9kv7IPhE"
+   },
+   "outputs": [],
+   "source": [
+    "# Copyright 2021 NVIDIA Corporation. All Rights Reserved.\n",
+    "\n",
+    "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+    "# you may not use this file except in compliance with the License.\n",
+    "# You may obtain a copy of the License at\n",
+    "\n",
+    "# http://www.apache.org/licenses/LICENSE-2.0\n",
+    "\n",
+    "# Unless required by applicable law or agreed to in writing, software\n",
+    "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+    "# See the License for the specific language governing permissions and\n",
+    "# limitations under the License.\n",
+    "# =============================================================================="
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text",
+    "id": "Yb3TdMZAkVNq"
+   },
+   "source": [
+    "\n",
+    "\n",
+    "# TensorFlow C++ Inference with TF-TRT Models\n",
+    "\n",
+    "\n",
+    "## Introduction\n",
+    "In this notebook, we will download a pretrained Keras ResNet-50 model, optimize it with TF-TRT, export it as a frozen SavedModel, and then load and run inference with the TensorFlow C++ API.\n",
+    "\n",
+    "First, we download the ImageNet labels."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir data\n", + "!curl -L \"https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz\" | tar -C ./data -xz\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "8Fg4x4aomCY4", + "scrolled": true + }, + "outputs": [], + "source": [ + "!nvidia-smi" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "LG4IBNn-2PWY" + }, + "source": [ + "### Install Dependencies" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "!pip install pillow matplotlib" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "colab_type": "code", + "id": "v0mfnfqg3ned", + "outputId": "11c043a0-b8e5-49e2-f907-5f1372c92a68", + "scrolled": true + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "print(\"Tensorflow version: \", tf.version.VERSION)\n", + "\n", + "# check TensorRT version\n", + "print(\"TensorRT version: \")\n", + "!dpkg -l | grep nvinfer" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "9U8b2394CZRu" + }, + "source": [ + "An available TensorRT installation looks like:\n", + "\n", + "```\n", + "TensorRT version: \n", + "ii libnvinfer8 8.2.2-1+cuda11.4 amd64 TensorRT runtime libraries\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "nWYufTjPCMgW" + }, + "source": [ + "### Importing required libraries" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "Yyzwxjlm37jx" + }, + "outputs": [], + "source": [ + "from __future__ import absolute_import, division, print_function, unicode_literals\n", + "import os\n", + "import time\n", + "\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "\n", + "import tensorflow as tf\n", + "from tensorflow import keras\n", + "from tensorflow.python.compiler.tensorrt import trt_convert as trt\n", + "from tensorflow.python.saved_model import tag_constants\n", + "from tensorflow.keras.applications.resnet50 import ResNet50\n", + "from tensorflow.keras.preprocessing import image\n", + "from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "v-R2iN4akVOi" + }, + "source": [ + "## Data\n", + "We download several random images for testing from the Internet." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "tVJ2-8rokVOl", + "scrolled": true + }, + "outputs": [], + "source": [ + "!mkdir ./data\n", + "!wget -O ./data/img0.JPG \"https://d17fnq9dkz9hgj.cloudfront.net/breed-uploads/2018/08/siberian-husky-detail.jpg?bust=1535566590&width=630\"\n", + "!wget -O ./data/img1.JPG \"https://www.hakaimagazine.com/wp-content/uploads/header-gulf-birds.jpg\"\n", + "!wget -O ./data/img2.JPG \"https://www.artis.nl/media/filer_public_thumbnails/filer_public/00/f1/00f1b6db-fbed-4fef-9ab0-84e944ff11f8/chimpansee_amber_r_1920x1080.jpg__1920x1080_q85_subject_location-923%2C365_subsampling-2.jpg\"\n", + "!wget -O ./data/img3.JPG \"https://www.familyhandyman.com/wp-content/uploads/2018/09/How-to-Avoid-Snakes-Slithering-Up-Your-Toilet-shutterstock_780480850.jpg\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 269 + }, + "colab_type": "code", + "id": "F_9n-AR1kVOv", + "outputId": "e0ead6dc-e761-404e-a030-f6d3057a57da" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.preprocessing import image\n", + "\n", + "fig, axes = plt.subplots(nrows=2, ncols=2)\n", + "\n", + "for i in range(4):\n", + " img_path = './data/img%d.JPG'%i\n", + " img = image.load_img(img_path, target_size=(224, 224), interpolation='bilinear')\n", + " plt.subplot(2,2,i+1)\n", + " plt.imshow(img);\n", + " plt.axis('off');" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "xeV4r2YTkVO1" + }, + "source": [ + "## Model\n", + "\n", + "We next download and test a ResNet-50 pre-trained model from the Keras model zoo." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 73 + }, + "colab_type": "code", + "id": "WwRBOikEkVO3", + "outputId": "2d63bc46-8bac-492f-b519-9ae5f19176bc" + }, + "outputs": [], + "source": [ + "model = ResNet50(weights='imagenet')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 410 + }, + "colab_type": "code", + "id": "lFKQPoLO_ikd", + "outputId": "c0b93de8-c94b-4977-992e-c780e12a3d52" + }, + "outputs": [], + "source": [ + "for i in range(4):\n", + " img_path = './data/img%d.JPG'%i\n", + " img = image.load_img(img_path, target_size=(224, 224),interpolation='bilinear')\n", + " x = image.img_to_array(img)\n", + " x = np.expand_dims(x, axis=0)\n", + " x = preprocess_input(x)\n", + "\n", + " preds = model.predict(x)\n", + " # decode the results into a list of tuples (class, description, probability)\n", + " # (one such list for each sample in the batch)\n", + " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n", + "\n", + " plt.subplot(2,2,i+1)\n", + " plt.imshow(img);\n", + " plt.axis('off');\n", + " plt.title(decode_predictions(preds, top=3)[0][0][1])\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "XrL3FEcdkVPA" + }, + "source": [ + "TF-TRT takes input as a TensorFlow saved model, therefore, we re-export the Keras model as a TF saved model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 110 + }, + "colab_type": "code", + "id": "WxlUF3rlkVPH", + "outputId": "9f3864e7-f211-4c06-d2d2-585c1a477e34" + }, + "outputs": [], + "source": [ + "# Save the entire model as a SavedModel.\n", + "model.save('resnet50_saved_model') " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 453 + }, + "colab_type": "code", + "id": "RBu2RKs6kVPP", + "outputId": "8e063261-7efb-47fd-fa6c-1bb5076d418c" + }, + "outputs": [], + "source": [ + "!saved_model_cli show --all --dir resnet50_saved_model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "qBQwBvlNm-J8" + }, + "source": [ + "### Inference with native TF2.0 saved model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "8zLN0GMCkVPe" + }, + "outputs": [], + "source": [ + "model = tf.keras.models.load_model('resnet50_saved_model')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 219 + }, + "colab_type": "code", + "id": "Fbj-UEOxkVPs", + "outputId": "3a2b34f9-8034-48cb-b3fe-477f09966025" + }, + "outputs": [], + "source": [ + "img_path = './data/img0.JPG' # Siberian_husky\n", + "img = image.load_img(img_path, target_size=(224, 224))\n", + "x = image.img_to_array(img)\n", + "x = np.expand_dims(x, axis=0)\n", + "x = preprocess_input(x)\n", + "\n", + "preds = model.predict(x)\n", + "# decode the results into a list of tuples (class, description, probability)\n", + "# (one such list for each sample in the batch)\n", + "print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n", + "plt.subplot(2,2,1)\n", + "plt.imshow(img);\n", + "plt.axis('off');\n", + "plt.title(decode_predictions(preds, top=3)[0][0][1])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "colab_type": "code", + "id": "CGc-dC6DvwRP", + "outputId": "e0a22e05-f4fe-47b6-93e8-2b806bf7098a" + }, + "outputs": [], + "source": [ + "batch_size = 1\n", + "batched_input = np.zeros((batch_size, 224, 224, 3), dtype=np.float32)\n", + "\n", + "for i in range(batch_size):\n", + " img_path = './data/img%d.JPG' % (i % 4)\n", + " img = image.load_img(img_path, target_size=(224, 224))\n", + " x = image.img_to_array(img)\n", + " x = np.expand_dims(x, axis=0)\n", + " x = preprocess_input(x)\n", + " batched_input[i, :] = x\n", + "batched_input = tf.constant(batched_input)\n", + "print('batched_input shape: ', batched_input.shape)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "rFBV6hQR7N3z" + }, + "outputs": [], + "source": [ + "# Benchmarking throughput\n", + "N_warmup_run = 50\n", + "N_run = 1000\n", + "elapsed_time = []\n", + "\n", + "for i in range(N_warmup_run):\n", + " preds = model.predict(batched_input)\n", + "\n", + "for i in range(N_run):\n", + " start_time = time.time()\n", + " preds = model.predict(batched_input)\n", + " end_time = time.time()\n", + " elapsed_time = np.append(elapsed_time, end_time - start_time)\n", + " if i % 50 == 0:\n", + " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n", + "\n", + 
"print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "vC_RN0BAkVPy" + }, + "source": [ + "### TF-TRT FP32 model\n", + "\n", + "We next convert the TF native FP32 model to a TF-TRT FP32 model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 126 + }, + "colab_type": "code", + "id": "0eLImSJ-kVPz", + "outputId": "e2c353c7-8e4b-49aa-ab97-f4d82797d4d8" + }, + "outputs": [], + "source": [ + "print('Converting to TF-TRT FP32...')\n", + "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=trt.TrtPrecisionMode.FP32,\n", + " max_workspace_size_bytes=8000000000)\n", + "\n", + "converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',\n", + " conversion_params=conversion_params)\n", + "converter.convert()\n", + "converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP32')\n", + "print('Done Converting to TF-TRT FP32')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 453 + }, + "colab_type": "code", + "id": "dlue_3npkVQC", + "outputId": "4dd6a366-fe9a-43c8-aad0-dd357bba41bb" + }, + "outputs": [], + "source": [ + "!saved_model_cli show --all --dir resnet50_saved_model_TFTRT_FP32" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "Vd2DoGUp8ivj" + }, + "source": [ + "Next, we load and test the TF-TRT FP32 model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2\n", + "import tensorflow as tf\n", + "\n", + "model = tf.saved_model.load(\"resnet50_saved_model_TFTRT_FP32\", tags=[tag_constants.SERVING]).signatures['serving_default']" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "rf97K_rxvwRm" + }, + "outputs": [], + "source": [ + "def predict_tftrt(input_saved_model):\n", + " \"\"\"Runs prediction on a single image and shows the result.\n", + " input_saved_model (string): Name of the input model stored in the current dir\n", + " \"\"\"\n", + " img_path = './data/img0.JPG' # Siberian_husky\n", + " img = image.load_img(img_path, target_size=(224, 224))\n", + " x = image.img_to_array(img)\n", + " x = np.expand_dims(x, axis=0)\n", + " x = preprocess_input(x)\n", + " x = tf.constant(x)\n", + " \n", + " saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n", + " signature_keys = list(saved_model_loaded.signatures.keys())\n", + " print(signature_keys)\n", + "\n", + " infer = saved_model_loaded.signatures['serving_default']\n", + " print(infer.structured_outputs)\n", + "\n", + " labeling = infer(x)\n", + " preds = labeling['predictions'].numpy()\n", + " print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))\n", + " plt.subplot(2,2,1)\n", + " plt.imshow(img);\n", + " plt.axis('off');\n", + " plt.title(decode_predictions(preds, top=3)[0][0][1])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 238 + }, + "colab_type": "code", + "id": "pRK0pRE-snvb", + "outputId": "1f7ab6c1-dbfa-4e3e-a21d-df9975c70455" + }, + 
"outputs": [], + "source": [ + "predict_tftrt('resnet50_saved_model_TFTRT_FP32')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "z9b5j6jMvwRt" + }, + "outputs": [], + "source": [ + "def benchmark_tftrt(input_saved_model):\n", + " saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])\n", + " infer = saved_model_loaded.signatures['serving_default']\n", + "\n", + " N_warmup_run = 50\n", + " N_run = 1000\n", + " elapsed_time = []\n", + "\n", + " for i in range(N_warmup_run):\n", + " labeling = infer(batched_input)\n", + "\n", + " for i in range(N_run):\n", + " start_time = time.time()\n", + " labeling = infer(batched_input)\n", + " end_time = time.time()\n", + " elapsed_time = np.append(elapsed_time, end_time - start_time)\n", + " if i % 50 == 0:\n", + " print('Step {}: {:4.1f}ms'.format(i, (elapsed_time[-50:].mean()) * 1000))\n", + "\n", + " print('Throughput: {:.0f} images/s'.format(N_run * batch_size / elapsed_time.sum()))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "ai6bxNcNszHc" + }, + "outputs": [], + "source": [ + "benchmark_tftrt('resnet50_saved_model_TFTRT_FP32')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Prepare model for C++ inference\n", + "\n", + "We can see that the TF-TRT FP32 model provide great speedup over the native Keras model. Now let's prepare this model for C++ inference. We will freeze this graph and write it as a frozen saved model to disk." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "2mM9D3BTEzQS" + }, + "outputs": [], + "source": [ + "from tensorflow.python.saved_model import signature_constants\n", + "from tensorflow.python.saved_model import tag_constants\n", + "from tensorflow.python.framework import convert_to_constants\n", + "\n", + "def get_func_from_saved_model(saved_model_dir):\n", + " saved_model_loaded = tf.saved_model.load(\n", + " saved_model_dir, tags=[tag_constants.SERVING])\n", + " graph_func = saved_model_loaded.signatures[\n", + " signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]\n", + " return graph_func, saved_model_loaded\n", + "\n", + "func, loaded_model = get_func_from_saved_model('resnet50_saved_model_TFTRT_FP32')\n", + "\n", + "# Create frozen func\n", + "frozen_func = convert_to_constants.convert_variables_to_constants_v2(func)\n", + "module = tf.Module()\n", + "module.myfunc = frozen_func\n", + "tf.saved_model.save(module,'resnet50_saved_model_TFTRT_FP32_frozen', signatures=frozen_func)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "I13snJ9VkVQh" + }, + "source": [ + "### What's next\n", + "Refer back to the [Readme](README.md) to load the TF-TRT frozen saved model for inference with the TensorFlow C++ API." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "include_colab_link": true, + "machine_shape": "hm", + "name": "Colab-TF20-TF-TRT-inference-from-Keras-saved-model.ipynb", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.10" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}