Copy edit the Custom Remote Inference notebook (apache#29460)
* Copy edit the Custom Remote Inference notebook

* Update PR per feedback
rszper authored Nov 17, 2023
1 parent e299a4e commit c1b83d2
Showing 1 changed file with 24 additions and 28 deletions.
52 changes: 24 additions & 28 deletions examples/notebooks/beam-ml/custom_remote_inference.ipynb
@@ -53,16 +53,15 @@
"id": "GNbarEZsalS2"
},
"source": [
"This example demonstrates how to implement a custom inference call in Apache Beam using the Google Cloud Vision API.\n",
"This example demonstrates how to implement a custom inference call in Apache Beam by using the Google Cloud Vision API.\n",
"\n",
"The prefered way to run inference in Apache Beam is by using the [RunInference API](https://beam.apache.org/documentation/sdks/python-machine-learning/).\n",
"The RunInference API enables you to run models as part of your pipeline in a way that is optimized for machine learning inference.\n",
"To reduce the number of steps that you need to take, RunInference supports features like batching. For more infomation about the RunInference API, review the [RunInference API](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.html#apache_beam.ml.inference.RunInference),\n",
"which demonstrates how to implement model inference in PyTorch, scikit-learn, and TensorFlow.\n",
"To reduce the number of steps in your pipeline, RunInference supports features like batching. For more infomation about the RunInference API, review the [RunInference API](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.html#apache_beam.ml.inference.RunInference).\n",
"\n",
"There is [VertexAIModelHandlerJson](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/ml/inference/vertex_ai_inference.py) which is used to make remote inference calls to VertexAI. In this notebook, we will make custom `ModelHandler` to do remote inference calls using CloudVision API.\n",
"This notebook creates a custom model handler to make remote inference calls by using the Cloud Vision API. To make remote inference calls to Vertex AI, use the [Vertex AI model handler JSON](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/ml/inference/vertex_ai_inference.py).\n",
"\n",
"**Note:** all images are licensed CC-BY, creators are listed in the [LICENSE.txt](https://storage.googleapis.com/apache-beam-samples/image_captioning/LICENSE.txt) file."
"**Note:** All images are licensed CC-BY. Creators are listed in the [LICENSE.txt](https://storage.googleapis.com/apache-beam-samples/image_captioning/LICENSE.txt) file."
]
},
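For orientation, here is a minimal sketch of what a RunInference pipeline looks like with one of the built-in model handlers. The scikit-learn handler and the model path are illustrative choices, not code from this notebook:

```python
import apache_beam as beam
import numpy as np

from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Hypothetical model location; any pickled scikit-learn model would work.
model_handler = SklearnModelHandlerNumpy(model_uri='gs://your-bucket/model.pkl')

with beam.Pipeline() as pipeline:
  _ = (
      pipeline
      | 'CreateExamples' >> beam.Create([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
      | 'RunInference' >> RunInference(model_handler)  # batches inputs for the model
      | 'PrintResults' >> beam.Map(print))
```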
{
@@ -92,18 +91,18 @@
"id": "4io1vzkzF683"
},
"source": [
"We want to run the Google Cloud Vision API on a large set of images, and Apache Beam is the ideal tool to handle this workflow.\n",
"To run the Google Cloud Vision API on a large set of images, Apache Beam is the ideal tool to handle the workflow.\n",
"This example demonstates how to retrieve image labels with this API on a small set of images.\n",
"\n",
"The example follows these steps to implement this workflow:\n",
"The example follows these steps:\n",
"* Read the images.\n",
"* Send the images to an external API to run inference using `RunInference` PTransform.\n",
"* Send the images to an external API to run inference by using the `RunInference PTransform`.\n",
"* Postprocess the results of your API.\n",
"\n",
"**Caution:** Be aware of API quotas and the heavy load you might incur on your external API. Verify that your pipeline and API are configured correctly for your use case.\n",
"\n",
"To optimize the calls to the external API, limit the parallel calls to the external remote API by configuring [PipelineOptions](https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options).\n",
"In Apache Beam, different runners provide options to handle the parallelism, for example:\n",
"To optimize the calls to the external API, limit the parallel calls to the external remote API by [configuring pipeline options](https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options).\n",
"In Apache Beam, each runner provides options to handle the parallelism. The following list includes two examples:\n",
"* With the [Direct Runner](https://beam.apache.org/documentation/runners/direct/), use the `direct_num_workers` pipeline option.\n",
"* With the [Google Cloud Dataflow Runner](https://beam.apache.org/documentation/runners/dataflow/), use the `max_num_workers` pipeline option.\n",
"\n",
@@ -116,9 +115,7 @@
"id": "FAawWOaiIYaS"
},
"source": [
"## Before you begin\n",
"\n",
"This section provides installation steps."
"## Before you begin"
]
},
{
@@ -127,7 +124,7 @@
"id": "XhpKOxINrIqz"
},
"source": [
"First, download and install the dependencies."
"Download and install the dependencies."
]
},
{
Expand Down Expand Up @@ -188,7 +185,7 @@
"source": [
"## Run remote inference on Cloud Vision API\n",
"\n",
"This section demonstates the steps to run remote inference on the Cloud Vision API.\n",
"This section shows how to run remote inference on the Cloud Vision API.\n",
"\n",
"Download and install Apache Beam and the required modules."
]
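The collapsed cells contain the actual install commands; as an assumption based on the APIs used later in the notebook, they likely resemble this notebook cell:

```python
# Run in a notebook cell; the [gcp] extra pulls in the Google Cloud dependencies.
!pip install apache-beam[gcp] google-cloud-vision
```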
@@ -254,16 +251,16 @@
"id": "HLy7VKJhLrmT"
},
"source": [
"### Create a Custom ModelHandler\n",
"### Create a custom model handler\n",
"\n",
"In order to implement remote inference, create a custom model handler. The `run_inference` method is the most interesting part. In this function, we implement the model call and return its results.\n",
"In order to implement remote inference, create a custom model handler. Use the `run_inference` method to implement the model call and to return its results.\n",
"\n",
"When running remote inference, prepare to encounter, identify, and handle failure as gracefully as possible. We recommend using the following techniques:\n",
"When you run remote inference, prepare to encounter, identify, and handle failure as gracefully as possible. We recommend using the following techniques:\n",
"\n",
"* **Exponential backoff:** Retry failed remote calls with exponentially growing pauses between retries. Using exponential backoff ensures that failures don't lead to an overwhelming number of retries in quick succession.\n",
"\n",
"* **Dead-letter queues:** Route failed inferences to a separate `PCollection` without failing the whole transform. You can continue execution without failing the job (batch jobs' default behavior) or retrying indefinitely (streaming jobs' default behavior).\n",
"You can then run custom pipeline logic on the dead-letter queue (unprocessed messages queue) to log the failure, alert, and push the failed message to temporary storage so that it can eventually be reprocessed."
"* **Dead-letter queues:** Route failed inferences to a separate `PCollection` without failing the whole transform. Continue execution without failing the job (batch jobs' default behavior) or retrying indefinitely (streaming jobs' default behavior).\n",
"You can then run custom pipeline logic on the dead-letter (unprocessed messages) queue to log the failure, send an alert, and push the failed message to temporary storage so that it can eventually be reprocessed."
]
},
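A minimal sketch of the first technique, assuming a hand-rolled helper that wraps the remote call inside `run_inference` (the retry count and delays are illustrative):

```python
import time

def call_with_backoff(remote_call, batch, max_retries=5, initial_delay=1.0):
  """Retry a remote call with exponentially growing pauses between attempts."""
  delay = initial_delay
  for attempt in range(max_retries):
    try:
      return remote_call(batch)
    except Exception:
      if attempt == max_retries - 1:
        raise  # Retries exhausted: surface the failure to the pipeline.
      time.sleep(delay)
      delay *= 2  # Double the pause after each failure.
```

Note that Beam ships a `with_exponential_backoff` decorator in `apache_beam.utils.retry` that can replace a helper like this, and recent releases support `with_exception_handling` on `RunInference`, which routes failing batches to a separate `failed_inferences` output instead of failing the transform.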
{
@@ -276,9 +273,9 @@
"source": [
"class CloudVisionModelHandler(ModelHandler):\n",
" \"\"\"DoFn that accepts a batch of images as bytearray\n",
" and sends that batch to the Cloud vision API for remote inference.\"\"\"\n",
" and sends that batch to the Cloud Vision API for remote inference\"\"\"\n",
" def load_model(self):\n",
" \"\"\"Init the Google Vision API client.\"\"\"\n",
" \"\"\"Initiate the Google Vision API client.\"\"\"\n",
" client = vision.ImageAnnotatorClient()\n",
" return client\n",
"\n",
@@ -308,11 +305,10 @@
"source": [
"### Manage batching\n",
"\n",
"Before we can chain together the pipeline steps, we need to understand batching.\n",
"When running inference with your model, either in Apache Beam or in an external API, you can batch your input to increase the efficiency of the model execution.\n",
"`RunInference` PTransform manages batching in this pipeline with `BatchElements` transform to group elements together and form a batch of the desired size.\n",
"When you run inference with your model, either in Apache Beam or in an external API, batch your input to increase the efficiency of the model execution.\n",
"The `RunInference PTransform` automatically manages batching by using the `BatchElements` transform to dynamically group elements together into batches based on the throughput of the pipeline.\n",
"\n",
"* If you are designing your own API endpoint, make sure that it can handle batches.\n",
"If you are designing your own API endpoint, make sure that it can handle batches.\n",
"\n"
]
},
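If you need the same batching behavior outside of RunInference, you can apply the underlying transform directly. The batch-size bounds below are illustrative, not values from this notebook:

```python
import apache_beam as beam

with beam.Pipeline() as pipeline:
  _ = (
      pipeline
      | 'CreateItems' >> beam.Create(range(100))
      | 'FormBatches' >> beam.BatchElements(min_batch_size=8, max_batch_size=32)
      | 'InspectSizes' >> beam.Map(lambda batch: print(len(batch))))  # each element is now a list
```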
@@ -324,13 +320,13 @@
"source": [
"### Create the pipeline\n",
"\n",
"This section demonstrates how to chain the steps together to do the following:\n",
"This section demonstrates how to chain the pipeline steps together to complete the following tasks:\n",
"\n",
"* Read data.\n",
"\n",
"* Transform the data to fit the model input.\n",
"\n",
"* RunInference with custom CloudVision ModelHandler.\n",
"* Run inference with a custom Cloud Vision model handler.\n",
"\n",
"* Process and display the results."
]
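Assembled end to end, the chained pipeline might look like the following sketch. The file-reading helper, the image path, and the printing step are assumptions for illustration; `CloudVisionModelHandler` is the custom handler defined earlier in the notebook:

```python
import apache_beam as beam

from apache_beam.io.filesystems import FileSystems
from apache_beam.ml.inference.base import RunInference

def read_image_bytes(path):
  """Hypothetical helper: load one image file as raw bytes for the Vision API."""
  with FileSystems.open(path) as f:
    return f.read()

# Illustrative path; the notebook reads images from the apache-beam-samples bucket.
image_paths = ['gs://apache-beam-samples/image_captioning/example_image.jpg']

with beam.Pipeline() as pipeline:
  _ = (
      pipeline
      | 'ReadPaths' >> beam.Create(image_paths)
      | 'LoadImages' >> beam.Map(read_image_bytes)
      | 'RunInference' >> RunInference(CloudVisionModelHandler())
      | 'DisplayResults' >> beam.Map(print))
```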