Update - Object Detection TFLite Task Library #3

Open · wants to merge 10 commits into base: main · Changes from 9 commits
Binary file added android.tflite
Binary file not shown.
86 changes: 38 additions & 48 deletions object_detection/README.md
@@ -1,63 +1,50 @@
# TensorFlow Lite Python object detection example with Raspberry Pi

This example uses [TensorFlow Lite](https://tensorflow.org/lite) with Python
on a Raspberry Pi to perform real-time object detection using images
streamed from the Pi Camera. It draws a bounding box around each detected
object in the camera preview (when the object score is above a given threshold).
This example uses [TensorFlow Lite](https://tensorflow.org/lite) with Python on
a Raspberry Pi to perform real-time object detection using images streamed from
the Pi Camera. It draws a bounding box around each detected object in the camera
preview (when the object score is above a given threshold).

At the end of this page, there are extra steps to accelerate the example using
the Coral USB Accelerator to increase inference speed.


## Set up your hardware

Before you begin, you need to [set up your Raspberry Pi](
https://projects.raspberrypi.org/en/projects/raspberry-pi-setting-up) with
Raspberry Pi OS (preferably updated to Buster).

You also need to [connect and configure the Pi Camera](
https://www.raspberrypi.org/documentation/configuration/camera.md) if you use the
Pi Camera. This code also works with USB camera connect to the Raspberry Pi.

And to see the results from the camera, you need a monitor connected
to the Raspberry Pi. It's okay if you're using SSH to access the Pi shell
(you don't need to use a keyboard connected to the Pi)—you only need a monitor
attached to the Pi to see the camera stream.


## Install the TensorFlow Lite runtime

In this project, all you need from the TensorFlow Lite API is the `Interpreter`
class. So instead of installing the large `tensorflow` package, we're using the
much smaller `tflite_runtime` package.
Before you begin, you need to
[set up your Raspberry Pi](https://projects.raspberrypi.org/en/projects/raspberry-pi-setting-up)
with Raspberry Pi OS (preferably updated to Buster).

To install this on your Raspberry Pi, follow the instructions in the
[Python quickstart](https://www.tensorflow.org/lite/guide/python#install_tensorflow_lite_for_python).
You also need to
[connect and configure the Pi Camera](https://www.raspberrypi.org/documentation/configuration/camera.md)
if you use the Pi Camera. This code also works with a USB camera connected to the
Raspberry Pi.

You can install the TFLite runtime using this script.

```
sh setup.sh
```
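
To quickly verify that the runtime is working, you can construct an
`Interpreter` directly. This is a minimal sketch, assuming the `tflite_runtime`
package is installed and that a model file such as `efficientdet_lite0.tflite`
(the default model name used later in this example) is in the current
directory.

```
# Minimal sanity check for the TFLite runtime install. Assumes the
# tflite_runtime package is installed and efficientdet_lite0.tflite exists.
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path='efficientdet_lite0.tflite')
interpreter.allocate_tensors()
print(interpreter.get_input_details()[0]['shape'])  # input tensor shape
```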
And to see the results from the camera, you need a monitor connected to the
Raspberry Pi. It's okay if you're using SSH to access the Pi shell (you don't
need to use a keyboard connected to the Pi)—you only need a monitor attached to
the Pi to see the camera stream.

## Download the example files

First, clone this Git repo onto your Raspberry Pi like this:

```
git clone https://github.com/khanhlvg/tflite_raspberry_pi --depth 1
git clone https://github.com/tensorflow/examples --depth 1
```

Then use our script to install a couple Python packages, and
download the MobileNet model and labels file:
Then use our script to install a couple of Python packages and download the
EfficientDet-Lite model:

```
cd object_detection
cd examples/lite/examples/object_detection/raspberry_pi

# The script installs the required dependencies and downloads the TFLite models.
sh setup.sh
```

In this project, all you need from the TFLite Task Library is the
`tflite_support` package, which the setup script installs automatically.

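As a rough illustration of what the Task Library provides, the sketch below
creates a detector from the downloaded model and runs it on a single image. It
mirrors the API calls used in `detect.py` later in this pull request; the
`test.jpg` path is only a hypothetical placeholder.

```
# A minimal sketch of the tflite_support Task Library API, mirroring detect.py.
# 'test.jpg' is a hypothetical image path used only for illustration.
from tflite_support.task import core
from tflite_support.task import processor
from tflite_support.task import vision

base_options = core.BaseOptions(file_name='efficientdet_lite0.tflite')
detection_options = processor.DetectionOptions(max_results=3,
                                               score_threshold=0.3)
options = vision.ObjectDetectorOptions(base_options=base_options,
                                       detection_options=detection_options)
detector = vision.ObjectDetector.create_from_options(options)

image = vision.TensorImage.create_from_file('test.jpg')
for detection in detector.detect(image).detections:
  print(detection.bounding_box, detection.classes[0].class_name)
```
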
## Run the example

```
@@ -68,28 +55,31 @@ python3 detect.py \
You should see the camera feed appear on the monitor attached to your Raspberry
Pi. Put some objects in front of the camera, like a coffee mug or keyboard, and
you'll see boxes drawn around those that the model recognizes, including the
label and score for each. It also prints the amount of time it took
to perform each inference in milliseconds at the top-left corner of the screen.
label and score for each. It also shows the number of frames per second (FPS)
at the top-left corner of the screen. Because the pipeline includes steps other
than model inference, such as visualizing the detection results, you can expect
a higher FPS if you run the inference pipeline in headless mode without
visualization.
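
If you want a feel for how averaging over whole loop iterations (capture,
inference, and drawing) shapes the reported FPS, here is a self-contained
sketch of the averaging scheme `detect.py` uses; the sleep call is just a
stand-in for the per-frame work.

```
# Sketch of the FPS averaging in detect.py: the rate covers the whole loop
# body, not just model inference. time.sleep stands in for per-frame work.
import time

fps_avg_frame_count = 10
fps = 0.0
start_time = time.time()

for counter in range(1, 51):  # stand-in for the camera loop
  time.sleep(0.02)            # stand-in for capture + inference + drawing
  if counter % fps_avg_frame_count == 0:
    fps = fps_avg_frame_count / (time.time() - start_time)
    start_time = time.time()
    print('FPS = {:.1f}'.format(fps))
```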

For more information about executing inferences with TensorFlow Lite, read
[TensorFlow Lite inference](https://www.tensorflow.org/lite/guide/inference).


## Speed up model inference (optional)

If you want to significantly speed up the inference time, you can attach a
[Coral USB Accelerator](
https://coral.withgoogle.com/products/accelerator)—a USB accessory that adds
the [Edge TPU ML accelerator](https://coral.withgoogle.com/docs/edgetpu/faq/)
to any Linux-based system.
[Coral USB Accelerator](https://coral.withgoogle.com/products/accelerator)—a USB
accessory that adds the
[Edge TPU ML accelerator](https://coral.withgoogle.com/docs/edgetpu/faq/) to any
Linux-based system.

If you have a Coral USB Accelerator, you can run the sample with it enabled:

1. First, be sure you have completed the [USB Accelerator setup instructions](
https://coral.withgoogle.com/docs/accelerator/get-started/).
1. First, be sure you have completed the
[USB Accelerator setup instructions](https://coral.withgoogle.com/docs/accelerator/get-started/).

2. Run the object detection script using the EdgeTPU TFLite model and enable
the EdgeTPU option.
2. Run the object detection script using the EdgeTPU TFLite model and enable
the EdgeTPU option. Note that the EdgeTPU requires a specific TFLite model
that is different from the one used above.

```
python3 detect.py \
@@ -100,5 +90,5 @@ python3 detect.py \
You should see significantly faster inference speeds.

For more information about creating and running TensorFlow Lite models with
Coral devices, read [TensorFlow models on the Edge TPU](
https://coral.withgoogle.com/docs/edgetpu/models-intro/).
Coral devices, read
[TensorFlow models on the Edge TPU](https://coral.withgoogle.com/docs/edgetpu/models-intro/).
164 changes: 101 additions & 63 deletions object_detection/detect.py
@@ -11,20 +11,41 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Main script to run pose classification and pose estimation."""
"""Main script to run the object detection routine."""
import argparse
import sys
import time

from typing import List, NamedTuple
import cv2

from tflite_support.task import vision
from tflite_support.task import core
from tflite_support.task import processor

import utils

from object_detector import ObjectDetector
from object_detector import ObjectDetectorOptions
class Rect(NamedTuple):

Owner commented:

You shouldn't need to redefine these types: Rect, Category, Detection. There are equivalent data types in the TFLite Task Library. Instead, you should take the DetectionResult from the Task Library and update the utils.py file to receive a DetectionResult instance from the Task Library and show it, instead of List[Detection] like it's now.

"""A rectangle in 2D space."""
left: float
top: float
right: float
bottom: float


class Category(NamedTuple):
"""A result of a classification task."""
label: str
score: float


class Detection(NamedTuple):
"""A detected object as the result of an ObjectDetector."""
bounding_box: Rect
categories: List[Category]


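As a rough sketch of the reviewer's suggestion above, `utils.visualize` could
take the Task Library's `DetectionResult` directly instead of a
`List[Detection]`; the function below is hypothetical and only illustrates the
shape of that change, reusing the field names already used in this file.

```
# Hypothetical sketch of the reviewer's suggestion: draw boxes straight from
# the Task Library DetectionResult instead of converting to local Detection
# objects. Field names follow the ones used elsewhere in this file.
import cv2
from tflite_support.task import processor


def visualize(image, detection_result: processor.DetectionResult):
  """Draws bounding boxes and labels from a Task Library DetectionResult."""
  for detection in detection_result.detections:
    box = detection.bounding_box
    start_point = (box.origin_x, box.origin_y)
    end_point = (box.origin_x + box.width, box.origin_y + box.height)
    cv2.rectangle(image, start_point, end_point, (0, 0, 255), 3)

    category = detection.classes[0]
    label = '{} ({:.2f})'.format(category.class_name, category.score)
    cv2.putText(image, label, (box.origin_x, box.origin_y - 10),
                cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 255), 1)
  return image
```
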
def run(model: str, camera_id: int, width: int, height: int,
max_results: int, score_threshold: float,
num_threads: int, enable_edgetpu: bool) -> None:
"""Continuously run inference on images acquired from the camera.

@@ -33,19 +54,20 @@ def run(model: str, camera_id: int, width: int, height: int,
camera_id: The camera id to be passed to OpenCV.
width: The width of the frame captured from the camera.
height: The height of the frame captured from the camera.
max_results: Maximum number of classification results to display.
score_threshold: The score threshold of classification results.
num_threads: The number of CPU threads to run the model.
enable_edgetpu: True/False whether the model is an EdgeTPU model.
"""

# Variables to calculate FPS
counter, fps = 0, 0
start_time_fps = time.time()
start_time = time.time()

# Start capturing video input from the camera
cap = cv2.VideoCapture(camera_id)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
cap.set(cv2.CAP_PROP_BUFFERSIZE, 2)

# Visualization parameters
row_size = 20 # pixels
@@ -56,106 +78,122 @@
fps_avg_frame_count = 10

# Initialize the object detection model
options = ObjectDetectorOptions(num_threads=num_threads,
score_threshold=0.3,
max_results=3,
enable_edgetpu=enable_edgetpu)
detector = ObjectDetector(model_path=model, options=options)
if enable_edgetpu:
base_options = core.BaseOptions(file_name=model, use_coral=True)
else:
base_options = core.BaseOptions(file_name=model)

detection_options = processor.DetectionOptions(max_results=max_results,
score_threshold=score_threshold)
options = vision.ObjectDetectorOptions(base_options=base_options, detection_options=detection_options)

detector = vision.ObjectDetector.create_from_options(options)

# Continuously capture images from the camera and run inference
while cap.isOpened():
start_time_frame = time.time()

start_time = time.time()
success, image = cap.read()
if not success:
sys.exit(
'ERROR: Unable to read from webcam. Please verify your webcam settings.'
'ERROR: Unable to read from webcam. Please verify your webcam settings.'
)

counter += 1
image = cv2.flip(image, 1)
elapsed_time = int((time.time() - start_time) * 1000)
print('OpenCV read image time: {0}ms'.format(elapsed_time))

# Run object detection estimation using the model.
start_time = time.time()
detections = detector.detect(image)
elapsed_time = int((time.time() - start_time) * 1000)
print('Inference time: {0}ms'.format(elapsed_time))
rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
tensor_image = vision.TensorImage.create_from_array(rgb_image)
detection_results = detector.detect(tensor_image)

detections = []
# Parse the model output into a list of Detection entities.
for detection in detection_results.detections:
bounding_box = Rect(
top=detection.bounding_box.origin_y,
left=detection.bounding_box.origin_x,
bottom=detection.bounding_box.origin_y + detection.bounding_box.height,
right=detection.bounding_box.origin_x + detection.bounding_box.width)
category = Category(
score=detection.classes[0].score,
label=detection.classes[0].class_name,
)
result = Detection(bounding_box=bounding_box, categories=[category])
detections.append(result)

start_time = time.time()
# Draw keypoints and edges on input image
image = utils.visualize(image, detections)

# Calculate the FPS
if counter % fps_avg_frame_count == 0:
end_time_fps = time.time()
fps = fps_avg_frame_count / (end_time_fps - start_time_fps)
start_time_fps = time.time()
end_time = time.time()
fps = fps_avg_frame_count / (end_time - start_time)
start_time = time.time()

# Show the FPS
fps_text = 'FPS = {:.1f}'.format(fps)
text_location = (left_margin, row_size)
cv2.putText(image, fps_text, text_location, cv2.FONT_HERSHEY_PLAIN,
font_size, text_color, font_thickness)

cv2.imshow('object_detector', image)
elapsed_time = int((time.time() - start_time) * 1000)
print('Visualization time: {0}ms'.format(elapsed_time))

start_time = time.time()
# Stop the program if the ESC key is pressed.
if cv2.waitKey(1) == 27:
break
elapsed_time = int((time.time() - start_time) * 1000)
print('Wait key time: {0}ms'.format(elapsed_time))

elapsed_time = int((time.time() - start_time_frame) * 1000)
print('Time per frame (end-to-end): {0}ms'.format(elapsed_time))
print()
cv2.imshow('object_detector', image)

cap.release()
cv2.destroyAllWindows()


def main():
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument(
'--model',
help='Path of the object detection model.',
required=False,
default='efficientdet_lite0.tflite')
parser.add_argument(
'--cameraId', help='Id of camera.', required=False, type=int, default=0)
parser.add_argument(
'--model',
help='Path of the object detection model.',
required=False,
default='efficientdet_lite0.tflite')
'--frameWidth',
help='Width of frame to capture from camera.',
required=False,
type=int,
default=640)
parser.add_argument(
'--cameraId', help='Id of camera.', required=False, type=int, default=0)
'--frameHeight',
help='Height of frame to capture from camera.',
required=False,
type=int,
default=480)
parser.add_argument(
'--frameWidth',
help='Width of frame to capture from camera.',
required=False,
type=int,
default=640)
'--maxResults',
help='Maximum number of results to show.',
required=False,
type=int,
default=5)
parser.add_argument(
'--frameHeight',
help='Height of frame to capture from camera.',
required=False,
type=int,
default=480)
'--scoreThreshold',
help='The score threshold of classification results.',
required=False,
type=float,
default=0.0)
parser.add_argument(
'--numThreads',
help='Number of CPU threads to run the model.',
required=False,
type=int,
default=4)
'--numThreads',
help='Number of CPU threads to run the model.',
required=False,
type=int,
default=4)
parser.add_argument(
'--enableEdgeTPU',
help='Whether to run the model on EdgeTPU.',
action="store_true",
required=False,
default=False)
'--enableEdgeTPU',
help='Whether to run the model on EdgeTPU.',
action='store_true',
required=False,
default=False)
args = parser.parse_args()

run(args.model, int(args.cameraId), args.frameWidth, args.frameHeight,
args.maxResults, args.scoreThreshold,
int(args.numThreads), bool(args.enableEdgeTPU))

