Update - Object Detection TFLite Task Library #3

Open · wants to merge 10 commits into base: main · Changes from 9 commits
Binary file added android.tflite
Binary file not shown.
86 changes: 38 additions & 48 deletions object_detection/README.md
@@ -1,63 +1,50 @@
# TensorFlow Lite Python object detection example with Raspberry Pi

This example uses [TensorFlow Lite](https://tensorflow.org/lite) with Python
on a Raspberry Pi to perform real-time object detection using images
streamed from the Pi Camera. It draws a bounding box around each detected
object in the camera preview (when the object score is above a given threshold).
This example uses [TensorFlow Lite](https://tensorflow.org/lite) with Python on
a Raspberry Pi to perform real-time object detection using images streamed from
the Pi Camera. It draws a bounding box around each detected object in the camera
preview (when the object score is above a given threshold).

At the end of this page, there are extra steps to accelerate the example using
the Coral USB Accelerator to increase inference speed.


## Set up your hardware

Before you begin, you need to [set up your Raspberry Pi](
https://projects.raspberrypi.org/en/projects/raspberry-pi-setting-up) with
Raspberry Pi OS (preferably updated to Buster).

You also need to [connect and configure the Pi Camera](
https://www.raspberrypi.org/documentation/configuration/camera.md) if you use the
Pi Camera. This code also works with USB camera connect to the Raspberry Pi.

And to see the results from the camera, you need a monitor connected
to the Raspberry Pi. It's okay if you're using SSH to access the Pi shell
(you don't need to use a keyboard connected to the Pi)—you only need a monitor
attached to the Pi to see the camera stream.


## Install the TensorFlow Lite runtime

In this project, all you need from the TensorFlow Lite API is the `Interpreter`
class. So instead of installing the large `tensorflow` package, we're using the
much smaller `tflite_runtime` package.
Before you begin, you need to
[set up your Raspberry Pi](https://projects.raspberrypi.org/en/projects/raspberry-pi-setting-up)
with Raspberry Pi OS (preferably updated to Buster).

To install this on your Raspberry Pi, follow the instructions in the
[Python quickstart](https://www.tensorflow.org/lite/guide/python#install_tensorflow_lite_for_python).
You also need to
[connect and configure the Pi Camera](https://www.raspberrypi.org/documentation/configuration/camera.md)
if you use the Pi Camera. This code also works with a USB camera connected to the
Raspberry Pi.

You can install the TFLite runtime using this script.

```
sh setup.sh
```
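
To quickly verify that the runtime is working, you can construct an
`Interpreter` directly. This is a minimal sketch, assuming the `tflite_runtime`
package is installed and that a model file such as `efficientdet_lite0.tflite`
(the default model name used later in this example) is in the current
directory.

```
# Minimal sanity check for the TFLite runtime install. Assumes the
# tflite_runtime package is installed and efficientdet_lite0.tflite exists.
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path='efficientdet_lite0.tflite')
interpreter.allocate_tensors()
print(interpreter.get_input_details()[0]['shape'])  # input tensor shape
```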
And to see the results from the camera, you need a monitor connected to the
Raspberry Pi. It's okay if you're using SSH to access the Pi shell (you don't
need to use a keyboard connected to the Pi)—you only need a monitor attached to
the Pi to see the camera stream.

## Download the example files

First, clone this Git repo onto your Raspberry Pi like this:

```
git clone https://github.com/khanhlvg/tflite_raspberry_pi --depth 1
git clone https://github.com/tensorflow/examples --depth 1
```

Then use our script to install a couple Python packages, and
download the MobileNet model and labels file:
Then use our script to install a couple of Python packages and download the
EfficientDet-Lite model:

```
cd object_detection
cd examples/lite/examples/object_detection/raspberry_pi

# The script installs the required dependencies and downloads the TFLite models.
sh setup.sh
```

In this project, all you need from the TFLite Task Library is the
`tflite_support` package, which the setup script installs automatically.

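As a rough illustration of what the Task Library provides, the sketch below
creates a detector from the downloaded model and runs it on a single image. It
mirrors the API calls used in `detect.py` later in this pull request; the
`test.jpg` path is only a hypothetical placeholder.

```
# A minimal sketch of the tflite_support Task Library API, mirroring detect.py.
# 'test.jpg' is a hypothetical image path used only for illustration.
from tflite_support.task import core
from tflite_support.task import processor
from tflite_support.task import vision

base_options = core.BaseOptions(file_name='efficientdet_lite0.tflite')
detection_options = processor.DetectionOptions(max_results=3,
                                               score_threshold=0.3)
options = vision.ObjectDetectorOptions(base_options=base_options,
                                       detection_options=detection_options)
detector = vision.ObjectDetector.create_from_options(options)

image = vision.TensorImage.create_from_file('test.jpg')
for detection in detector.detect(image).detections:
  print(detection.bounding_box, detection.classes[0].class_name)
```
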
## Run the example

```
@@ -68,28 +55,31 @@ python3 detect.py \
You should see the camera feed appear on the monitor attached to your Raspberry
Pi. Put some objects in front of the camera, like a coffee mug or keyboard, and
you'll see boxes drawn around those that the model recognizes, including the
label and score for each. It also prints the amount of time it took
to perform each inference in milliseconds at the top-left corner of the screen.
label and score for each. It also shows the number of frames per second (FPS)
at the top-left corner of the screen. Because the pipeline includes steps other
than model inference, such as visualizing the detection results, you can expect
a higher FPS if you run the inference pipeline in headless mode without
visualization.
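
If you want a feel for how averaging over whole loop iterations (capture,
inference, and drawing) shapes the reported FPS, here is a self-contained
sketch of the averaging scheme `detect.py` uses; the sleep call is just a
stand-in for the per-frame work.

```
# Sketch of the FPS averaging in detect.py: the rate covers the whole loop
# body, not just model inference. time.sleep stands in for per-frame work.
import time

fps_avg_frame_count = 10
fps = 0.0
start_time = time.time()

for counter in range(1, 51):  # stand-in for the camera loop
  time.sleep(0.02)            # stand-in for capture + inference + drawing
  if counter % fps_avg_frame_count == 0:
    fps = fps_avg_frame_count / (time.time() - start_time)
    start_time = time.time()
    print('FPS = {:.1f}'.format(fps))
```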

For more information about executing inferences with TensorFlow Lite, read
[TensorFlow Lite inference](https://www.tensorflow.org/lite/guide/inference).


## Speed up model inference (optional)

If you want to significantly speed up the inference time, you can attach a
[Coral USB Accelerator](
https://coral.withgoogle.com/products/accelerator)—a USB accessory that adds
the [Edge TPU ML accelerator](https://coral.withgoogle.com/docs/edgetpu/faq/)
to any Linux-based system.
[Coral USB Accelerator](https://coral.withgoogle.com/products/accelerator)—a USB
accessory that adds the
[Edge TPU ML accelerator](https://coral.withgoogle.com/docs/edgetpu/faq/) to any
Linux-based system.

If you have a Coral USB Accelerator, you can run the sample with it enabled:

1. First, be sure you have completed the [USB Accelerator setup instructions](
https://coral.withgoogle.com/docs/accelerator/get-started/).
1. First, be sure you have completed the
[USB Accelerator setup instructions](https://coral.withgoogle.com/docs/accelerator/get-started/).

2. Run the object detection script using the EdgeTPU TFLite model and enable
the EdgeTPU option.
2. Run the object detection script using the EdgeTPU TFLite model and enable
the EdgeTPU option. Note that the EdgeTPU requires a specific TFLite model
that is different from the one used above.

```
python3 detect.py \
@@ -100,5 +90,5 @@ python3 detect.py \
You should see significantly faster inference speeds.

For more information about creating and running TensorFlow Lite models with
Coral devices, read [TensorFlow models on the Edge TPU](
https://coral.withgoogle.com/docs/edgetpu/models-intro/).
Coral devices, read
[TensorFlow models on the Edge TPU](https://coral.withgoogle.com/docs/edgetpu/models-intro/).
164 changes: 101 additions & 63 deletions object_detection/detect.py
@@ -11,20 +11,41 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Main script to run pose classification and pose estimation."""
"""Main script to run the object detection routine."""
import argparse
import sys
import time

from typing import List, NamedTuple
import cv2

from tflite_support.task import vision
from tflite_support.task import core
from tflite_support.task import processor

import utils

from object_detector import ObjectDetector
from object_detector import ObjectDetectorOptions
class Rect(NamedTuple):

Owner commented:

You shouldn't need to redefine these types: Rect, Category, Detection. There are equivalent data types in the TFLite Task Library. Instead, you should take the DetectionResult from the Task Library and update the utils.py file to receive a DetectionResult instance from the Task Library and show it, instead of List[Detection] like it's now.

"""A rectangle in 2D space."""
left: float
top: float
right: float
bottom: float


class Category(NamedTuple):
"""A result of a classification task."""
label: str
score: float


class Detection(NamedTuple):
"""A detected object as the result of an ObjectDetector."""
bounding_box: Rect
categories: List[Category]


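As a rough sketch of the reviewer's suggestion above, `utils.visualize` could
take the Task Library's `DetectionResult` directly instead of a
`List[Detection]`; the function below is hypothetical and only illustrates the
shape of that change, reusing the field names already used in this file.

```
# Hypothetical sketch of the reviewer's suggestion: draw boxes straight from
# the Task Library DetectionResult instead of converting to local Detection
# objects. Field names follow the ones used elsewhere in this file.
import cv2
from tflite_support.task import processor


def visualize(image, detection_result: processor.DetectionResult):
  """Draws bounding boxes and labels from a Task Library DetectionResult."""
  for detection in detection_result.detections:
    box = detection.bounding_box
    start_point = (box.origin_x, box.origin_y)
    end_point = (box.origin_x + box.width, box.origin_y + box.height)
    cv2.rectangle(image, start_point, end_point, (0, 0, 255), 3)

    category = detection.classes[0]
    label = '{} ({:.2f})'.format(category.class_name, category.score)
    cv2.putText(image, label, (box.origin_x, box.origin_y - 10),
                cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 255), 1)
  return image
```
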
def run(model: str, camera_id: int, width: int, height: int,
max_results: int, score_threshold: float,
num_threads: int, enable_edgetpu: bool) -> None:
"""Continuously run inference on images acquired from the camera.

@@ -33,19 +54,20 @@ def run(model: str, camera_id: int, width: int, height: int,
camera_id: The camera id to be passed to OpenCV.
width: The width of the frame captured from the camera.
height: The height of the frame captured from the camera.
max_results: Maximum number of classification results to display.
score_threshold: The score threshold of classification results.
num_threads: The number of CPU threads to run the model.
enable_edgetpu: True/False whether the model is an EdgeTPU model.
"""

# Variables to calculate FPS
counter, fps = 0, 0
start_time_fps = time.time()
start_time = time.time()

# Start capturing video input from the camera
cap = cv2.VideoCapture(camera_id)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
cap.set(cv2.CAP_PROP_BUFFERSIZE, 2)

# Visualization parameters
row_size = 20 # pixels
@@ -56,106 +78,122 @@
fps_avg_frame_count = 10

# Initialize the object detection model
options = ObjectDetectorOptions(num_threads=num_threads,
score_threshold=0.3,
max_results=3,
enable_edgetpu=enable_edgetpu)
detector = ObjectDetector(model_path=model, options=options)
if enable_edgetpu:
base_options = core.BaseOptions(file_name=model, use_coral=True)
else:
base_options = core.BaseOptions(file_name=model)

detection_options = processor.DetectionOptions(max_results=max_results,
score_threshold=score_threshold)
options = vision.ObjectDetectorOptions(base_options=base_options, detection_options=detection_options)

detector = vision.ObjectDetector.create_from_options(options)

# Continuously capture images from the camera and run inference
while cap.isOpened():
start_time_frame = time.time()

start_time = time.time()
success, image = cap.read()
if not success:
sys.exit(
'ERROR: Unable to read from webcam. Please verify your webcam settings.'
'ERROR: Unable to read from webcam. Please verify your webcam settings.'
)

counter += 1
image = cv2.flip(image, 1)
elapsed_time = int((time.time() - start_time) * 1000)
print('OpenCV read image time: {0}ms'.format(elapsed_time))

# Run object detection estimation using the model.
start_time = time.time()
detections = detector.detect(image)
elapsed_time = int((time.time() - start_time) * 1000)
print('Inference time: {0}ms'.format(elapsed_time))
rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
tensor_image = vision.TensorImage.create_from_array(rgb_image)
detection_results = detector.detect(tensor_image)

detections = []
# Parse the model output into a list of Detection entities.
for detection in detection_results.detections:
bounding_box = Rect(
top=detection.bounding_box.origin_y,
left=detection.bounding_box.origin_x,
bottom=detection.bounding_box.origin_y + detection.bounding_box.height,
right=detection.bounding_box.origin_x + detection.bounding_box.width)
category = Category(
score=detection.classes[0].score,
label=detection.classes[0].class_name,
)
result = Detection(bounding_box=bounding_box, categories=[category])
detections.append(result)

start_time = time.time()
# Draw keypoints and edges on input image
image = utils.visualize(image, detections)

# Calculate the FPS
if counter % fps_avg_frame_count == 0:
end_time_fps = time.time()
fps = fps_avg_frame_count / (end_time_fps - start_time_fps)
start_time_fps = time.time()
end_time = time.time()
fps = fps_avg_frame_count / (end_time - start_time)
start_time = time.time()

# Show the FPS
fps_text = 'FPS = {:.1f}'.format(fps)
text_location = (left_margin, row_size)
cv2.putText(image, fps_text, text_location, cv2.FONT_HERSHEY_PLAIN,
font_size, text_color, font_thickness)

cv2.imshow('object_detector', image)
elapsed_time = int((time.time() - start_time) * 1000)
print('Visualization time: {0}ms'.format(elapsed_time))

start_time = time.time()
# Stop the program if the ESC key is pressed.
if cv2.waitKey(1) == 27:
break
elapsed_time = int((time.time() - start_time) * 1000)
print('Wait key time: {0}ms'.format(elapsed_time))

elapsed_time = int((time.time() - start_time_frame) * 1000)
print('Time per frame (end-to-end): {0}ms'.format(elapsed_time))
print()
cv2.imshow('object_detector', image)

cap.release()
cv2.destroyAllWindows()


def main():
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument(
'--model',
help='Path of the object detection model.',
required=False,
default='efficientdet_lite0.tflite')
parser.add_argument(
'--cameraId', help='Id of camera.', required=False, type=int, default=0)
parser.add_argument(
'--model',
help='Path of the object detection model.',
required=False,
default='efficientdet_lite0.tflite')
'--frameWidth',
help='Width of frame to capture from camera.',
required=False,
type=int,
default=640)
parser.add_argument(
'--cameraId', help='Id of camera.', required=False, type=int, default=0)
'--frameHeight',
help='Height of frame to capture from camera.',
required=False,
type=int,
default=480)
parser.add_argument(
'--frameWidth',
help='Width of frame to capture from camera.',
required=False,
type=int,
default=640)
'--maxResults',
help='Maximum number of results to show.',
required=False,
type=int,
default=5)
parser.add_argument(
'--frameHeight',
help='Height of frame to capture from camera.',
required=False,
type=int,
default=480)
'--scoreThreshold',
help='The score threshold of classification results.',
required=False,
type=float,
default=0.0)
parser.add_argument(
'--numThreads',
help='Number of CPU threads to run the model.',
required=False,
type=int,
default=4)
'--numThreads',
help='Number of CPU threads to run the model.',
required=False,
type=int,
default=4)
parser.add_argument(
'--enableEdgeTPU',
help='Whether to run the model on EdgeTPU.',
action="store_true",
required=False,
default=False)
'--enableEdgeTPU',
help='Whether to run the model on EdgeTPU.',
action='store_true',
required=False,
default=False)
args = parser.parse_args()

run(args.model, int(args.cameraId), args.frameWidth, args.frameHeight,
args.maxResults, args.scoreThreshold,
int(args.numThreads), bool(args.enableEdgeTPU))

