Tf v2 migration #67

Open · wants to merge 2 commits into base: master

Changes from all commits
6 changes: 3 additions & 3 deletions README.md
@@ -1,6 +1,6 @@
## Real-time Hand-Detection using Neural Networks (SSD) on Tensorflow.
## Real-time Hand-Detection using Neural Networks (SSD) on Tensorflow 2.

This repo documents steps and scripts used to train a hand detector using Tensorflow (Object Detection API). As with any DNN-based task, the most expensive (and riskiest) part of the process has to do with finding or creating the right (annotated) dataset. I was interested mainly in detecting hands on a table (egocentric viewpoint). I experimented first with the [Oxford Hands Dataset](http://www.robots.ox.ac.uk/~vgg/data/hands/) (the results were not good). I then tried the [Egohands Dataset](http://vision.soic.indiana.edu/projects/egohands/), which was a much better fit for my requirements.
This repo documents steps and scripts used to train a hand detector using Tensorflow 2 (Object Detection API). As with any DNN-based task, the most expensive (and riskiest) part of the process has to do with finding or creating the right (annotated) dataset. I was interested mainly in detecting hands on a table (egocentric viewpoint). I experimented first with the [Oxford Hands Dataset](http://www.robots.ox.ac.uk/~vgg/data/hands/) (the results were not good). I then tried the [Egohands Dataset](http://vision.soic.indiana.edu/projects/egohands/), which was a much better fit for my requirements.

The goal of this repo/post is to demonstrate how neural networks can be applied to the (hard) problem of tracking hands (egocentric and other views). Better still, it provides code that can be adapted to other use cases.

@@ -23,7 +23,7 @@ Both examples above were run on a macbook pro **CPU** (i7, 2.5GHz, 16GB). Some f
| 16 | 320 * 240 | Macbook pro (i7, 2.5GHz, 16GB) | Run while visualizing results (image above) |
| 11 | 640 * 480 | Macbook pro (i7, 2.5GHz, 16GB) | Run while visualizing results (image above) |

> Note: The code in this repo is written and tested with Tensorflow `1.4.0-rc0`. Using a different version may result in [some errors](https://github.com/tensorflow/models/issues/1581).
> Note: The code in this repo is written and tested with Tensorflow `2.3.1`. Using a different version may result in [some errors](https://github.com/tensorflow/models/issues/1581).
You may need to [generate your own frozen model](https://pythonprogramming.net/testing-custom-object-detector-tensorflow-object-detection-api-tutorial/?completed=/training-custom-objects-tensorflow-object-detection-api-tutorial/) graph using the [model checkpoints](model-checkpoint) in the repo to fit your TF version.

The tensorflow object detection repo has a [python file for exporting a checkpoint to frozen graph here](https://github.com/tensorflow/models/blob/master/research/object_detection/export_inference_graph.py). You can copy it to the current directory and use it as follows
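The repo's actual command sits in the collapsed part of this hunk. As a sketch only, the standard TF1 Object Detection API invocation of that script looks like the following; every path here is a placeholder to substitute:

```sh
# Sketch: export a trained checkpoint to a frozen inference graph (TF1 OD API).
# All paths are placeholders; use your own pipeline config and checkpoint.
python export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path path/to/ssd_pipeline.config \
    --trained_checkpoint_prefix path/to/model.ckpt-XXXX \
    --output_directory path/to/exported_graph
```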
2 changes: 1 addition & 1 deletion detect_multi_threaded.py
@@ -18,7 +18,7 @@
def worker(input_q, output_q, cap_params, frame_processed):
    print(">> loading frozen model for worker")
    detection_graph, sess = detector_utils.load_inference_graph()
    sess = tf.Session(graph=detection_graph)
    sess = tf.compat.v1.Session(graph=detection_graph)
    while True:
        # print("> ===== in worker loop, frame ", frame_processed)
        frame = input_q.get()
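For context on the one-line change above: TF2 removed `tf.Session` from the top-level namespace, and the `tf.compat.v1` shim keeps v1 graph-and-session code working unchanged. A self-contained sketch of the pattern, with an illustrative toy graph rather than code from this repo:

```python
import tensorflow as tf

# Build a v1-style graph; graph mode applies inside this block even under TF2.
graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, shape=(None, 3), name="x")
    y = tf.reduce_sum(x, axis=1, name="row_sum")

# Run it with the compat.v1 Session, mirroring the worker above.
with tf.compat.v1.Session(graph=graph) as sess:
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))  # -> [6.]
```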
171 changes: 171 additions & 0 deletions detect_multi_threaded.py.old
@@ -0,0 +1,171 @@
from utils import detector_utils as detector_utils
import cv2
import tensorflow as tf
import multiprocessing
from multiprocessing import Queue, Pool
import time
from utils.detector_utils import WebcamVideoStream
import datetime
import argparse

frame_processed = 0
score_thresh = 0.2

# Create a worker process that loads the graph and runs detection on frames
# from an input queue, putting annotated frames on an output queue


def worker(input_q, output_q, cap_params, frame_processed):
    print(">> loading frozen model for worker")
    detection_graph, sess = detector_utils.load_inference_graph()
    sess = tf.Session(graph=detection_graph)
    while True:
        # print("> ===== in worker loop, frame ", frame_processed)
        frame = input_q.get()
        if (frame is not None):
            # Actual detection. Variable boxes contains the bounding box coordinates for hands detected,
            # while scores contains the confidence for each of these boxes.
            # Hint: If len(boxes) > 1, you may assume you have found at least one hand (within your score threshold)

            boxes, scores = detector_utils.detect_objects(
                frame, detection_graph, sess)
            # draw bounding boxes
            detector_utils.draw_box_on_image(
                cap_params['num_hands_detect'], cap_params["score_thresh"],
                scores, boxes, cap_params['im_width'], cap_params['im_height'],
                frame)
            # add frame annotated with bounding box to queue
            output_q.put(frame)
            frame_processed += 1
        else:
            output_q.put(frame)
    sess.close()


if __name__ == '__main__':

    parser = argparse.ArgumentParser()
    parser.add_argument(
        '-src',
        '--source',
        dest='video_source',
        type=int,
        default=0,
        help='Device index of the camera.')
    parser.add_argument(
        '-nhands',
        '--num_hands',
        dest='num_hands',
        type=int,
        default=2,
        help='Max number of hands to detect.')
    parser.add_argument(
        '-fps',
        '--fps',
        dest='fps',
        type=int,
        default=1,
        help='Show FPS on detection/display visualization')
    parser.add_argument(
        '-wd',
        '--width',
        dest='width',
        type=int,
        default=300,
        help='Width of the frames in the video stream.')
    parser.add_argument(
        '-ht',
        '--height',
        dest='height',
        type=int,
        default=200,
        help='Height of the frames in the video stream.')
    parser.add_argument(
        '-ds',
        '--display',
        dest='display',
        type=int,
        default=1,
        help='Display the detected images using OpenCV. This reduces FPS')
    parser.add_argument(
        '-num-w',
        '--num-workers',
        dest='num_workers',
        type=int,
        default=4,
        help='Number of workers.')
    parser.add_argument(
        '-q-size',
        '--queue-size',
        dest='queue_size',
        type=int,
        default=5,
        help='Size of the queue.')
    args = parser.parse_args()

    input_q = Queue(maxsize=args.queue_size)
    output_q = Queue(maxsize=args.queue_size)

    video_capture = WebcamVideoStream(
        src=args.video_source, width=args.width, height=args.height).start()

    cap_params = {}
    frame_processed = 0
    cap_params['im_width'], cap_params['im_height'] = video_capture.size()
    cap_params['score_thresh'] = score_thresh

    # max number of hands we want to detect/track
    cap_params['num_hands_detect'] = args.num_hands

    print(cap_params, args)

    # spin up workers to parallelize detection.
    pool = Pool(args.num_workers, worker,
                (input_q, output_q, cap_params, frame_processed))

    start_time = datetime.datetime.now()
    num_frames = 0
    fps = 0
    index = 0

    cv2.namedWindow('Multi-Threaded Detection', cv2.WINDOW_NORMAL)

    while True:
        frame = video_capture.read()
        frame = cv2.flip(frame, 1)
        index += 1

        input_q.put(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        output_frame = output_q.get()

        output_frame = cv2.cvtColor(output_frame, cv2.COLOR_RGB2BGR)

        elapsed_time = (datetime.datetime.now() - start_time).total_seconds()
        num_frames += 1
        fps = num_frames / elapsed_time
        # print("frame ", index, num_frames, elapsed_time, fps)

        if (output_frame is not None):
            if (args.display > 0):
                if (args.fps > 0):
                    detector_utils.draw_fps_on_image("FPS : " + str(int(fps)),
                                                     output_frame)
                cv2.imshow('Multi-Threaded Detection', output_frame)
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
            else:
                if (num_frames == 400):
                    num_frames = 0
                    start_time = datetime.datetime.now()
                else:
                    print("frames processed: ", index, "elapsed time: ",
                          elapsed_time, "fps: ", str(int(fps)))
        else:
            # print("video end")
            break
    elapsed_time = (datetime.datetime.now() - start_time).total_seconds()
    fps = num_frames / elapsed_time
    print("fps", fps)
    pool.terminate()
    video_capture.stop()
    cv2.destroyAllWindows()
121 changes: 121 additions & 0 deletions detect_single_threaded.py.old
@@ -0,0 +1,121 @@
from utils import detector_utils as detector_utils
import cv2
import tensorflow as tf
import datetime
import argparse

detection_graph, sess = detector_utils.load_inference_graph()

if __name__ == '__main__':

    parser = argparse.ArgumentParser()
    parser.add_argument(
        '-sth',
        '--scorethreshold',
        dest='score_thresh',
        type=float,
        default=0.2,
        help='Score threshold for displaying bounding boxes')
    parser.add_argument(
        '-fps',
        '--fps',
        dest='fps',
        type=int,
        default=1,
        help='Show FPS on detection/display visualization')
    parser.add_argument(
        '-src',
        '--source',
        dest='video_source',
        default=0,
        help='Device index of the camera.')
    parser.add_argument(
        '-wd',
        '--width',
        dest='width',
        type=int,
        default=320,
        help='Width of the frames in the video stream.')
    parser.add_argument(
        '-ht',
        '--height',
        dest='height',
        type=int,
        default=180,
        help='Height of the frames in the video stream.')
    parser.add_argument(
        '-ds',
        '--display',
        dest='display',
        type=int,
        default=1,
        help='Display the detected images using OpenCV. This reduces FPS')
    parser.add_argument(
        '-num-w',
        '--num-workers',
        dest='num_workers',
        type=int,
        default=4,
        help='Number of workers.')
    parser.add_argument(
        '-q-size',
        '--queue-size',
        dest='queue_size',
        type=int,
        default=5,
        help='Size of the queue.')
    args = parser.parse_args()

    cap = cv2.VideoCapture(args.video_source)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, args.width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, args.height)

    start_time = datetime.datetime.now()
    num_frames = 0
    im_width, im_height = (cap.get(3), cap.get(4))
    # max number of hands we want to detect/track
    num_hands_detect = 2

    cv2.namedWindow('Single-Threaded Detection', cv2.WINDOW_NORMAL)

    while True:
        # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
        ret, image_np = cap.read()
        # image_np = cv2.flip(image_np, 1)
        try:
            image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
        except:
            print("Error converting to RGB")

        # Actual detection. Variable boxes contains the bounding box coordinates for hands detected,
        # while scores contains the confidence for each of these boxes.
        # Hint: If len(boxes) > 1, you may assume you have found at least one hand (within your score threshold)

        boxes, scores = detector_utils.detect_objects(image_np,
                                                      detection_graph, sess)

        # draw bounding boxes on frame
        detector_utils.draw_box_on_image(num_hands_detect, args.score_thresh,
                                         scores, boxes, im_width, im_height,
                                         image_np)

        # Calculate Frames per second (FPS)
        num_frames += 1
        elapsed_time = (datetime.datetime.now() - start_time).total_seconds()
        fps = num_frames / elapsed_time

        if (args.display > 0):
            # Display FPS on frame
            if (args.fps > 0):
                detector_utils.draw_fps_on_image("FPS : " + str(int(fps)),
                                                 image_np)

            cv2.imshow('Single-Threaded Detection',
                       cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))

            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                break
        else:
            print("frames processed: ", num_frames, "elapsed time: ",
                  elapsed_time, "fps: ", str(int(fps)))
17 changes: 17 additions & 0 deletions report.txt
@@ -0,0 +1,17 @@
TensorFlow 2.0 Upgrade Script
-----------------------------
Converted 1 files
Detected 0 issues that require attention
--------------------------------------------------------------------------------
================================================================================
Detailed log follows:

================================================================================
--------------------------------------------------------------------------------
Processing file 'detect_single_threaded_v1.py'
outputting to 'detect_single_threaded.py'
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------

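report.txt is the standard report emitted by TensorFlow's `tf_upgrade_v2` conversion tool. Judging from the file names in the log, a single-file run along these lines would produce it; the exact flags the author used are an assumption:

```sh
# Assumed invocation, reconstructed from the report's file names.
tf_upgrade_v2 \
    --infile detect_single_threaded_v1.py \
    --outfile detect_single_threaded.py \
    --reportfile report.txt
```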
38 changes: 38 additions & 0 deletions reportfile.txt
@@ -0,0 +1,38 @@
TensorFlow 2.0 Upgrade Script
-----------------------------
Converted 3 files
Detected 0 issues that require attention
--------------------------------------------------------------------------------
================================================================================
Detailed log follows:

================================================================================
================================================================================
Input tree: 'utils-v1/'
================================================================================
--------------------------------------------------------------------------------
Processing file 'utils-v1/detector_utils.py'
outputting to 'utils/detector_utils.py'
--------------------------------------------------------------------------------

41:23: INFO: Renamed 'tf.GraphDef' to 'tf.compat.v1.GraphDef'
42:13: INFO: Renamed 'tf.gfile.GFile' to 'tf.io.gfile.GFile'
46:15: INFO: Renamed 'tf.Session' to 'tf.compat.v1.Session'
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Processing file 'utils-v1/label_map_util.py'
outputting to 'utils/label_map_util.py'
--------------------------------------------------------------------------------

116:9: INFO: Renamed 'tf.gfile.GFile' to 'tf.io.gfile.GFile'
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Processing file 'utils-v1/__init__.py'
outputting to 'utils/__init__.py'
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------

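The `Input tree: 'utils-v1/'` line indicates this second report came from the tool's directory mode. A sketch of that invocation, with the names taken from the log and the flags again an assumption:

```sh
# Assumed invocation for the whole-directory conversion.
tf_upgrade_v2 \
    --intree utils-v1/ \
    --outtree utils/ \
    --reportfile reportfile.txt
```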
Empty file added utils-v1/__init__.py
Empty file.