diff --git a/README.md b/README.md
index 81463bfd7..e1748a8da 100644
--- a/README.md
+++ b/README.md
@@ -1,17 +1,86 @@
 # intel-oneAPI
-#### Team Name -
-#### Problem Statement -
-#### Team Leader Email -
+#### Team Name - Team Xion (eXtreme Identification Of Navigation)
+#### Problem Statement - Object Detection in Autonomous Vehicles
+#### Team Leader Email - sarthakjoshisj93@gmail.com
-## A Brief of the Prototype:
-   This section must include UML Daigrms and prototype description
+#### Demo Video: https://www.youtube.com/watch?v=ZCrOj5g974A
+
+## A Brief of the Prototype 🎦:
+ #### Developed a road sign 🛑 detection model using the oneAPI framework in conjunction with the OpenVINO toolkit.
+ #### Leveraged the oneAPI libraries and tools for efficient implementation and optimization.
+ #### Utilized a dataset of German road sign images for training and evaluation.
+ #### Trained an SSD deep learning model on this dataset using the Intel oneAPI oneDAL (Data Analytics Library).
+ #### Fine-tuned the model to improve its accuracy on Indian road signs and support autonomous vehicles.
+ #### Converted the trained model to the OpenVINO Intermediate Representation (IR) format for deployment, then used OpenVINO's Inference Engine to perform real-time road sign detection on different hardware platforms.
+ #### Achieved high accuracy and real-time performance in road sign detection by combining the oneAPI framework and the OpenVINO toolkit, showcasing their synergy in computer vision applications.
-## Tech Stack:
-   List Down all technologies used to Build the prototype **Clearly mentioning Intel® AI Analytics Toolkits, it's libraries and the SYCL/DCP++ Libraries used**
+ ### Diagram
+ ![Diagram](https://github.com/Craniace/intel-oneAPI/assets/100042684/be5c1803-083c-4879-8cde-c2d4ec154092)
+
+
+## Tech Stack ⚙:
+ * Intel oneAPI SYCL/DPC++ libraries
+ * Intel Distribution for Python
+ * OpenVINO for Python
+ * Visual Studio Code
+ * Python 3.11
+ * Intel DevCloud Platform
-## Step-by-Step Code Execution Instructions:
-   This Section must contain set of instructions required to clone and run the prototype, so that it can be tested and deeply analysed
+## Step-by-Step Code Execution Instructions 👨🏻‍💻:
+### Step 1: Install Required Libraries
+
+- Make sure you have Python installed on your system.
+- Install the necessary libraries, such as OpenCV, OpenVINO, and NumPy, using pip or conda.
+
+### Step 2: Gather and Preprocess Data
+
+- Collect or create a dataset for training the object detection model.
+- Annotate the dataset by labelling the objects of interest with bounding boxes.
+- Split the dataset into training and testing sets.
+ ![image](https://github.com/Craniace/intel-oneAPI/assets/100042684/24732fa4-96bd-4e73-9739-28273e372c65)
+
+
+### Step 3: Train the Model with **oneAPI**
+
+- Train the model using Intel oneDAL for better results and faster computation, or select a pre-trained object detection model that suits your requirements.
+![Intel-toolkit-oneAPI-rendering-scaled-960x500_c](https://github.com/Craniace/intel-oneAPI/assets/100042684/ed06ad19-dcc6-4546-96be-1b1e72a5e914)
+
+### Step 4: Fine-tune the Model
+
+- Load the pre-trained model weights.
+- Replace the classification head with a new head suited to the target road-sign classes.
+- Freeze the initial layers to retain the pre-trained weights.
+- Train the model on the annotated training dataset.
+- Adjust hyperparameters, such as learning rate and batch size, to optimize performance (a minimal sketch follows below).
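+
+A minimal sketch of the freeze-and-replace-head idea from Step 4, shown on a generic Keras classification backbone purely for illustration. The actual detector in this repo is an SSD trained through the TensorFlow Object Detection API pipeline (`data/ssdlite_mobilenet_v2_coco_2018_05_09/pipeline.config`); the class count and dataset names below are assumptions.
+
+```python
+import tensorflow as tf
+
+NUM_CLASSES = 43  # assumed number of road-sign classes in the German dataset
+
+# Pre-trained backbone with its original classification head removed.
+backbone = tf.keras.applications.MobileNetV2(
+    input_shape=(224, 224, 3), include_top=False, weights="imagenet"
+)
+backbone.trainable = False  # freeze the initial layers to retain pre-trained weights
+
+# New classification head suited to the road-sign classes.
+model = tf.keras.Sequential([
+    backbone,
+    tf.keras.layers.GlobalAveragePooling2D(),
+    tf.keras.layers.Dropout(0.2),
+    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
+])
+
+model.compile(
+    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # tune learning rate as needed
+    loss="sparse_categorical_crossentropy",
+    metrics=["accuracy"],
+)
+# model.fit(train_ds, validation_data=val_ds, epochs=10)  # hypothetical tf.data datasets
+```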
+
+### Step 5: Evaluate the Model
+
+- Evaluate the performance of the trained model on the testing dataset.
+- Measure metrics like precision, recall, and average precision to assess the model's accuracy.
+
+### Step 6: Implement Object Detection
+
+- Use the trained model to perform object detection on new images or videos.
+- Preprocess the input by resizing, normalizing, and converting it to the appropriate format.
+- Pass the preprocessed input through the model to obtain predicted bounding boxes and class labels.
+- Apply non-maximum suppression to remove redundant overlapping bounding boxes.
+- Visualize the detected objects by drawing bounding boxes and labels on the input image or video.
+
-## What I Learned:
-   Write about the biggest learning you had while developing the prototype
+## What I Learned 💡:
+**✅Image processing and computer vision techniques:** Road sign detection involves applying various image processing and computer vision algorithms to identify and locate signs within images or video streams. This includes techniques like image segmentation, feature extraction, and object detection.
+
+**✅Data collection and preprocessing:** Gathering a diverse dataset of road sign images is crucial for training and evaluating the detection model. We learned how to collect and preprocess the data, including labelling the signs and applying data augmentation techniques.
+
+**✅Model selection and training:** Choosing an appropriate model architecture for road sign detection and training it on the collected dataset. This involves understanding different deep learning models and their suitability for the task, selecting loss functions, and optimizing hyperparameters.
+
+**✅Integration of oneAPI tools:** oneAPI provides a unified programming model for diverse hardware architectures. We learned how to leverage oneAPI tools, such as oneDNN (oneAPI Deep Neural Network Library) and oneVPL (oneAPI Video Processing Library), to optimize and accelerate the road sign detection pipeline on specific hardware platforms.
+
+**✅Performance optimization:** Road sign detection often requires real-time or near real-time processing, especially in autonomous driving applications. We explored optimization techniques that improve the inference speed and efficiency of the detection model, such as model quantization, pruning, and parallelization.
+
+**✅Evaluation and accuracy assessment:** Assessing the performance of the road sign detection model through evaluation metrics like precision, recall, and F1 score. These metrics measure how accurately and effectively the model identifies road signs (a minimal sketch of the computation follows below).
+
+**✅Deployment and integration:** Integrating the trained road sign detection model into larger systems or applications, such as autonomous vehicles or traffic management systems. This involves considering deployment requirements, performance constraints, and compatibility with existing software and hardware components.
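+
+A minimal sketch of the precision/recall/F1 computation referenced above; the counts are illustrative placeholders, not measured results from this project:
+
+```python
+# Detection metrics from raw counts at a fixed IoU/confidence threshold.
+# The example counts below are hypothetical.
+def detection_metrics(tp: int, fp: int, fn: int):
+    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of predicted signs, how many were real
+    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of real signs, how many were found
+    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
+    return precision, recall, f1
+
+precision, recall, f1 = detection_metrics(tp=90, fp=10, fn=15)
+print(f"precision={precision:.2f}  recall={recall:.2f}  F1={f1:.2f}")
+```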
+ + diff --git a/Video and PPT/Doc.docx b/Video and PPT/Doc.docx new file mode 100644 index 000000000..d5bdb09bf Binary files /dev/null and b/Video and PPT/Doc.docx differ diff --git a/Video and PPT/Video b/Video and PPT/Video new file mode 100644 index 000000000..05f602dc1 --- /dev/null +++ b/Video and PPT/Video @@ -0,0 +1 @@ +Demo Video link:- ### https://www.youtube.com/watch?v=ZCrOj5g974A #### diff --git a/Video and PPT/Video.mp4 b/Video and PPT/Video.mp4 new file mode 100644 index 000000000..48fa502d4 Binary files /dev/null and b/Video and PPT/Video.mp4 differ diff --git a/Video and PPT/XION.pdf b/Video and PPT/XION.pdf new file mode 100644 index 000000000..d5c4d421b Binary files /dev/null and b/Video and PPT/XION.pdf differ diff --git a/Video.mp4 b/Video.mp4 new file mode 100644 index 000000000..48fa502d4 Binary files /dev/null and b/Video.mp4 differ diff --git a/compressed.zip b/compressed.zip new file mode 100644 index 000000000..61b6a41fc Binary files /dev/null and b/compressed.zip differ diff --git a/data/ssdlite_mobilenet_v2_coco_2018_05_09/checkpoint b/data/ssdlite_mobilenet_v2_coco_2018_05_09/checkpoint new file mode 100644 index 000000000..febd7d546 --- /dev/null +++ b/data/ssdlite_mobilenet_v2_coco_2018_05_09/checkpoint @@ -0,0 +1,2 @@ +model_checkpoint_path: "model.ckpt" +all_model_checkpoint_paths: "model.ckpt" diff --git a/data/ssdlite_mobilenet_v2_coco_2018_05_09/frozen_inference_graph.pb b/data/ssdlite_mobilenet_v2_coco_2018_05_09/frozen_inference_graph.pb new file mode 100644 index 000000000..fa1f8d214 Binary files /dev/null and b/data/ssdlite_mobilenet_v2_coco_2018_05_09/frozen_inference_graph.pb differ diff --git a/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.data-00000-of-00001 b/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.data-00000-of-00001 new file mode 100644 index 000000000..c34d0225f Binary files /dev/null and b/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.data-00000-of-00001 differ diff --git a/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.index b/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.index new file mode 100644 index 000000000..fd4cfb08a Binary files /dev/null and b/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.index differ diff --git a/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.meta b/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.meta new file mode 100644 index 000000000..a6d92ff86 Binary files /dev/null and b/data/ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt.meta differ diff --git a/data/ssdlite_mobilenet_v2_coco_2018_05_09/pipeline.config b/data/ssdlite_mobilenet_v2_coco_2018_05_09/pipeline.config new file mode 100644 index 000000000..15e108f48 --- /dev/null +++ b/data/ssdlite_mobilenet_v2_coco_2018_05_09/pipeline.config @@ -0,0 +1,181 @@ +model { + ssd { + num_classes: 90 + image_resizer { + fixed_shape_resizer { + height: 300 + width: 300 + } + } + feature_extractor { + type: "ssd_mobilenet_v2" + depth_multiplier: 1.0 + min_depth: 16 + conv_hyperparams { + regularizer { + l2_regularizer { + weight: 3.99999989895e-05 + } + } + initializer { + truncated_normal_initializer { + mean: 0.0 + stddev: 0.0299999993294 + } + } + activation: RELU_6 + batch_norm { + decay: 0.999700009823 + center: true + scale: true + epsilon: 0.0010000000475 + train: true + } + } + use_depthwise: true + } + box_coder { + faster_rcnn_box_coder { + y_scale: 10.0 + x_scale: 10.0 + height_scale: 5.0 + width_scale: 5.0 + } + } + matcher { + argmax_matcher { + matched_threshold: 0.5 + unmatched_threshold: 0.5 + 
ignore_thresholds: false + negatives_lower_than_unmatched: true + force_match_for_each_row: true + } + } + similarity_calculator { + iou_similarity { + } + } + box_predictor { + convolutional_box_predictor { + conv_hyperparams { + regularizer { + l2_regularizer { + weight: 3.99999989895e-05 + } + } + initializer { + truncated_normal_initializer { + mean: 0.0 + stddev: 0.0299999993294 + } + } + activation: RELU_6 + batch_norm { + decay: 0.999700009823 + center: true + scale: true + epsilon: 0.0010000000475 + train: true + } + } + min_depth: 0 + max_depth: 0 + num_layers_before_predictor: 0 + use_dropout: false + dropout_keep_probability: 0.800000011921 + kernel_size: 3 + box_code_size: 4 + apply_sigmoid_to_scores: false + use_depthwise: true + } + } + anchor_generator { + ssd_anchor_generator { + num_layers: 6 + min_scale: 0.20000000298 + max_scale: 0.949999988079 + aspect_ratios: 1.0 + aspect_ratios: 2.0 + aspect_ratios: 0.5 + aspect_ratios: 3.0 + aspect_ratios: 0.333299994469 + } + } + post_processing { + batch_non_max_suppression { + score_threshold: 0.300000011921 + iou_threshold: 0.600000023842 + max_detections_per_class: 100 + max_total_detections: 100 + } + score_converter: SIGMOID + } + normalize_loss_by_num_matches: true + loss { + localization_loss { + weighted_smooth_l1 { + } + } + classification_loss { + weighted_sigmoid { + } + } + hard_example_miner { + num_hard_examples: 3000 + iou_threshold: 0.990000009537 + loss_type: CLASSIFICATION + max_negatives_per_positive: 3 + min_negatives_per_image: 3 + } + classification_weight: 1.0 + localization_weight: 1.0 + } + } +} +train_config { + batch_size: 24 + data_augmentation_options { + random_horizontal_flip { + } + } + data_augmentation_options { + ssd_random_crop { + } + } + optimizer { + rms_prop_optimizer { + learning_rate { + exponential_decay_learning_rate { + initial_learning_rate: 0.00400000018999 + decay_steps: 800720 + decay_factor: 0.949999988079 + } + } + momentum_optimizer_value: 0.899999976158 + decay: 0.899999976158 + epsilon: 1.0 + } + } + fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt" + num_steps: 200000 + fine_tune_checkpoint_type: "detection" +} +train_input_reader { + label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt" + tf_record_input_reader { + input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record" + } +} +eval_config { + num_examples: 8000 + max_evals: 10 + use_moving_averages: false +} +eval_input_reader { + label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt" + shuffle: false + num_readers: 1 + tf_record_input_reader { + input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record" + } +} diff --git a/data/ssdlite_mobilenet_v2_coco_2018_05_09/saved_model/saved_model.pb b/data/ssdlite_mobilenet_v2_coco_2018_05_09/saved_model/saved_model.pb new file mode 100644 index 000000000..d51804591 Binary files /dev/null and b/data/ssdlite_mobilenet_v2_coco_2018_05_09/saved_model/saved_model.pb differ diff --git a/intel2.py b/intel2.py new file mode 100644 index 000000000..c42954624 --- /dev/null +++ b/intel2.py @@ -0,0 +1,101 @@ +import cv2 +import numpy as np +from openvino.inference_engine import IECore +import sys + +# Load the road sign detection model +model_xml = 'path/to/road_sign_detection_model.xml' +model_bin = 'path/to/road_sign_detection_model.bin' +ie = IECore() +net = ie.read_network(model=model_xml, weights=model_bin) +exec_net = ie.load_network(network=net, device_name='CPU') + +# Define the classes corresponding to road signs +classes = ['Speed_limit_20_km/h' 
+,'Speed_limit_30_km/h'
+,'Speed_limit_50_km/h'
+,'Speed_limit_60_km/h'
+,'Speed_limit_70_km/h'
+,'Speed_limit_80_km/h'
+,'End_of_speed_limit_80_km/h'
+,'Speed_limit_100_km/h'
+,'Speed_limit_120_km/h'
+,'No_passing'
+,'No_passing_for_vehicles_over_3.5_metric_tons'
+,'Right_of_way_at_the_next_intersection'
+,'Priority_road'
+,'Yield'
+,'Stop'
+,'No_vehicles'
+,'Vehicles_over_3.5_metric_tons_prohibited'
+,'No_entry'
+,'General_caution'
+,'Dangerous_curve_to_the_left'
+,'Dangerous_curve_to_the_right'
+,'Double_curve'
+,'Bumpy_road'
+,'Slippery_road'
+,'Road_narrows_on_the_right'
+,'Construction_zone'
+,'Traffic_signal_ahead'
+,'Pedestrian_crossing'
+,'School_zone'
+,'Bicycles_crossing'
+,'Beware_of_ice/snow'
+,'Wild_animals_crossing'
+,'End_of_all_speed_and_passing_limits'
+,'Turn_right_ahead'
+,'Turn_left_ahead'
+,'Ahead_only'
+,'Go_straight_or_right'
+,'Go_straight_or_left'
+,'Keep_right'
+,'Keep_left'
+,'Roundabout_mandatory'
+,'End_of_no_passing'
+,'End_of_no_passing_by_vehicles_over_3.5_metric_tons']
+
+# Resolve the input and output blob names from the network description.
+input_blob_name = next(iter(net.input_info))
+output_blob_name = next(iter(net.outputs))
+
+# Initialize video capture
+cap = cv2.VideoCapture(0)
+
+while True:
+    # Read frame from camera
+    ret, frame = cap.read()
+    if not ret:
+        break
+
+    # Preprocess the frame: blobFromImage resizes to 300x300 and already returns NCHW layout.
+    input_blob = cv2.dnn.blobFromImage(frame, size=(300, 300), ddepth=cv2.CV_8U)
+
+    # Perform inference
+    outputs = exec_net.infer(inputs={input_blob_name: input_blob})
+    detections = outputs[output_blob_name][0][0]
+
+    # Process the detections
+    for detection in detections:
+        confidence = detection[2]
+        if confidence > 0.5:
+            class_id = int(detection[1])
+            class_name = classes[class_id]
+
+            # Get the coordinates of the detected road sign
+            x1 = int(detection[3] * frame.shape[1])
+            y1 = int(detection[4] * frame.shape[0])
+            x2 = int(detection[5] * frame.shape[1])
+            y2 = int(detection[6] * frame.shape[0])
+
+            # Draw bounding box and label
+            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
+            cv2.putText(frame, class_name, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
+
+    # Display the frame
+    cv2.imshow('Road Sign Detection', frame)
+
+    # Exit if 'q' is pressed
+    if cv2.waitKey(1) == ord('q'):
+        break
+
+# Release resources
+cap.release()
+cv2.destroyAllWindows()
diff --git a/model.py b/model.py
new file mode 100644
index 000000000..7aa37a4f8
--- /dev/null
+++ b/model.py
@@ -0,0 +1,283 @@
+import collections
+import subprocess
+import sys
+import tarfile
+import time
+from pathlib import Path
+
+import cv2
+import numpy as np
+from IPython import display
+from openvino import runtime as ov
+from openvino.tools.mo.front import tf as ov_tf_front
+
+sys.path.append("../utils")
+import notebook_utils as utils
+
+# A directory where the model will be downloaded.
+base_model_dir = Path("model")
+
+# The name of the model from Open Model Zoo
+model_name = "ssdlite_mobilenet_v2"
+
+archive_name = Path(f"{model_name}_coco_2018_05_09.tar.gz")
+model_url = f"https://storage.openvinotoolkit.org/repositories/open_model_zoo/public/2022.1/{model_name}/{archive_name}"
+
+# Download the archive
+downloaded_model_path = base_model_dir / archive_name
+if not downloaded_model_path.exists():
+    utils.download_file(model_url, downloaded_model_path.name, downloaded_model_path.parent)
+
+# Unpack the model
+tf_model_path = base_model_dir / archive_name.with_suffix("").stem / "frozen_inference_graph.pb"
+if not tf_model_path.exists():
+    with tarfile.open(downloaded_model_path) as file:
+        file.extractall(base_model_dir)
+
+precision = "FP16"
+# The output path for the conversion.
+converted_model_path = Path("model") / f"{model_name}_{precision.lower()}.xml"
+
+# Convert it to IR if not previously converted
+trans_config_path = Path(ov_tf_front.__file__).parent / "ssd_v2_support.json"
+if not converted_model_path.exists():
+    convert_command = f"mo " \
+                      f"--input_model {tf_model_path} " \
+                      f"--output_dir {base_model_dir} " \
+                      f"--model_name {model_name}_{precision.lower()} " \
+                      f"--compress_to_fp16 {True if precision == 'FP16' else False} " \
+                      f"--transformations_config={trans_config_path} " \
+                      f"--tensorflow_object_detection_api_pipeline_config {tf_model_path.parent}/pipeline.config " \
+                      f"--reverse_input_channels"
+    # Run the Model Optimizer to produce the IR (.xml/.bin) files.
+    subprocess.run(convert_command, shell=True, check=True)
+
+# Initialize OpenVINO Runtime.
+ie_core = ov.Core()
+# Read the network and corresponding weights from a file.
+model = ie_core.read_model(model=converted_model_path)
+# Compile the model for CPU (you can choose manually CPU, GPU, MYRIAD etc.)
+# or let the engine choose the best available device (AUTO).
+compiled_model = ie_core.compile_model(model=model, device_name="CPU")
+
+# Get the input and output nodes.
+input_layer = compiled_model.input(0)
+output_layer = compiled_model.output(0)
+
+# Get the input size.
+height, width = list(input_layer.shape)[1:3]
+
+# Class labels for the road-sign detection model (German traffic sign classes).
+classes = [
+    'Speed_limit_20_km/h'
+,'Speed_limit_30_km/h'
+,'Speed_limit_50_km/h'
+,'Speed_limit_60_km/h'
+,'Speed_limit_70_km/h'
+,'Speed_limit_80_km/h'
+,'End_of_speed_limit_80_km/h'
+,'Speed_limit_100_km/h'
+,'Speed_limit_120_km/h'
+,'No_passing'
+,'No_passing_for_vehicles_over_3.5_metric_tons'
+,'Right_of_way_at_the_next_intersection'
+,'Priority_road'
+,'Yield'
+,'Stop'
+,'No_vehicles'
+,'Vehicles_over_3.5_metric_tons_prohibited'
+,'No_entry'
+,'General_caution'
+,'Dangerous_curve_to_the_left'
+,'Dangerous_curve_to_the_right'
+,'Double_curve'
+,'Bumpy_road'
+,'Slippery_road'
+,'Road_narrows_on_the_right'
+,'Construction_zone'
+,'Traffic_signal_ahead'
+,'Pedestrian_crossing'
+,'School_zone'
+,'Bicycles_crossing'
+,'Beware_of_ice/snow'
+,'Wild_animals_crossing'
+,'End_of_all_speed_and_passing_limits'
+,'Turn_right_ahead'
+,'Turn_left_ahead'
+,'Ahead_only'
+,'Go_straight_or_right'
+,'Go_straight_or_left'
+,'Keep_right'
+,'Keep_left'
+,'Roundabout_mandatory'
+,'End_of_no_passing'
+,'End_of_no_passing_by_vehicles_over_3.5_metric_tons'
+]
+
+# Colors for the classes above (Rainbow Color Map).
+colors = cv2.applyColorMap(
+    src=np.arange(0, 255, 255 / len(classes), dtype=np.float32).astype(np.uint8),
+    colormap=cv2.COLORMAP_RAINBOW,
+).squeeze()
+
+
+def process_results(frame, results, thresh=0.6):
+    # The size of the original frame.
+ h, w = frame.shape[:2] + # The 'results' variable is a [1, 1, 100, 7] tensor. + results = results.squeeze() + boxes = [] + labels = [] + scores = [] + for _, label, score, xmin, ymin, xmax, ymax in results: + # Create a box with pixels coordinates from the box with normalized coordinates [0,1]. + boxes.append( + tuple(map(int, (xmin * w, ymin * h, (xmax - xmin) * w, (ymax - ymin) * h))) + ) + labels.append(int(label)) + scores.append(float(score)) + + # Apply non-maximum suppression to get rid of many overlapping entities. + # See https://paperswithcode.com/method/non-maximum-suppression + # This algorithm returns indices of objects to keep. + indices = cv2.dnn.NMSBoxes( + bboxes=boxes, scores=scores, score_threshold=thresh, nms_threshold=0.6 + ) + + # If there are no boxes. + if len(indices) == 0: + return [] + + # Filter detected objects. + return [(labels[idx], scores[idx], boxes[idx]) for idx in indices.flatten()] + + +def draw_boxes(frame, boxes): + for label, score, box in boxes: + # Choose color for the label. + color = tuple(map(int, colors[label])) + # Draw a box. + x2 = box[0] + box[2] + y2 = box[1] + box[3] + cv2.rectangle(img=frame, pt1=box[:2], pt2=(x2, y2), color=color, thickness=3) + + # Draw a label name inside the box. + cv2.putText( + img=frame, + text=f"{classes[label]} {score:.2f}", + org=(box[0] + 10, box[1] + 30), + fontFace=cv2.FONT_HERSHEY_COMPLEX, + fontScale=frame.shape[1] / 1000, + color=color, + thickness=1, + lineType=cv2.LINE_AA, + ) + + return frame +# Main processing function to run object detection. +def run_object_detection(source=0, flip=False, use_popup=False, skip_first_frames=0): + player = None + try: + # Create a video player to play with target fps. + player = utils.VideoPlayer( + source=source, flip=flip, fps=30, skip_first_frames=skip_first_frames + ) + # Start capturing. + player.start() + if use_popup: + title = "Press ESC to Exit" + cv2.namedWindow( + winname=title, flags=cv2.WINDOW_GUI_NORMAL | cv2.WINDOW_AUTOSIZE + ) + + processing_times = collections.deque() + while True: + # Grab the frame. + frame = player.next() + if frame is None: + print("Source ended") + break + # If the frame is larger than full HD, reduce size to improve the performance. + scale = 1280 / max(frame.shape) + if scale < 1: + frame = cv2.resize( + src=frame, + dsize=None, + fx=scale, + fy=scale, + interpolation=cv2.INTER_AREA, + ) + + # Resize the image and change dims to fit neural network input. + input_img = cv2.resize( + src=frame, dsize=(width, height), interpolation=cv2.INTER_AREA + ) + # Create a batch of images (size = 1). + input_img = input_img[np.newaxis, ...] + + # Measure processing time. + + start_time = time.time() + # Get the results. + results = compiled_model([input_img])[output_layer] + stop_time = time.time() + # Get poses from network results. + boxes = process_results(frame=frame, results=results) + + # Draw boxes on a frame. + frame = draw_boxes(frame=frame, boxes=boxes) + + processing_times.append(stop_time - start_time) + # Use processing times from last 200 frames. + if len(processing_times) > 200: + processing_times.popleft() + + _, f_width = frame.shape[:2] + # Mean processing time [ms]. 
+ processing_time = np.mean(processing_times) * 1000 + fps = 1000 / processing_time + cv2.putText( + img=frame, + text=f"Inference time: {processing_time:.1f}ms ({fps:.1f} FPS)", + org=(20, 40), + fontFace=cv2.FONT_HERSHEY_COMPLEX, + fontScale=f_width / 1000, + color=(0, 0, 255), + thickness=1, + lineType=cv2.LINE_AA, + ) + + # Use this workaround if there is flickering. + if use_popup: + cv2.imshow(winname=title, mat=frame) + key = cv2.waitKey(1) + # escape = 27 + if key == 27: + break + else: + # Encode numpy array to jpg. + _, encoded_img = cv2.imencode( + ext=".jpg", img=frame, params=[cv2.IMWRITE_JPEG_QUALITY, 100] + ) + # Create an IPython image. + i = display.Image(data=encoded_img) + # Display the image in this notebook. + display.clear_output(wait=True) + display.display(i) + # ctrl-c + except KeyboardInterrupt: + print("Interrupted") + # any different error + except RuntimeError as e: + print(e) + finally: + if player is not None: + # Stop capturing. + player.stop() + if use_popup: + cv2.destroyAllWindows() +run_object_detection(source=0, flip=True, use_popup=False) + +# FOR WITHOUT WEBCAM +video_file = "video.path" + +run_object_detection(source=video_file, flip=False, use_popup=False)
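+
+# Note (illustrative): the converted IR is not tied to the CPU plugin. As the comment above
+# compile_model mentions, OpenVINO's runtime can compile the same model for another available
+# device, or pick one automatically, for example:
+#   compiled_model = ie_core.compile_model(model=model, device_name="GPU")   # Intel integrated GPU
+#   compiled_model = ie_core.compile_model(model=model, device_name="AUTO")  # best available device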