Setup
This guide explains how to set up and configure the tiling-based neural network pipeline using DepthAI. The pipeline processes video frames, splits them into tiles, runs inference on each tile using a neural network, and merges the results.
- DepthAI Python library (depthai)
- OAK camera
- Python 3.6+
- Clone the repository and install the required dependencies:
  git clone git@github.com:luxonis/nn-tiling.git
  cd nn-tiling
  pip install -r requirements.txt
- Downloading files: a script, util/download.py, is provided to help you download the pre-trained YOLOv5 model. Skip this step if you have your own blob.
  python util/download.py
Once the models are downloaded, you can configure and run the pipeline by adjusting the following parameters in your main.py script. Here are some key parameters to consider:
- neural network model (path to the YOLOv5 model blob you just downloaded from Drive), e.g.:
  nn_path = 'models/yolov5s_default_openvino_2021.4_6shave.blob'
- input source
  - OAK camera
  - A prerecorded video using the ReplayVideo node, e.g.:
    replay.setReplayVideoFile(Path('videos/4k_traffic.mp4'))
- confidence threshold (filters out neural network output whose confidence is below this value), e.g.:
  conf_thresh = 0.4
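As a rough illustration of what the confidence threshold does, the sketch below drops detections whose score falls under conf_thresh. The detection dictionaries and the filter_by_confidence helper are hypothetical and are not taken from the repository.

```python
conf_thresh = 0.4

def filter_by_confidence(detections, threshold):
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d["confidence"] >= threshold]

# Hypothetical detections; the real pipeline's output format may differ.
detections = [
    {"label": "car", "confidence": 0.91},
    {"label": "car", "confidence": 0.35},
    {"label": "person", "confidence": 0.40},
]

kept = filter_by_confidence(detections, conf_thresh)
```

Raising the threshold trades recall for precision: fewer false positives survive, but weak true detections are dropped as well.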
- IoU threshold (Intersection over Union threshold), e.g.:
  iou_thresh = 0.4
  For more about this threshold, see this Medium post.
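For readers unfamiliar with the metric, here is a minimal sketch of how IoU is computed for two axis-aligned boxes. The (x1, y1, x2, y2) box layout and the iou helper are assumptions made for illustration. Detection pipelines commonly compare this value against a threshold to decide whether two boxes describe the same object; whether this repository applies iou_thresh in exactly that way is also an assumption.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

iou_thresh = 0.4
# Two 2x2 boxes offset by one unit share a 1x1 region: IoU = 1 / 7.
same_object = iou((0, 0, 2, 2), (1, 1, 3, 3)) > iou_thresh
```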
- tiling settings
  - grid_size - number of tiles horizontally and vertically, e.g.:
    grid_size = (3, 2)
  - overlap - amount of overlap between tiles (default is 0.2), e.g.:
    overlap = 0.2
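To make the interaction between grid_size and overlap concrete, the sketch below computes tile rectangles for a frame. It assumes that all tiles have equal size, that overlap is the fraction of a tile's width or height shared with its neighbour, and that the tiles together cover the whole frame. The tile_rects helper is hypothetical, and the repository may compute tile positions differently.

```python
def tile_rects(img_shape, grid_size, overlap):
    """Return (x1, y1, x2, y2) pixel rectangles, row by row."""
    img_w, img_h = img_shape
    n_x, n_y = grid_size

    # Choose the tile size so that n tiles with the given overlap span the frame.
    tile_w = img_w / (n_x - (n_x - 1) * overlap)
    tile_h = img_h / (n_y - (n_y - 1) * overlap)

    # Distance between the origins of two neighbouring tiles.
    stride_x = tile_w * (1 - overlap)
    stride_y = tile_h * (1 - overlap)

    rects = []
    for row in range(n_y):
        for col in range(n_x):
            x1 = col * stride_x
            y1 = row * stride_y
            rects.append((x1, y1, x1 + tile_w, y1 + tile_h))
    return rects

rects = tile_rects((3840, 2160), grid_size=(3, 2), overlap=0.2)
```

Under these assumptions, a larger overlap makes each tile bigger, which lowers the chance that an object on a tile border is cut in half but increases the number of duplicate detections that must be merged afterwards.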
- setting up the grid_matrix: a 2D list that defines how the image tiles are split and optionally merged. It lets you control how adjacent tiles are grouped together for inference by assigning the same integer to neighboring tiles. The grid_matrix must match the dimensions defined by the grid_size. For example, if the grid size is (3, 2) (3 tiles horizontally, 2 tiles vertically), the grid_matrix will have 3 columns and 2 rows.
  grid_matrix_no_merging = [ [0, 1, 2], [3, 4, 5] ]
  # this matrix defines the same tile setup
  grid_matrix_no_merging = [ [0, 1, 0], [1, 0, 1] ]
  # 1 merge
  grid_matrix_with_merging_1 = [ [0, 1, 0], [2, 2, 1] ]
  # 2 merges
  grid_matrix_with_merging_2 = [ [3, 3, 0], [2, 2, 1] ]
  Note: the integer values in the grid matrix carry no meaning of their own. They only indicate whether a neighbouring tile shares the same integer and is therefore merged with it.
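The grouping rule can be sketched as a small connected-components pass over the matrix. The sketch assumes that "neighbouring" means horizontally or vertically adjacent (not diagonal); the merged_groups helper is hypothetical and is not the repository's implementation.

```python
def merged_groups(grid_matrix, grid_size):
    """Group (row, col) cells that are adjacent and share the same integer."""
    n_x, n_y = grid_size
    if len(grid_matrix) != n_y or any(len(row) != n_x for row in grid_matrix):
        raise ValueError("grid_matrix must have grid_size[1] rows and grid_size[0] columns")

    seen = set()
    groups = []
    for r in range(n_y):
        for c in range(n_x):
            if (r, c) in seen:
                continue
            label = grid_matrix[r][c]
            seen.add((r, c))
            stack, group = [(r, c)], []
            # Flood fill over 4-connected neighbours that carry the same label.
            while stack:
                cr, cc = stack.pop()
                group.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < n_y and 0 <= nc < n_x
                            and (nr, nc) not in seen
                            and grid_matrix[nr][nc] == label):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            groups.append(sorted(group))
    return groups

# One merge: the two cells labelled 2 in the bottom row form a single group.
groups = merged_groups([[0, 1, 0], [2, 2, 1]], grid_size=(3, 2))
```

Because only adjacency matters, [[0, 1, 2], [3, 4, 5]] and [[0, 1, 0], [1, 0, 1]] both yield six single-tile groups, which matches the two "no merging" examples above.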
- IMG_SHAPE: set the dimensions of your video input, e.g.:
  IMG_SHAPE = (3840, 2160)  # 4k video
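One reason the frame dimensions matter is that detections found inside a tile have to be expressed in full-frame coordinates before results from different tiles can be merged. The sketch below shows one way such a mapping could look. It assumes detections are normalized to the range 0..1 relative to their tile; the tile_box_to_frame helper and the tile rectangle format are hypothetical.

```python
IMG_SHAPE = (3840, 2160)  # 4k video

def tile_box_to_frame(box_norm, tile_rect, img_shape=IMG_SHAPE):
    """Map a tile-relative normalized box to full-frame pixel coordinates."""
    tx1, ty1, tx2, ty2 = tile_rect
    tile_w, tile_h = tx2 - tx1, ty2 - ty1
    img_w, img_h = img_shape
    x1, y1, x2, y2 = box_norm

    def clamp(value, upper):
        # Keep coordinates inside the frame.
        return max(0.0, min(float(upper), value))

    return (
        clamp(tx1 + x1 * tile_w, img_w),
        clamp(ty1 + y1 * tile_h, img_h),
        clamp(tx1 + x2 * tile_w, img_w),
        clamp(ty1 + y2 * tile_h, img_h),
    )

# A box covering the lower-right quarter of the top-right tile of a 2x2 grid.
frame_box = tile_box_to_frame((0.5, 0.5, 1.0, 1.0), (1920, 0, 3840, 1080))
```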