C++ Demo - Object Detection (NanoDet) #232

ryan1288 · 2024-02-24T09:55:38Z

This PR adds demo.cpp, a C++ equivalent of the existing Python demo for the NanoDet object detection model, and updates the README accordingly. The interface matches the other C++ demos for the ML models in this repository. For a video feed, the C++ demo more than doubles the FPS of the equivalent Python demo. The need for this demo was pointed out in #135 (comment).

Average FPS on my laptop (AMD Ryzen 7 5800H) for a (640 x 480) webcam feed:

Python: 10 FPS
C++: 25 FPS

Testing

Run the demo for both single-image and video (with > 1 object) inputs
Confirm matching I/O and intermediate values

Run the Demo

Output Visualization Images from C++ and Python

Input

C++
Command: ./build/opencv_zoo_object_detection_nanodet -i=scene.jpg

Python
Command: python3 demo.py -i scene.jpg

Output Video from C++ and Python

C++
Command: ./build/opencv_zoo_object_detection_nanodet

Python
Command: python3 demo.py

Confirm matching I/O and intermediate values

C++

Mat infer(const Mat& sourceImage)
{
    cout << "sourceImage - shape | type | first value | 3rd value" << endl;
    cout << sourceImage.size() << " | " << typeToString(sourceImage.type()) << " | " << sourceImage.at<float>(0) << " | " << sourceImage.at<float>(2) << endl;
    Mat blob = this->preProcess(sourceImage);
    cout << "blob - shape | type | first value | 10th value" << endl;
    cout << blob.size() << " | " << typeToString(blob.type()) << " | " << blob.at<float>(0) << " | " << blob.at<float>(9) << endl;
    this->net.setInput(blob);
    vector<Mat> modelOutput;
    this->net.forward(modelOutput, this->net.getUnconnectedOutLayersNames());
    cout << "modelOutput - size || first value's shape | type | first value | 10th value" << endl;
    cout << modelOutput.size() << " || " << modelOutput[0].size() << " | " << typeToString(modelOutput[0].type()) << " | " << modelOutput[0].at<float>(0) << " | " << modelOutput[0].at<float>(9) << endl;
    Mat preds = this->postProcess(modelOutput);
    cout << "preds - shape | type | first value | 6th value" << endl;
    cout << preds.size() << " | " << typeToString(preds.type()) << " | " << preds.at<float>(0) << " | " << preds.at<float>(5) << endl;
    return preds;
}

Python

def infer(self, srcimg):
    print("sourceImage - shape | type | first value | 3rd value")
    print(srcimg.shape, '|', srcimg.dtype, '|', srcimg[0,0,0], '|', srcimg[0,0,2])
    blob = self.pre_process(srcimg)
    print("blob - shape | type | first value | 10th value")
    print(blob.shape, '|', blob.dtype, '|', blob[0,0,0,0], '|', blob[0,0,0,9])
    self.net.setInput(blob)
    outs = self.net.forward(self.net.getUnconnectedOutLayersNames())
    print("modelOutput - size || first value's shape | type | first value | 10th value")
    print(len(outs), '||', outs[0].shape, '|', outs[0].dtype, '|', outs[0][0,0,0], '|', outs[0][0,0,9])
    preds = self.post_process(outs)
    print("preds - shape | type | first value | 6th value")
    print(preds.shape, '|', preds.dtype, '|', preds[0,0], '|', preds[0,5])
    return preds

Test Summary: Both the visualizations and the intermediate values and I/O (values, shapes, and types) are identical between the Python and C++ demos.
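Spot-checking a few printed values works as a quick sanity check; a full element-wise comparison would be a stricter alternative. A minimal sketch, assuming both demos can dump a tensor into a flat float buffer (the dumping step, the tolerance, and the name `allClose` are assumptions for illustration and not part of this PR):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when both buffers have the same length and every pair of
// corresponding elements differs by at most `tol`.
inline bool allClose(const std::vector<float>& a,
                     const std::vector<float>& b,
                     float tol = 1e-5f)
{
    if (a.size() != b.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (std::fabs(a[i] - b[i]) > tol)
        {
            return false;
        }
    }
    return true;
}
```

A size mismatch is reported as a failure rather than silently comparing the common prefix, since a shape difference between the two demos is itself a bug worth catching.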

ryan1288 · 2024-02-24T10:13:17Z

@fengyuentau I finally finished my research project so I'm super excited to finally start working on these projects!
This is my very first open-sourced contribution so any feedback is greatly appreciated 😄. It took a while to figure out all the OpenCV interfaces and proper cv::Mat usage (but it was fun to learn!).

It's somewhat refactored, but I can definitely clean it up significantly more if you'd like, for example:

Use namespaces properly without calling using namespace X
Separate the NanoDet class into a separate nanodet.hpp, nanodet.cpp library
Refactor helper functions to be more efficient and clean
More descriptive comments
Better error handling
Remove magic numbers

I read the OpenCV Coding Style Guide, but I may very well have missed some points. Please let me know if it's not up to standards and I can clean it up further.

fengyuentau

I got two warnings building the demo:

cmake --build build
[ 50%] Building CXX object CMakeFiles/opencv_zoo_object_detection_nanodet.dir/demo.cpp.o
/workspace/opencv_zoo/models/object_detection_nanodet/demo.cpp:115:39: warning: left operand of comma operator has no effect [-Wunused-value]
        return projection.reshape(0, (4, projection.total() / 4));
                                      ^
/workspace/opencv_zoo/models/object_detection_nanodet/demo.cpp:238:14: warning: decomposition declarations are a C++17 extension [-Wc++17-extensions]
        auto [classIds, confidences] = getClassIdAndConfidences(scores);
             ^~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
[100%] Linking CXX executable opencv_zoo_object_detection_nanodet
[100%] Built target opencv_zoo_object_detection_nanodet
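The first warning comes from the built-in comma operator: in `(4, projection.total() / 4)` the `4` is evaluated and discarded, so the call is equivalent to `projection.reshape(0, projection.total() / 4)`. C++ has no Python-style tuple literal in this position. The standalone sketch below reproduces the effect without OpenCV; `rowsFromCommaExpr` and `rowsExplicit` are hypothetical names used only for illustration.

```cpp
#include <cstddef>

// Mimics the flagged expression. The built-in comma operator evaluates the
// left operand (4), discards it, and yields the right operand. The cast to
// void marks the discard as deliberate so this sketch itself compiles
// without -Wunused-value.
inline int rowsFromCommaExpr(std::size_t total)
{
    return static_cast<int>((static_cast<void>(4), total / 4));
}

// The same value written without the dead operand.
inline int rowsExplicit(std::size_t total)
{
    return static_cast<int>(total / 4);
}
```

For a buffer of 20 elements both functions return 5, which shows that the `4` in the original expression never reached `reshape`.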

We use the C++11 standard, the same as OpenCV 4.x.
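Under C++11, the structured binding flagged at demo.cpp:238 needs a different spelling. One option is `std::tie`; `std::get` on the returned object is another. The sketch below is a simplified stand-in that works on `std::vector` rather than `cv::Mat`, so the signature of `getClassIdAndConfidences` here is hypothetical and not the one used in demo.cpp; `unpackExample` is likewise an illustrative name.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for the helper in demo.cpp: for each row of class
// scores, return the index of the best class and its score.
std::pair<std::vector<int>, std::vector<float>>
getClassIdAndConfidences(const std::vector<std::vector<float>>& scores)
{
    std::vector<int> classIds;
    std::vector<float> confidences;
    for (const std::vector<float>& row : scores)
    {
        int best = 0;
        for (std::size_t c = 1; c < row.size(); ++c)
        {
            if (row[c] > row[best])
            {
                best = static_cast<int>(c);
            }
        }
        classIds.push_back(best);
        // An empty row has no best score; report 0 in that case.
        confidences.push_back(row.empty() ? 0.0f : row[best]);
    }
    return std::make_pair(classIds, confidences);
}

// C++11 replacement for:
//   auto [classIds, confidences] = getClassIdAndConfidences(scores);
inline void unpackExample(const std::vector<std::vector<float>>& scores,
                          std::vector<int>& classIds,
                          std::vector<float>& confidences)
{
    std::tie(classIds, confidences) = getClassIdAndConfidences(scores);
}
```

`std::tie` builds a tuple of references, so the assignment writes straight into the two existing variables; the cost is that they must be declared before the call.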

ryan1288 · 2024-02-26T08:03:58Z

Appreciate the review, @fengyuentau! I've addressed the comments, but please flag any issues or suggest improvements.

Successfully tested on my system using the following commands:

cmake -B build -D OPENCV_INSTALLATION_PATH=/path/to/opencv/build -D CMAKE_CXX_STANDARD=11 .
cmake --build build
./build/opencv_zoo_object_detection_nanodet

On a side note, I observed a consistent format across all C++ demos. I refactored some functions, but if strict adherence is preferred, I can revert these changes. Let me know your thoughts.

models/object_detection_nanodet/CMakeLists.txt

models/object_detection_nanodet/demo.cpp

fengyuentau · 2024-02-26T13:18:54Z

models/object_detection_nanodet/demo.cpp

+        Mat confidences = std::get<1>(classIdAndConfidences);
+
+        vector<int> indices;
+        NMSBoxes(boxesXYXY, confidences, probThreshold, iouThreshold, indices);


You can also take a look at NMSBoxesBatched, which may simplify the code:

opencv_zoo/models/object_detection_yolox/yolox.py

Lines 59 to 64 in fd2da74

# get scores and class indices
scores = dets[:, 4:5] * dets[:, 5:]
max_scores = np.amax(scores, axis=1)
max_scores_idx = np.argmax(scores, axis=1)

keep = cv2.dnn.NMSBoxesBatched(boxes_xyxy.tolist(), max_scores.tolist(), max_scores_idx.tolist(), self.confThreshold, self.nmsThreshold)

Thank you for the suggestion! The code is now cleaner and more efficient, achieving a higher frame rate of 25-26 FPS (see PR description).
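For context on the batched variant: plain NMS lets any higher-scoring box suppress an overlapping one regardless of class, while batched NMS only lets boxes of the same class suppress each other. The dependency-free sketch below illustrates that idea. It is not OpenCV's implementation, and `Box`, `iou`, and `classAwareNms` are names invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

struct Box { float x1, y1, x2, y2; };  // corner format (x1, y1, x2, y2)

// Intersection over union of two axis-aligned boxes.
inline float iou(const Box& a, const Box& b)
{
    const float w = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
    const float h = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
    const float inter = w * h;
    const float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    const float uni = areaA + areaB - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS in which a box can only be suppressed by a higher-scoring box
// of the same class. Returns the indices of kept boxes, best score first.
std::vector<int> classAwareNms(const std::vector<Box>& boxes,
                               const std::vector<float>& scores,
                               const std::vector<int>& classIds,
                               float scoreThreshold, float iouThreshold)
{
    // Visit candidates in descending score order.
    std::vector<int> order(boxes.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&scores](int a, int b) { return scores[a] > scores[b]; });

    std::vector<int> keep;
    for (int idx : order)
    {
        if (scores[idx] <= scoreThreshold)
        {
            continue;  // too weak to be a detection
        }
        bool suppressed = false;
        for (int kept : keep)
        {
            if (classIds[kept] == classIds[idx] &&
                iou(boxes[kept], boxes[idx]) > iouThreshold)
            {
                suppressed = true;
                break;
            }
        }
        if (!suppressed)
        {
            keep.push_back(idx);
        }
    }
    return keep;
}
```

With two heavily overlapping boxes of class 0 and a third box of class 1 at the same location, only the weaker class-0 box is dropped; the class-1 box survives because it never competes with class 0.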

ryan1288 · 2024-02-27T09:47:25Z

(hope it's ok to ask you here 😅)
What's the recommended channel for discussing OpenCV projects with GSoC mentors? Is it possible to receive an invite or schedule a brief 5-10 minute chat with you or another mentor? I applied to the Project Discussion List but noticed limited activity.

My interest in long-term open-source projects aligns well with (C++ or Python):

opencv_zoo: A White Paper on Neural Network Quantization was a fun read! I have experience validating ONNX models and converting them to TRT for onboard inference. I'm generally interested in any projects related to this repository.
Multi-camera calibration: I have a background in geometric/traditional computer vision and have calibrated cameras before (though not multi-camera setups).
SLAM/NeRF: I understand and have worked with probabilistic robotics, completed a SLAM course, and re-implemented the original NeRF paper.

Thanks for your time and guidance @fengyuentau!

fengyuentau · 2024-02-28T08:56:04Z

What's the recommended channel for discussing OpenCV projects with GSoC mentors?

You can find others at https://groups.google.com/g/opencv-gsoc-202x. Normally, mentors become more active as the proposal submission stage approaches.

fengyuentau

Thank you for the contribution 👍

ryan1288 added 6 commits February 24, 2024 03:11

Functional and Refactored demo.cpp for MobileNet

4f71599

Fix inference timer text, added timer reset.

e554eb2

Updated README.md, remove FPS text for single image processing

b937304

Add matching saved image message

b8e692f

Removing inference time printout for video inputs

d0c1147

Update FPS text to 2 decimal places

959a170

ryan1288 changed the title ~~CPP Demo - Object Detection (NanoDet)~~ C++ Demo - Object Detection (NanoDet) Feb 24, 2024

Update coding style for braces

1b021e0

ryan1288 mentioned this pull request Feb 24, 2024

Add C++ demos (Updated on 2024-06-03) #135

Open

ryan1288 marked this pull request as ready for review February 26, 2024 00:50

fengyuentau self-requested a review February 26, 2024 03:13

fengyuentau self-assigned this Feb 26, 2024

fengyuentau added the demo label (anything related to demo in Python / C++) Feb 26, 2024

fengyuentau added this to the 4.10.0 milestone Feb 26, 2024

fengyuentau reviewed Feb 26, 2024

Address PR comments. Adjusted to C++11 standard.

ef0cf9b

fengyuentau reviewed Feb 26, 2024

models/object_detection_nanodet/CMakeLists.txt (resolved)

fengyuentau reviewed Feb 26, 2024

models/object_detection_nanodet/demo.cpp (outdated, resolved)

fengyuentau reviewed Feb 26, 2024

Addressed PR comments. Added C++11 cmake configuration, extracted classID and confidence function, and use NMSBoxesBatched now.

9cafb71

fengyuentau approved these changes Feb 28, 2024

fengyuentau merged commit f53754a into opencv:main Feb 28, 2024
