C++ Demo - Object Detection (NanoDet) #232

Merged 9 commits on Feb 28, 2024

@ryan1288 ryan1288 commented Feb 24, 2024

This PR adds demo.cpp, a C++ equivalent of the existing Python demo for the NanoDet object detection model, and updates the README accordingly. The interface matches the C++ demos of the other models. On a video feed, the C++ demo more than doubles the FPS of the equivalent Python demo. The issue was pointed out in #135 (comment).

Average FPS on my laptop (AMD Ryzen 7 5800H) for a (640 x 480) webcam feed:

  • Python: 10 FPS
  • C++: 25 FPS

Testing

  • Run the demo for both single-image and video (with > 1 object) inputs
  • Confirm matching I/O and intermediate values

Run the Demo

Output Visualization Images from C++ and Python

image
Input
scene
C++
Command: ./build/opencv_zoo_object_detection_nanodet -i=scene.jpg
result_cpp
Python
Command: python3 demo.py -i scene.jpg
result_python

Output Video from C++ and Python

C++
Command: ./build/opencv_zoo_object_detection_nanodet
video_cpp_new
Python
Command: python3 demo.py
video_python

Confirm matching I/O and intermediate values

C++

Mat infer(const Mat& sourceImage)
{
    cout << "sourceImage - shape | type | first value | 3rd value" << endl;
    cout << sourceImage.size() << " | " << typeToString(sourceImage.type()) << " | " << sourceImage.at<float>(0) << " | " << sourceImage.at<float>(2) << endl;
    Mat blob = this->preProcess(sourceImage);
    cout << "blob - shape | type | first value | 10th value" << endl;
    cout << blob.size() << " | " << typeToString(blob.type()) << " | " << blob.at<float>(0) << " | " << blob.at<float>(9) << endl;
    this->net.setInput(blob);
    vector<Mat> modelOutput;
    this->net.forward(modelOutput, this->net.getUnconnectedOutLayersNames());
    cout << "modelOutput - size || first value's shape | type | first value | 10th value" << endl;
    cout << modelOutput.size() << " || " << modelOutput[0].size() << " | " << typeToString(modelOutput[0].type()) << " | " << modelOutput[0].at<float>(0) << " | " << modelOutput[0].at<float>(9) << endl;
    Mat preds = this->postProcess(modelOutput);
    cout << "preds - shape | type | first value | 6th value" << endl;
    cout << preds.size() << " | " << typeToString(preds.type()) << " | " << preds.at<float>(0) << " | " << preds.at<float>(5) << endl;
    return preds;
}

image
Python

def infer(self, srcimg):
    print("sourceImage - shape | type | first value | 3rd value")
    print(srcimg.shape, '|', srcimg.dtype, '|', srcimg[0,0,0], '|', srcimg[0,0,2])
    blob = self.pre_process(srcimg)
    print("blob - shape | type | first value | 10th value")
    print(blob.shape, '|', blob.dtype, '|', blob[0,0,0,0], '|', blob[0,0,0,9])
    self.net.setInput(blob)
    outs = self.net.forward(self.net.getUnconnectedOutLayersNames())
    print("modelOutput - size || first value's shape | type | first value | 10th value")
    print(len(outs), '||', outs[0].shape, '|', outs[0].dtype, '|', outs[0][0,0,0], '|', outs[0][0,0,9])
    preds = self.post_process(outs)
    print("preds - shape | type | first value | 6th value")
    print(preds.shape, '|', preds.dtype, '|', preds[0,0], '|', preds[0,5])
    return preds

image

Test Summary: The visualizations, intermediate values, and I/O (values, shapes, and types) are identical between the Python and C++ demos.
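
One way to automate this kind of check, instead of comparing printed values by eye, is to dump the same tensor from both demos and compare the two buffers element-wise with a tolerance. The helper below is a minimal sketch using only the standard library; `allClose` and its default tolerances are hypothetical and are not part of the demo.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true when both buffers have the same length
// and every pair of elements differs by at most absTol + relTol * |expected|.
inline bool allClose(const std::vector<float>& actual,
                     const std::vector<float>& expected,
                     float relTol = 1e-5f, float absTol = 1e-6f)
{
    assert(relTol >= 0.0f && absTol >= 0.0f);
    if (actual.size() != expected.size()) return false;
    for (std::size_t i = 0; i < actual.size(); ++i)
    {
        const float diff = std::fabs(actual[i] - expected[i]);
        // Written as a negated <= so that a NaN on either side fails the check.
        if (!(diff <= absTol + relTol * std::fabs(expected[i]))) return false;
    }
    return true;
}
```

A tolerance is used because floating-point results from two different builds are not guaranteed to match bit for bit, even when the pipelines are equivalent.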

@ryan1288 ryan1288 changed the title CPP Demo - Object Detection (NanoDet) C++ Demo - Object Detection (NanoDet) Feb 24, 2024
@ryan1288
Contributor Author

@fengyuentau I finally finished my research project so I'm super excited to finally start working on these projects!
This is my very first open-source contribution, so any feedback is greatly appreciated 😄. It took a while to figure out all the OpenCV interfaces and proper cv::Mat usage (but it was fun to learn!).

It's somewhat refactored but I can definitely clean it up significantly more if you'd like, such as

  1. Use namespaces properly without calling using namespace X
  2. Separate the NanoDet class into a separate nanodet.hpp, nanodet.cpp library
  3. Refactor helper functions to be more efficient and clean
  4. More descriptive comments
  5. Better error handling
  6. Remove magic numbers

I read the OpenCV Coding Style Guide, but I may well have missed some points. Please let me know if the code is not up to standard and I can clean it up further.

@ryan1288 ryan1288 marked this pull request as ready for review February 26, 2024 00:50
@fengyuentau fengyuentau self-requested a review February 26, 2024 03:13
@fengyuentau fengyuentau self-assigned this Feb 26, 2024
@fengyuentau fengyuentau added the demo anything related to demo in Python / C++ label Feb 26, 2024
@fengyuentau fengyuentau added this to the 4.10.0 milestone Feb 26, 2024
Member

@fengyuentau fengyuentau left a comment

I got two warnings building the demo:

cmake --build build
[ 50%] Building CXX object CMakeFiles/opencv_zoo_object_detection_nanodet.dir/demo.cpp.o
/workspace/opencv_zoo/models/object_detection_nanodet/demo.cpp:115:39: warning: left operand of comma operator has no effect [-Wunused-value]
        return projection.reshape(0, (4, projection.total() / 4));
                                      ^
/workspace/opencv_zoo/models/object_detection_nanodet/demo.cpp:238:14: warning: decomposition declarations are a C++17 extension [-Wc++17-extensions]
        auto [classIds, confidences] = getClassIdAndConfidences(scores);
             ^~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
[100%] Linking CXX executable opencv_zoo_object_detection_nanodet
[100%] Built target opencv_zoo_object_detection_nanodet

We use the C++11 standard, the same as OpenCV 4.x.
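
Both warnings have simple C++11-compatible fixes. The sketch below uses only the standard library to show the two language points involved: `(4, n / 4)` is a comma expression that evaluates to `n / 4` alone, and `std::tie` can stand in for a structured binding. `commaExpressionValue` and `firstClassId` are hypothetical helpers, and this `getClassIdAndConfidences` is a stand-in that only borrows the name from the warning; it is not the demo's actual function.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// The parenthesized pair in reshape(0, (4, total / 4)) is a comma expression:
// the 4 is evaluated and discarded, so only total / 4 reaches the callee.
inline std::size_t commaExpressionValue(std::size_t total)
{
    // The cast to void only silences -Wunused-value for this demonstration.
    return (static_cast<void>(4), total / 4);
}

// Hypothetical stand-in: per-row argmax returning two results as a tuple.
inline std::tuple<std::vector<int>, std::vector<float>>
getClassIdAndConfidences(const std::vector<std::vector<float>>& scores)
{
    std::vector<int> classIds;
    std::vector<float> confidences;
    for (const std::vector<float>& row : scores)
    {
        assert(!row.empty());
        std::size_t best = 0;
        for (std::size_t i = 1; i < row.size(); ++i)
            if (row[i] > row[best]) best = i;
        classIds.push_back(static_cast<int>(best));
        confidences.push_back(row[best]);
    }
    return std::make_tuple(classIds, confidences);
}

// C++11 replacement for `auto [classIds, confidences] = ...;`
inline int firstClassId(const std::vector<std::vector<float>>& scores)
{
    std::vector<int> classIds;
    std::vector<float> confidences;
    std::tie(classIds, confidences) = getClassIdAndConfidences(scores);
    return classIds.empty() ? -1 : classIds.front();
}
```

Because the comma expression already evaluated to `projection.total() / 4`, passing that value directly as the second argument of `reshape` should keep the behavior and remove the first warning.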

@ryan1288
Contributor Author

ryan1288 commented Feb 26, 2024

Appreciate the review, @fengyuentau! I've addressed the comments, but please flag any issues or suggest improvements.

Successfully tested on my system using the following commands:

  1. cmake -B build -D OPENCV_INSTALLATION_PATH=/path/to/opencv/build -D CMAKE_CXX_STANDARD=11 .
  2. cmake --build build
  3. ./build/opencv_zoo_object_detection_nanodet

On a side note, I observed a consistent format across all C++ demos. I refactored some functions, but if strict adherence is preferred, I can revert these changes. Let me know your thoughts.

Mat confidences = std::get<1>(classIdAndConfidences);

vector<int> indices;
NMSBoxes(boxesXYXY, confidences, probThreshold, iouThreshold, indices);
Member

You can also take a look at NMSBoxesBatched, which may simplify the code:

# get scores and class indices
scores = dets[:, 4:5] * dets[:, 5:]
max_scores = np.amax(scores, axis=1)
max_scores_idx = np.argmax(scores, axis=1)
keep = cv2.dnn.NMSBoxesBatched(boxes_xyxy.tolist(), max_scores.tolist(), max_scores_idx.tolist(), self.confThreshold, self.nmsThreshold)
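
To illustrate what the class-aware ("batched") variant changes compared with plain NMS, here is a minimal standard-library sketch of greedy NMS in which a box can only suppress boxes of the same class. It is not OpenCV's implementation: `Box`, `iou`, and `nmsPerClass` are hypothetical names, and the two thresholds only mirror the roles of `confThreshold` and `nmsThreshold` in the snippet above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Box { float x1, y1, x2, y2; }; // hypothetical XYXY box

// Intersection over union of two XYXY boxes.
inline float iou(const Box& a, const Box& b)
{
    const float w = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
    const float h = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
    const float inter = w * h;
    const float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    const float uni = areaA + areaB - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS in which a kept box suppresses only boxes of its own class.
// Returns the indices of the kept boxes, highest score first.
inline std::vector<int> nmsPerClass(const std::vector<Box>& boxes,
                                    const std::vector<float>& scores,
                                    const std::vector<int>& classIds,
                                    float scoreThreshold, float iouThreshold)
{
    assert(boxes.size() == scores.size() && scores.size() == classIds.size());
    std::vector<int> order(boxes.size());
    std::iota(order.begin(), order.end(), 0);
    // Visit candidates from highest to lowest score.
    std::stable_sort(order.begin(), order.end(),
                     [&scores](int a, int b) { return scores[a] > scores[b]; });

    std::vector<int> keep;
    for (int idx : order)
    {
        if (scores[idx] < scoreThreshold) continue; // drop low-confidence boxes
        bool suppressed = false;
        for (int kept : keep)
        {
            if (classIds[kept] == classIds[idx] &&
                iou(boxes[kept], boxes[idx]) > iouThreshold)
            {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) keep.push_back(idx);
    }
    return keep;
}
```

With plain NMS, two heavily overlapping boxes of different classes would compete and the lower-scoring one would be dropped; here both survive, which is why a separate per-class loop is no longer needed in the caller.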

Contributor Author

@ryan1288 ryan1288 Feb 27, 2024

Thank you for the suggestion! The code is now cleaner and more efficient, achieving a higher frame rate of 25-26 FPS (see PR description).

…ssID and confidence function, and use NMSBoxesBatched now.
@ryan1288
Contributor Author

(hope it's ok to ask you here 😅)
What's the recommended channel for discussing OpenCV projects with GSoC mentors? Is it possible to receive an invite or schedule a brief 5-10 minute chat with you or another mentor? I applied to the Project Discussion List but noticed limited activity.

My interest in long-term open-source projects aligns well with (C++ or Python):

  • opencv_zoo: A White Paper on Neural Network Quantization was a fun read! I have experience validating ONNX models and converting them to TRT for onboard inference. I'm generally interested in any projects related to this repository.
  • Multi-camera calibration: I have a background in geometric/traditional computer vision and have calibrated cameras before (though not multi-camera setups).
  • SLAM/NeRF: I understand and have worked with probabilistic robotics, completed a SLAM course, and re-implemented the original NeRF paper.

Thanks for your time and guidance @fengyuentau!

@fengyuentau
Member

What's the recommended channel for discussing OpenCV projects with GSoC mentors?

You can find others at https://groups.google.com/g/opencv-gsoc-202x. Normally, mentors become more active as the proposal submission stage approaches.

Member

@fengyuentau fengyuentau left a comment

Thank you for the contribution 👍

@fengyuentau fengyuentau merged commit f53754a into opencv:main Feb 28, 2024