
only inferring a single image frame once for a huge performance boost… #301

Closed
wants to merge 1 commit

Conversation

tomlankhorst (Contributor)

Author: @zoombinis

When I first used darknet_ros, I discovered that a single image frame from a video camera was being redundantly fed to inference up to 70 times on my NVIDIA Quadro RTX 3000!

To avoid wasting processing resources, this PR simply keeps track of the sequence ID of the last consumed image frame so that the node never runs inference on the same frame twice.
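
A minimal sketch of the idea (hypothetical: isNewFrame is illustrative and modeled on the darknet_ros buffer code, not taken from the actual patch):

#include <cstdint>

// Skip inference when the incoming frame carries the same sequence number
// as the last frame we consumed.
std::uint32_t prevSeq_ = 0;

bool isNewFrame(std::uint32_t seq) {
  if (seq == prevSeq_) {
    return false;  // same frame as last time: skip the redundant inference
  }
  prevSeq_ = seq;  // remember the frame we are about to consume
  return true;
}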

This resolves these issues:
#150
ZhiangChen/target_mapping#8

tomlankhorst (Contributor, Author)

@zoombinis Some tests fail: https://github.com/leggedrobotics/darknet_ros/pull/301/checks?check_run_id=1893822361
The build status shows 'Success', but only because the GitHub Action hasn't registered the failure yet.

…. Before, the same image would redundantly run inference multiple times, hogging resources
tomlankhorst force-pushed the feature/zoombinis-infer-single-image branch from 365b80b to 02f2170 on February 13, 2021, 15:01
ghost commented Feb 15, 2021

@tomlankhorst Should the unit test be setting a header on the image message (specifying the timestamp and, especially, the index)?

This change relies on the image having an index, which it always should, so the failures could just be due to the test setup. I'm away from my development environment for a while; sorry I can't verify right now.
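
For reference, a minimal sketch of such a test setup (hypothetical: makeTestFrame is illustrative and assumes ROS1 with cv_bridge, not the actual darknet_ros test code):

#include <cv_bridge/cv_bridge.h>
#include <opencv2/core/core.hpp>
#include <ros/ros.h>
#include <sensor_msgs/Image.h>

// Build a test frame whose header carries a consecutive sequence number and a
// timestamp, so the node's "have I seen this frame?" check has something to compare.
sensor_msgs::ImagePtr makeTestFrame(uint32_t seq) {
  cv_bridge::CvImage frame;
  frame.header.seq = seq;                // consecutive frame index
  frame.header.stamp = ros::Time::now(); // timestamp
  frame.header.frame_id = "camera";
  frame.encoding = "bgr8";
  frame.image = cv::Mat::zeros(480, 640, CV_8UC3);
  return frame.toImageMsg();
}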

Zorbing left a comment


The proposed changes do improve performance, but the results are no longer accurate all the time.

Because of the multithreading, the buffer works as follows (see the sketch after this list):

  • (buffIndex_ + 3) % 3 (i.e., simply buffIndex_) is the slot currently being fetched in fetch_thread
  • (buffIndex_ + 2) % 3 is the slot currently being processed in detect_thread
  • (buffIndex_ + 1) % 3 holds a completely processed frame, with detections, from the previous loop iteration
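
To make the rotation concrete, here is a hypothetical stand-alone illustration (not the actual darknet_ros code) of how the slot roles shift each iteration:

#include <cstdio>

int main() {
  int buffIndex_ = 0;
  for (int iter = 0; iter < 4; ++iter) {
    buffIndex_ = (buffIndex_ + 1) % 3;
    std::printf("iter %d: fetch slot %d, detect slot %d, publish slot %d\n",
                iter,
                (buffIndex_ + 3) % 3,    // being fetched (== buffIndex_)
                (buffIndex_ + 2) % 3,    // being detected
                (buffIndex_ + 1) % 3);   // processed last iteration, ready to publish
  }
  return 0;
}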

When a new image has been fetched ("new" as in carrying a new sequence number) that needs to be processed by the detection thread, the last image is still the old one ("old" as in carrying the old sequence number), which did not get any detections. When prevSeq_ != headerBuff_[(buffIndex_ + 2) % 3].seq yields true, the publishInThread function uses information from the previous iteration (from buff_[(buffIndex_ + 1) % 3]). Therefore it takes three iterations before an image is fully processed and its detections are published:

  1. fetching the image
  2. detecting objects
  3. publishing the bounding boxes and the detection image

Either the publishing is done right after detect_thread.join(), or it has to wait until the next iteration, as suggested in the following changes:

// Fix0: track the sequence number (consecutively increasing) of the last std_msgs::Header consumed for detection
std::uint32_t prevSeq_ = 0;
bool newImageForDetection = false;
bool hasDetectionsReady = false;
while (!demoDone_)
{
  buffIndex_ = (buffIndex_ + 1) % 3;
  // Fix1: check this isn't an image already seen
  newImageForDetection = (prevSeq_ != headerBuff_[(buffIndex_ + 2) % 3].seq);

  fetch_thread = std::thread(&YoloObjectDetector::fetchInThread, this);
  if (newImageForDetection)
  {
    // Fix2: only detect if this is an image we haven't seen before
    detect_thread = std::thread(&YoloObjectDetector::detectInThread, this);
  }

  // only publish bounding boxes if detection has been done in the last iteration
  if (hasDetectionsReady)
  {
    if (!demoPrefix_)
    {
      fps_ = 1. / (what_time_is_it_now() - demoTime_);
      demoTime_ = what_time_is_it_now();
      if (viewImage_)
      {
        displayInThread(0);
      }
      else
      {
        generate_image(buff_[(buffIndex_ + 1) % 3], ipl_);
      }
      publishInThread();
    }
    else
    {
      char name[256];
      sprintf(name, "%s_%08d", demoPrefix_, count);
      save_image(buff_[(buffIndex_ + 1) % 3], name);
      ++count;
    }
    // state that the image has been published
    hasDetectionsReady = false;
  }

  fetch_thread.join();
  if (newImageForDetection)
  {
    // Fix3: record the new sequence number to avoid detecting the same image more than once
    prevSeq_ = headerBuff_[(buffIndex_ + 2) % 3].seq;
    // Fix4: a detection was started this iteration, so wait for the thread to finish so it can be destroyed safely
    detect_thread.join();
    // only after the detect thread is joined, set this flag to true
    hasDetectionsReady = true;
  }
  if (!isNodeRunning())
  {
    demoDone_ = true;
  }
}

ghost commented Mar 21, 2021

@tomlankhorst I got the tests passing and even included @Zorbing's improvement.

As predicted, the issue was due to missing header information. The tests now correctly pass that information through. Here is my commit diff:

feature/zoombinis-infer-single-image...zoombinis:feature/zoombinis-infer-single-image

However, there's still some inaccuracy (<300%) since we're no longer iterating detection on the same image. Looking at the output, it's not unacceptable by some standards; good enough for me, at least :)

(Screenshots: good1, good2)

Zorbing commented Mar 22, 2021

I don't think you can change the source branch of a PR (changing the target branch works, though), so you'll probably have to create a new PR, link it, and close this one.

ghost mentioned this pull request on Mar 22, 2021
ghost commented Mar 22, 2021

@tomlankhorst I don't have the ability to close this PR, but here is a new one: #305

tomlankhorst (Contributor, Author)

Thanks for the new PR.
