Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added multi-threaded scan matching using OpenMP #14

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
10 changes: 9 additions & 1 deletion gridfastslam/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,9 +1,17 @@
find_package(OpenMP REQUIRED)
if(OPENMP_FOUND)
set(OpenMP_LIBS gomp)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
endif()

add_library(gridfastslam
gfsreader.cpp
gridslamprocessor.cpp
gridslamprocessor_tree.cpp
motionmodel.cpp
)
target_link_libraries(gridfastslam scanmatcher sensor_range)

target_link_libraries(gridfastslam scanmatcher sensor_range ${OpenMP_LIBS})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not be changed I believe.


install(TARGETS gridfastslam DESTINATION ${CATKIN_PACKAGE_LIB_DESTINATION})
5 changes: 3 additions & 2 deletions include/gmapping/gridfastslam/gridslamprocessor.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@
#include <gmapping/scanmatcher/scanmatcher.h>
#include "motionmodel.h"

#include <omp.h>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't be necessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's necessary because I'm calling theomp_get_wtime() function to measure execution time.



namespace GMapping {

Expand Down Expand Up @@ -305,11 +307,10 @@ namespace GMapping {
// stream in which to write the messages
std::ostream& m_infoStream;


// the functions below performs side effect on the internal structure,
//should be called only inside the processScan method
private:

std::vector<double> durations;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is that a member and not created before the for loop in the scanMatch function ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because I want to store the execution time of each function call. so it has to be valid outside the function's scope as well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use two members such as sumscore and update_count to compute the average execution time, since the execution time of each function call is of little use?

/**scanmatches all the particles*/
inline void scanMatch(const double *plainReading);
/**normalizes the particle weights*/
Expand Down
34 changes: 25 additions & 9 deletions include/gmapping/gridfastslam/gridslamprocessor.hxx
Original file line number Diff line number Diff line change
Expand Up @@ -8,15 +8,20 @@
If the scan matching fails, the particle gets a default likelihood.*/
inline void GridSlamProcessor::scanMatch(const double* plainReading){
// sample a new pose from each scan in the reference

double avg_duration = 0;
double sumScore=0;
for (ParticleVector::iterator it=m_particles.begin(); it!=m_particles.end(); it++){
double start = omp_get_wtime();

#pragma omp parallel for reduction(+:sumScore)
for (int i = 0; i < m_particles.size(); i++){
OrientedPoint corrected;
double score, l, s;
score=m_matcher.optimize(corrected, it->map, it->pose, plainReading);

score=m_matcher.optimize(corrected, m_particles[i].map, m_particles[i].pose, plainReading);
// it->pose=corrected;
if (score>m_minimumScore){
it->pose=corrected;
m_particles[i].pose=corrected;
// m_infoStream << "Pose was corrected!" << std::endl;
} else {
if (m_infoStream){
m_infoStream << "Scan Matching Failed, using odometry. Likelihood=" << l <<std::endl;
Expand All @@ -25,18 +30,29 @@ inline void GridSlamProcessor::scanMatch(const double* plainReading){
}
}

m_matcher.likelihoodAndScore(s, l, it->map, it->pose, plainReading);
m_matcher.likelihoodAndScore(s, l, m_particles[i].map, m_particles[i].pose, plainReading);
sumScore+=score;
it->weight+=l;
it->weightSum+=l;
m_particles[i].weight+=l;
m_particles[i].weightSum+=l;

//set up the selective copy of the active area
//by detaching the areas that will be updated
m_matcher.invalidateActiveArea();
Copy link

@topin89 topin89 Nov 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with these two lines:

m_matcher.invalidateActiveArea();
m_matcher.computeActiveArea(m_particles[i].map, m_particles[i].pose, plainReading)

They change map and matcher and they are designed to work sequentially. I suppose we should split this loop in two,
something like

//parallel
#pragma omp parallel for reduction(+:sumScore)
  for (int i = 0; i < m_particles.size(); i++){
    OrientedPoint corrected;
    double score, l, s;

    score=m_matcher.optimize(corrected, m_particles[i].map, m_particles[i].pose, plainReading);
    //    it->pose=corrected;
    if (score>m_minimumScore){
      m_particles[i].pose=corrected;
      // m_infoStream << "Pose was corrected!" << std::endl;
    } else {
	if (m_infoStream){
	  m_infoStream << "Scan Matching Failed, using odometry. Likelihood=" << l <<std::endl;
	  m_infoStream << "lp:" << m_lastPartPose.x << " "  << m_lastPartPose.y << " "<< m_lastPartPose.theta <<std::endl;
	  m_infoStream << "op:" << m_odoPose.x << " " << m_odoPose.y << " "<< m_odoPose.theta <<std::endl;
	}
    }

    m_matcher.likelihoodAndScore(s, l, m_particles[i].map, m_particles[i].pose, plainReading);
    sumScore+=score;
    m_particles[i].weight+=l;
    m_particles[i].weightSum+=l;


    m_matcher.invalidateActiveArea();
    m_matcher.computeActiveArea(m_particles[i].map, m_particles[i].pose, plainReading);
  }

//sequential
  for (int i = 0; i < m_particles.size(); i++){
    //set up the selective copy of the active area
    //by detaching the areas that will be updated
    m_matcher.invalidateActiveArea();
    m_matcher.computeActiveArea(m_particles[i].map, m_particles[i].pose, plainReading);
  }

Also, m_infoStream << ... will be garbled if two or more threads don't get enough score. I suggest either omp critical or plain removal of these debug lines.

EDIT:
Alternatively, in scanmatecher.cpp in registerScan you can place real allocation of memory inside a critical section.

#pragma omp critical(alloc_area)
    {
        map.storage().allocActiveArea();
    }

m_matcher.computeActiveArea(it->map, it->pose, plainReading);
m_matcher.computeActiveArea(m_particles[i].map, m_particles[i].pose, plainReading);
}
if (m_infoStream)

double duration = omp_get_wtime() - start;
durations.push_back(duration);
for(int i = 0; i < durations.size(); i++) {
avg_duration += durations[i];
}
if(durations.size() > 0) avg_duration /= durations.size();


if (m_infoStream) {
m_infoStream << "Average Scan Matching Score=" << sumScore/m_particles.size() << std::endl;
m_infoStream << "Average Scan Matching Time=" << avg_duration << "s" << std::endl;
}
}

inline void GridSlamProcessor::normalize(){
Expand Down