-
Notifications
You must be signed in to change notification settings - Fork 207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added multi-threaded scan matching using OpenMP #14
base: master
Are you sure you want to change the base?
Conversation
@@ -1,6 +1,6 @@ | |||
<package> | |||
<name>openslam_gmapping</name> | |||
<version>0.1.2</version> | |||
<version>0.1.3</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't change that file please
@@ -6,6 +6,7 @@ find_package(catkin) | |||
catkin_package( | |||
INCLUDE_DIRS include | |||
LIBRARIES gridfastslam scanmatcher sensor_base sensor_range sensor_odometry utils | |||
DEPENDS OpenMP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not necessary
if(OPENMP_FOUND) | ||
message(STATUS "OPENMP FOUND") | ||
set(OpenMP_FLAGS ${OpenMP_CXX_FLAGS}) # or if you use C: ${OpenMP_C_FLAGS} | ||
set(OpenMP_LIBS gomp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not necessary, please have a look at https://cmake.org/cmake/help/v3.0/module/FindOpenMP.html and http://stackoverflow.com/questions/17633513/cmake-cannot-find-openmp
add_library(gridfastslam | ||
gfsreader.cpp | ||
gridslamprocessor.cpp | ||
gridslamprocessor_tree.cpp | ||
motionmodel.cpp | ||
) | ||
target_link_libraries(gridfastslam scanmatcher sensor_range) | ||
|
||
target_link_libraries(gridfastslam scanmatcher sensor_range ${OpenMP_LIBS}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should not be changed I believe.
@@ -243,6 +245,7 @@ namespace GMapping { | |||
/**minimum score for considering the outcome of the scanmatching good*/ | |||
PARAM_SET_GET(double, minimumScore, protected, public, public); | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary blank line
@@ -15,6 +15,8 @@ | |||
#include <gmapping/scanmatcher/scanmatcher.h> | |||
#include "motionmodel.h" | |||
|
|||
#include <omp.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's necessary because I'm calling theomp_get_wtime()
function to measure execution time.
// the functions below performs side effect on the internal structure, | ||
//should be called only inside the processScan method | ||
private: | ||
|
||
std::vector<double> durations; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is that a member and not created before the for loop in the scanMatch function ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because I want to store the execution time of each function call. so it has to be valid outside the function's scope as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use two members such as sumscore and update_count to compute the average execution time, since the execution time of each function call is of little use?
Changed the things you mentioned. |
If I use an openmp function, I do have to link against it of course. |
I realized that I also need to keep the set(CXX_FLAGS) because no optimization is done otherwise. |
hi, i found using openmp would result in core dump such as "corrupted double-linked list" more often, did you ever met that? |
No, I never had that issue. Could you run valgrind on it and post the result? |
RESAMPLE** all processes on machine have died, roslaunch will exit |
Any follow ups? |
Hey everyone, |
it->weight+=l; | ||
it->weightSum+=l; | ||
m_particles[i].weight+=l; | ||
m_particles[i].weightSum+=l; | ||
|
||
//set up the selective copy of the active area | ||
//by detaching the areas that will be updated | ||
m_matcher.invalidateActiveArea(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with these two lines:
m_matcher.invalidateActiveArea();
m_matcher.computeActiveArea(m_particles[i].map, m_particles[i].pose, plainReading)
They change map and matcher and they are designed to work sequentially. I suppose we should split this loop in two,
something like
//parallel
#pragma omp parallel for reduction(+:sumScore)
for (int i = 0; i < m_particles.size(); i++){
OrientedPoint corrected;
double score, l, s;
score=m_matcher.optimize(corrected, m_particles[i].map, m_particles[i].pose, plainReading);
// it->pose=corrected;
if (score>m_minimumScore){
m_particles[i].pose=corrected;
// m_infoStream << "Pose was corrected!" << std::endl;
} else {
if (m_infoStream){
m_infoStream << "Scan Matching Failed, using odometry. Likelihood=" << l <<std::endl;
m_infoStream << "lp:" << m_lastPartPose.x << " " << m_lastPartPose.y << " "<< m_lastPartPose.theta <<std::endl;
m_infoStream << "op:" << m_odoPose.x << " " << m_odoPose.y << " "<< m_odoPose.theta <<std::endl;
}
}
m_matcher.likelihoodAndScore(s, l, m_particles[i].map, m_particles[i].pose, plainReading);
sumScore+=score;
m_particles[i].weight+=l;
m_particles[i].weightSum+=l;
m_matcher.invalidateActiveArea();
m_matcher.computeActiveArea(m_particles[i].map, m_particles[i].pose, plainReading);
}
//sequential
for (int i = 0; i < m_particles.size(); i++){
//set up the selective copy of the active area
//by detaching the areas that will be updated
m_matcher.invalidateActiveArea();
m_matcher.computeActiveArea(m_particles[i].map, m_particles[i].pose, plainReading);
}
Also, m_infoStream << ...
will be garbled if two or more threads don't get enough score. I suggest either omp critical
or plain removal of these debug lines.
EDIT:
Alternatively, in scanmatecher.cpp
in registerScan
you can place real allocation of memory inside a critical section.
#pragma omp critical(alloc_area)
{
map.storage().allocActiveArea();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eybee could you create new PR against melodic-devel
branch? we have switched from master
to melodic-branch
when we changed licence to BSD
I profiled the gmapping process and found that the scan matching particle filter is very time consuming.
It easily forces even more modern CPU cores to 100% load on larger particle numbers utilizing only one core.
With the few changes I made, the scan matching process is now multi-threaded using the OpenMP library.
This results in a much better performance of gmapping in total.
I tested the package on three different platforms;