Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TransformListener hangs in destructor when using dedicated thread #517

Open
zygfrydw opened this issue Apr 14, 2022 · 3 comments
Open

TransformListener hangs in destructor when using dedicated thread #517

zygfrydw opened this issue Apr 14, 2022 · 3 comments
Assignees

Comments

@zygfrydw
Copy link

Bug report

Required Info:

  • Operating System:
    • all systems, detected on Docker + Ubuntu 20.04
  • Installation type:
    • from binaries
  • Version or commit hash:
    • galactic
  • DDS implementation:
    • cyclone dds
  • Client library (if applicable):
    • N/A

Steps to reproduce issue

I have noticed timeouts in my unit tests for node using TransformListener because o TransformListener hangs on destructor.
Here is a simple test:

#include <gtest/gtest.h>
#include <rclcpp/rclcpp.hpp>
#include <tf2_ros/transform_listener.h>
#include <tf2_ros/buffer.h>

class MyNode : public rclcpp::Node
{
public:
  explicit MyNode(const rclcpp::NodeOptions& options) 
  : rclcpp::Node::Node("MyNode"),
  tf_buffer_(this->get_clock()),
  tf_listener_(this->tf_buffer_)
  {
  }

  virtual ~MyNode(){}

private:
  tf2_ros::Buffer tf_buffer_;
  tf2_ros::TransformListener tf_listener_;
};

TEST_F(MyNodeTest, I_should_have_renamed_this_test) {
  if (!rclcpp::ok()) {
    rclcpp::init(0, nullptr);
  }
  {
    auto my_node = std::make_unique<MyNode>();
    // Do some test with MyNode
  }
  // The test will block here and wait indefinatelly because of bug in tf_listener destructor
}

Expected behaviour

Execution will not block on TransformListener destructor indefinitely.

Actual behavior

Execution blocks on TransformListener destructor indefinitely.

Additional information

This bug is caused by race condition with running and stopping SingleTreadedThreadSafeExecutor in TransformListener.
The current implementation is:

auto executor = std::make_shared<rclcpp::executors::SingleThreadedExecutor>();
auto run_func =
[executor](rclcpp::node_interfaces::NodeBaseInterface::SharedPtr node_base_interface) {
    executor->add_node(node_base_interface);
    executor->spin();
    executor->remove_node(node_base_interface);
};
dedicated_listener_thread_ = thread_ptr(
new std::thread(run_func, node_base_interface),
[executor](std::thread * t) {
    executor->cancel();
    t->join();
    delete t;
});

If the dedicated_listener_thread_ is not able to run before main thread enters destructor for TransformListener the destructor hangs because dedicated_listener_thread_ is spinning.
The order of execution is as follow:

  1. main thread creates dedicated_listener_thread_
  2. main thread perform all its jobs and TransformListener destructor
  3. Main thread in TransformListener destructor calls executor->cancel() which sets executor.spinning = false
  4. Main thread waits for dedicated_listener_thread_ to finish
  5. Dedicated thread starts running and calls executor-spin()
  6. Dedicated thread at the beginning of executor->spin() sets executor.spinning = true
  7. Dedicated thread is spinning waiting for messages or cancel which will not happen because main thread is blocked in t->join();
@zygfrydw
Copy link
Author

I have fixed this problem in this pull request.

@AlexeyMerzlyakov
Copy link

Hi! I have experienced the same problem on similar testcase.
The code has been changed, but patch still could be moved there. Applying the patch from #518 did help me:

diff --git a/tf2_ros/include/tf2_ros/transform_listener.h b/tf2_ros/include/tf2_ros/transform_listener.h
index 01620a8..e522a72 100644
--- a/tf2_ros/include/tf2_ros/transform_listener.h
+++ b/tf2_ros/include/tf2_ros/transform_listener.h
@@ -36,6 +36,7 @@
 #include <memory>
 #include <thread>
 #include <utility>
+#include <atomic>
 
 #include "tf2/buffer_core.h"
 #include "tf2/time.h"
@@ -142,8 +143,14 @@ private:
 
       // Create executor with dedicated thread to spin.
       executor_ = std::make_shared<rclcpp::executors::SingleThreadedExecutor>();
+      keep_running_ = true;
       executor_->add_callback_group(callback_group_, node_base_interface_);
-      dedicated_listener_thread_ = std::make_unique<std::thread>([&]() {executor_->spin();});
+      dedicated_listener_thread_ = std::make_unique<std::thread>(
+        [&]() {
+          while (keep_running_.load()) {
+            executor_->spin_once();
+          }
+        });
       // Tell the buffer we have a dedicated thread to enable timeouts
       buffer_.setUsingDedicatedThread(true);
     } else {
@@ -162,6 +169,7 @@ private:
   std::unique_ptr<std::thread> dedicated_listener_thread_;
   rclcpp::CallbackGroup::SharedPtr callback_group_{nullptr};
   rclcpp::executors::SingleThreadedExecutor::SharedPtr executor_;
+  std::atomic_bool keep_running_;
 
   rclcpp::Node::SharedPtr optional_default_node_ = nullptr;
   rclcpp::Subscription<tf2_msgs::msg::TFMessage>::SharedPtr message_subscription_tf_;
diff --git a/tf2_ros/src/transform_listener.cpp b/tf2_ros/src/transform_listener.cpp
index 3f59d5f..5c37f73 100644
--- a/tf2_ros/src/transform_listener.cpp
+++ b/tf2_ros/src/transform_listener.cpp
@@ -61,6 +61,7 @@ TransformListener::TransformListener(tf2::BufferCore & buffer, bool spin_thread)
 TransformListener::~TransformListener()
 {
   if (spin_thread_) {
+    keep_running_.store(false);
     executor_->cancel();
     dedicated_listener_thread_->join();
   }

Any updates on this issue?

@PurpleStephyr
Copy link

Are there any plans to fix this issue? I am seeing the same bug on humble, and can confirm that this patch fixes it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants