Sporadically missing base_link tf frame for Marble HD2 robot #978

Open
malcolmst opened this issue Jul 18, 2021 · 10 comments
Labels
bug Something isn't working

Comments

@malcolmst

I have occasionally been seeing robots not starting correctly, and have been unsure about the cause.

I just had this issue reproduce locally with the Marble HD2 robot, and decided to dig in to take a closer look. With two identical robots (one started correctly and one did not), I ran rosrun tf view_frames in each solution container to compare the tf graphs. It turns out the robot which did not start correctly is missing the base_link tf frame.

I'll dig in a bit more if I can, as this is still running locally. I'll post an update here if I find out anything else.

[Image: missing_base_link]
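
The comparison described above can be sketched as a set difference over frame names. This is only a minimal sketch, assuming the frame names from each solution container have already been collected into plain Python lists (for example by reading them off the view_frames output). The helper names `strip_robot_prefix` and `missing_frames` and the example frame lists are hypothetical, not taken from the actual run.

```python
def strip_robot_prefix(frame, robot):
    """Remove a leading '<robot>/' so frames from two robots can be compared."""
    prefix = robot + "/"
    return frame[len(prefix):] if frame.startswith(prefix) else frame


def missing_frames(good_frames, good_robot, bad_frames, bad_robot):
    """Return frames present on the good robot but absent on the bad one."""
    good = {strip_robot_prefix(f, good_robot) for f in good_frames}
    bad = {strip_robot_prefix(f, bad_robot) for f in bad_frames}
    return sorted(good - bad)


# Illustrative frame lists only; real lists would come from each container.
good = ["X2N2/base_link", "X2N2/tilt_gimbal_link", "X2N2/base_link/imu_sensor"]
bad = ["X2N1/tilt_gimbal_link"]
print(missing_frames(good, "X2N2", bad, "X2N1"))
```

Stripping the robot name first is what lets the two otherwise identical robots be compared frame by frame.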

@malcolmst
Author

It appears to be the pose_static topic that is missing on the bad robot:

"Bad robot" solution container:

developer@59677bb4d5b4:~/subt_ws$ rostopic echo /X2N1/pose_static
WARNING: no messages received and simulated time is active.
Is /clock being published?

"Good robot" solution container:

developer@7692b3b77ba4:~/subt_ws$ rostopic echo /X2N2/pose_static
WARNING: no messages received and simulated time is active.
Is /clock being published?
transforms: 
  - 
    header: 
      seq: 0
      stamp: 
        secs: 368
        nsecs:         0
      frame_id: "X2N2/tilt_gimbal_link"
    child_frame_id: "X2N2/tilt_gimbal_link/camera_pan_tilt"
    transform: 
      translation: 
        x: 0.02
        y: 0.0
        z: 0.047725
      rotation: 
        x: 0.0
        y: 0.0
        z: 0.0
        w: 1.0
  - 
...

@malcolmst
Author

malcolmst commented Jul 18, 2021

Couple more points while this is still running:

  • Looking at the bridge container, the corresponding ign topic is available and it does reference base_link:
developer@fae71645aca6:~/subt_ws$ ign topic -e -t /model/X2N1/pose_static
pose {
  header {
    stamp {
      sec: 443
    }
    data {
      key: "frame_id"
      value: "X2N1::tilt_gimbal_link"
    }
    data {
      key: "child_frame_id"
      value: "X2N1::tilt_gimbal_link::camera_pan_tilt"
    }
  }
  name: "X2N1::tilt_gimbal_link::camera_pan_tilt"
  position {
    x: 0.02
    z: 0.047725
  }
  orientation {
    w: 1
  }
}
pose {
  header {
    stamp {
      sec: 443
    }
    data {
      key: "frame_id"
      value: "X2N1::base_link"
    }
    data {
      key: "child_frame_id"
      value: "X2N1::base_link::imu_sensor"
    }
  }
  name: "X2N1::base_link::imu_sensor"
  position {
  }
  orientation {
    w: 1
  }
}
...
pose {
  header {
    stamp {
      sec: 443
    }
    data {
      key: "frame_id"
      value: "X2N1"
    }
    data {
      key: "child_frame_id"
      value: "X2N1::base_link"
    }
  }
  name: "X2N1::base_link"
  position {
  }
  orientation {
    w: 1
  }
}
...
  • The ros_ign_bridge_pose_static node is running and pingable (but is just not actively publishing):
developer@fae71645aca6:~/subt_ws$ rosnode info /X2N1/ros_ign_bridge_pose_static
--------------------------------------------------------------------------------
Node [/X2N1/ros_ign_bridge_pose_static]
Publications: 
 * /X2N1/pose_static [tf2_msgs/TFMessage]
 * /rosout [rosgraph_msgs/Log]

Subscriptions: 
 * /clock [rosgraph_msgs/Clock]

Services: 
 * /X2N1/ros_ign_bridge_pose_static/get_loggers
 * /X2N1/ros_ign_bridge_pose_static/set_logger_level


contacting node http://fae71645aca6:46313/ ...
Pid: 192
Connections:
 * topic: /rosout
    * to: /rosout
    * direction: outbound (54279 - 172.17.0.2:48428) [28]
    * transport: TCPROS
 * topic: /X2N1/pose_static
    * to: /X2N1/pose_tf_broadcaster
    * direction: outbound (54279 - 172.17.0.2:48582) [26]
    * transport: TCPROS
 * topic: /clock
    * to: /ros_ign_bridge_fae71645aca6_65_3753376215699130642 (http://fae71645aca6:39117/)
    * direction: inbound (54724 - fae71645aca6:42595) [27]
    * transport: TCPROS

developer@fae71645aca6:~/subt_ws$ rosnode ping /X2N1/ros_ign_bridge_pose_static
rosnode: node is [/X2N1/ros_ign_bridge_pose_static]
pinging /X2N1/ros_ign_bridge_pose_static with a timeout of 3.0s
xmlrpc reply from http://fae71645aca6:46313/	time=18.280983ms
xmlrpc reply from http://fae71645aca6:46313/	time=0.701904ms
xmlrpc reply from http://fae71645aca6:46313/	time=0.655890ms

I will need to stop this shortly since it is running in AWS, but it appears that either the ros_ign_bridge_pose_static node is not correctly receiving the ign messages, or it is receiving them and for some reason is not publishing the ROS topic.
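
To narrow down which side is dropping frames, the ign and ROS outputs can be compared name by name. Judging only from the two outputs shown in this thread, the ROS frame name looks like the ign scoped name with `::` replaced by `/`. That mapping is an inference from the output, not confirmed bridge behavior, and the helper names below are hypothetical. A minimal sketch:

```python
def ign_to_ros_frame(scoped_name):
    """Map an ign scoped name to the ROS-style name (assumed '::' -> '/')."""
    return scoped_name.replace("::", "/")


def frames_missing_in_ros(ign_child_frames, ros_child_frames):
    """Return ROS-style names seen on the ign topic but not on the ROS topic."""
    expected = {ign_to_ros_frame(name) for name in ign_child_frames}
    return sorted(expected - set(ros_child_frames))


# child_frame_id values copied from the ign output above.
ign_children = [
    "X2N1::tilt_gimbal_link::camera_pan_tilt",
    "X2N1::base_link::imu_sensor",
    "X2N1::base_link",
]
# The ROS topic on the bad robot delivered no messages, so nothing was seen.
ros_children = []
print(frames_missing_in_ros(ign_children, ros_children))
```

With an empty ROS side every frame is reported missing, which matches the "bridge is silent" symptom rather than a single dropped transform.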

@peci1
Collaborator

peci1 commented Jul 18, 2021

I see the same behavior sporadically all the time, with all kinds of topics. A good example is the MARV robot with its 4 flippers, which are instructed to lift up at the beginning. Sometimes only 2 or 3 of them lift up and the rest are "dead" for the whole simulation. I guess this behavior has something to do with the number of nodes being started at once; there's probably a bug in either ign-transport or ROS...

I was reproducing this quite reliably with 3 MARV robots - the 3rd one never moved... But I haven't tried this config for over a month, so maybe something has changed...

@peci1
Collaborator

peci1 commented Jul 18, 2021

I'm pretty sure there's no time to solve this issue properly. But could we ask DARPA for a workaround? Would you check after each simulation whether each robot has moved at least a tiny bit, and restart the run if one hasn't? It'd really be a pity to lose points just because ros-ign publishers are not publishing...
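
The proposed post-run check could look roughly like the following. This is a sketch under stated assumptions: each robot's first and last position are already available as (x, y, z) tuples, the function name `robots_that_never_moved` is hypothetical, and the 0.05 m threshold is an arbitrary illustrative value, not anything agreed with DARPA.

```python
import math


def robots_that_never_moved(start_positions, end_positions, min_distance=0.05):
    """Return names of robots whose displacement is below min_distance (meters)."""
    stuck = []
    for name, start in start_positions.items():
        # A robot with no recorded final pose is treated as not having moved.
        end = end_positions.get(name, start)
        if math.dist(start, end) < min_distance:
            stuck.append(name)
    return sorted(stuck)


# Hypothetical robot names and positions, for illustration only.
starts = {"X1": (0.0, 0.0, 0.0), "X2": (1.0, 0.0, 0.0), "X3": (2.0, 0.0, 0.0)}
ends = {"X1": (3.2, 0.5, 0.0), "X2": (1.0, 0.0, 0.0)}
print(robots_that_never_moved(starts, ends))
```

A robot that is deliberately kept stationary by its team would be flagged as well, so a check like this would probably need an opt-out.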

@malcolmst
Author

I agree, that sounds like a good workaround. It should also be possible to detect a case like this from the rostopic stats logs, which I believe include ROS topic frequency. Maybe something could automatically detect when one of the expected topics is not being published.
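
Such an automatic check might be as small as the sketch below. It assumes the per-topic frequencies have already been extracted from the statistics logs into a dictionary; no particular log format is assumed, and the function name `silent_topics` and the example rate are hypothetical.

```python
def silent_topics(expected_topics, observed_rates_hz, min_rate_hz=0.0):
    """Return expected topics that are absent or at/below min_rate_hz."""
    return sorted(
        topic
        for topic in expected_topics
        if observed_rates_hz.get(topic, 0.0) <= min_rate_hz
    )


# Topic names from this issue; the rate value is made up for illustration.
expected = ["/X2N1/pose_static", "/X2N2/pose_static"]
observed = {"/X2N2/pose_static": 1.0}
print(silent_topics(expected, observed))
```

Treating a topic that is missing from the statistics the same as one reported at zero rate covers both ways a silent bridge could show up.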

@m3d
Contributor

m3d commented Jul 19, 2021

We also observe situations where a robot does not start (robotika, b0572ed1-b723-4c95-9ca7-e88d3431ccd8, drone D). It was not receiving the air_pressure topic. I am not sure how to check whether this is the same problem as the one reported here. A re-run worked without any problem.

@m3d
Contributor

m3d commented Jul 19, 2021

Probably another example: 9fe7d49d-f9fd-4d47-a5c8-d68272e14861, drone H (still running). Note that part of the strategy is that robots start at different times, but this one should start immediately ...

@malcolmst
Author

malcolmst commented Jul 19, 2021

I believe I've been seeing this same issue periodically too, but hadn't taken the time to debug what was causing it until I saw it locally (well, accessible to me on AWS anyway). It does seem to affect a variety of topics; I also noticed in one run the other day that a UAV was missing a depth camera topic.

For the missing pose topic, I didn't take the time to carefully debug the affected parameter_bridge in gdb, but if it helps at all I did dump the callstacks of each thread:

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7f6f414cdc00 (LWP 7033) "parameter_bridg" 0x00007f6f406faad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f6f406ecba8)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  2    Thread 0x7f6f387da700 (LWP 7146) "parameter_bridg" 0x00007f6f3f1d8a47 in epoll_wait (epfd=4, events=0x7f6f387d9620, maxevents=6, timeout=100) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  3    Thread 0x7f6f2ffd9700 (LWP 7147) "parameter_bridg" 0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f28005740, nfds=1, timeout=100) at ../sysdeps/unix/sysv/linux/poll.c:29
  4    Thread 0x7f6f37fd9700 (LWP 7148) "parameter_bridg" 0x00007f6f406faad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x55fb7fd5bf98)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  5    Thread 0x7f6f375c6700 (LWP 7161) "parameter_bridg" 0x00007f6f406fafb9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f6f375c5650, expected=0, futex_word=0x55fb7fd61a98)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:142
  6    Thread 0x7f6f36795700 (LWP 7162) "ZMQbg/0" 0x00007f6f3f1d8a47 in epoll_wait (epfd=13, events=0x7f6f36793d30, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  7    Thread 0x7f6f35f94700 (LWP 7163) "ZMQbg/1" 0x00007f6f3f1d8a47 in epoll_wait (epfd=15, events=0x7f6f35f92d30, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  8    Thread 0x7f6f35793700 (LWP 7164) "parameter_bridg" 0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f357928e0, nfds=3, timeout=250) at ../sysdeps/unix/sysv/linux/poll.c:29
  9    Thread 0x7f6f34f92700 (LWP 7165) "parameter_bridg" 0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f34f918e0, nfds=1, timeout=50) at ../sysdeps/unix/sysv/linux/poll.c:29
  10   Thread 0x7f6f2f7d8700 (LWP 7166) "parameter_bridg" 0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f2f7d78e0, nfds=1, timeout=52) at ../sysdeps/unix/sysv/linux/poll.c:29
  11   Thread 0x7f6f2efd7700 (LWP 7167) "parameter_bridg" 0x00007f6f406fb065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f6f2efd69a0, expected=0, futex_word=0x55fb7fd67178)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  12   Thread 0x7f6f2e7d6700 (LWP 7169) "parameter_bridg" 0x00007f6f406fafb9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f6f2e7d5650, expected=0, futex_word=0x55fb7fd58548)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:142

(gdb) bt
#0  0x00007f6f406faad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f6f406ecba8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f6f406ecbc0, cond=0x7f6f406ecb80) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f6f406ecb80, mutex=0x7f6f406ecbc0) at pthread_cond_wait.c:655
#3  0x00007f6f3f7778bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f6f4048f80b in ignition::transport::v9::waitForShutdown() () from target:/usr/lib/x86_64-linux-gnu/libignition-transport9.so.9
#5  0x000055fb7f84537d in ?? ()
#6  0x00007f6f3f0d8bf7 in __libc_start_main (main=0x55fb7f844980, argc=5, argv=0x7fff87b8e928, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff87b8e918)
    at ../csu/libc-start.c:310
#7  0x000055fb7f845b9a in ?? ()

(gdb) thread 1
[Switching to thread 1 (Thread 0x7f6f414cdc00 (LWP 7033))]
#0  0x00007f6f406faad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f6f406ecba8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
88	in ../sysdeps/unix/sysv/linux/futex-internal.h
(gdb) bt
#0  0x00007f6f406faad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f6f406ecba8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f6f406ecbc0, cond=0x7f6f406ecb80) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f6f406ecb80, mutex=0x7f6f406ecbc0) at pthread_cond_wait.c:655
#3  0x00007f6f3f7778bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f6f4048f80b in ignition::transport::v9::waitForShutdown() () from target:/usr/lib/x86_64-linux-gnu/libignition-transport9.so.9
#5  0x000055fb7f84537d in ?? ()
#6  0x00007f6f3f0d8bf7 in __libc_start_main (main=0x55fb7f844980, argc=5, argv=0x7fff87b8e928, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff87b8e918)
    at ../csu/libc-start.c:310
#7  0x000055fb7f845b9a in ?? ()

(gdb) thread 2
[Switching to thread 2 (Thread 0x7f6f387da700 (LWP 7146))]
#0  0x00007f6f3f1d8a47 in epoll_wait (epfd=4, events=0x7f6f387d9620, maxevents=6, timeout=100) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30	../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) bt
#0  0x00007f6f3f1d8a47 in epoll_wait (epfd=4, events=0x7f6f387d9620, maxevents=6, timeout=100) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007f6f4102ab37 in ros::poll_sockets(int, pollfd*, unsigned long, int) () from target:/opt/ros/melodic/lib/libroscpp.so
#2  0x00007f6f410aed48 in ros::PollSet::update(int) () from target:/opt/ros/melodic/lib/libroscpp.so
#3  0x00007f6f4103a635 in ros::PollManager::threadFunc() () from target:/opt/ros/melodic/lib/libroscpp.so
#4  0x00007f6f3e874bcd in ?? () from target:/usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1
#5  0x00007f6f406f46db in start_thread (arg=0x7f6f387da700) at pthread_create.c:463
#6  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 3
[Switching to thread 3 (Thread 0x7f6f2ffd9700 (LWP 7147))]
#0  0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f28005740, nfds=1, timeout=100) at ../sysdeps/unix/sysv/linux/poll.c:29
29	../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) bt
#0  0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f28005740, nfds=1, timeout=100) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f6f3eea4280 in XmlRpc::XmlRpcDispatch::work(double) () from target:/opt/ros/melodic/lib/libxmlrpcpp.so
#2  0x00007f6f4101ce38 in ros::XMLRPCManager::serverThreadFunc() () from target:/opt/ros/melodic/lib/libroscpp.so
#3  0x00007f6f3e874bcd in ?? () from target:/usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1
#4  0x00007f6f406f46db in start_thread (arg=0x7f6f2ffd9700) at pthread_create.c:463
#5  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 4
[Switching to thread 4 (Thread 0x7f6f37fd9700 (LWP 7148))]
#0  0x00007f6f406faad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x55fb7fd5bf98) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
88	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) bt
#0  0x00007f6f406faad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x55fb7fd5bf98) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x55fb7fd5bf48, cond=0x55fb7fd5bf70) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x55fb7fd5bf70, mutex=0x55fb7fd5bf48) at pthread_cond_wait.c:655
#3  0x00007f6f4108de0f in ros::ROSOutAppender::logThread() () from target:/opt/ros/melodic/lib/libroscpp.so
#4  0x00007f6f3e874bcd in ?? () from target:/usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1
#5  0x00007f6f406f46db in start_thread (arg=0x7f6f37fd9700) at pthread_create.c:463
#6  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 5
[Switching to thread 5 (Thread 0x7f6f375c6700 (LWP 7161))]
#0  0x00007f6f406fafb9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f6f375c5650, expected=0, futex_word=0x55fb7fd61a98) at ../sysdeps/unix/sysv/linux/futex-internal.h:142
142	in ../sysdeps/unix/sysv/linux/futex-internal.h
(gdb) bt
#0  0x00007f6f406fafb9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f6f375c5650, expected=0, futex_word=0x55fb7fd61a98) at ../sysdeps/unix/sysv/linux/futex-internal.h:142
#1  __pthread_cond_wait_common (abstime=0x7f6f375c5940, mutex=0x55fb7fd61a48, cond=0x55fb7fd61a70) at pthread_cond_wait.c:533
#2  __pthread_cond_timedwait (cond=0x55fb7fd61a70, mutex=0x55fb7fd61a48, abstime=0x7f6f375c5940) at pthread_cond_wait.c:667
#3  0x00007f6f41051f6f in ros::internal::condition_variable_monotonic::do_wait_until(boost::unique_lock<boost::mutex>&, timespec const&) () from target:/opt/ros/melodic/lib/libroscpp.so
#4  0x00007f6f41050831 in ros::CallbackQueue::callAvailable(ros::WallDuration) () from target:/opt/ros/melodic/lib/libroscpp.so
#5  0x00007f6f410901a1 in ros::internalCallbackQueueThreadFunc() () from target:/opt/ros/melodic/lib/libroscpp.so
#6  0x00007f6f3e874bcd in ?? () from target:/usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1
#7  0x00007f6f406f46db in start_thread (arg=0x7f6f375c6700) at pthread_create.c:463
#8  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 6
[Switching to thread 6 (Thread 0x7f6f36795700 (LWP 7162))]
#0  0x00007f6f3f1d8a47 in epoll_wait (epfd=13, events=0x7f6f36793d30, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30	../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) bt
#0  0x00007f6f3f1d8a47 in epoll_wait (epfd=13, events=0x7f6f36793d30, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007f6f3d13ea71 in ?? () from target:/usr/lib/x86_64-linux-gnu/libzmq.so.5
#2  0x00007f6f3d17ddc4 in ?? () from target:/usr/lib/x86_64-linux-gnu/libzmq.so.5
#3  0x00007f6f406f46db in start_thread (arg=0x7f6f36795700) at pthread_create.c:463
#4  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 7
[Switching to thread 7 (Thread 0x7f6f35f94700 (LWP 7163))]
#0  0x00007f6f3f1d8a47 in epoll_wait (epfd=15, events=0x7f6f35f92d30, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30	in ../sysdeps/unix/sysv/linux/epoll_wait.c
(gdb) bt
#0  0x00007f6f3f1d8a47 in epoll_wait (epfd=15, events=0x7f6f35f92d30, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007f6f3d13ea71 in ?? () from target:/usr/lib/x86_64-linux-gnu/libzmq.so.5
#2  0x00007f6f3d17ddc4 in ?? () from target:/usr/lib/x86_64-linux-gnu/libzmq.so.5
#3  0x00007f6f406f46db in start_thread (arg=0x7f6f35f94700) at pthread_create.c:463
#4  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 8
[Switching to thread 8 (Thread 0x7f6f35793700 (LWP 7164))]
#0  0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f357928e0, nfds=3, timeout=250) at ../sysdeps/unix/sysv/linux/poll.c:29
29	../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) bt
#0  0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f357928e0, nfds=3, timeout=250) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f6f3d18ac7d in zmq_poll () from target:/usr/lib/x86_64-linux-gnu/libzmq.so.5
#2  0x00007f6f404afed8 in ignition::transport::v9::NodeShared::RunReceptionTask() () from target:/usr/lib/x86_64-linux-gnu/libignition-transport9.so.9
#3  0x00007f6f3f77d6df in ?? () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f6f406f46db in start_thread (arg=0x7f6f35793700) at pthread_create.c:463
#5  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 9
[Switching to thread 9 (Thread 0x7f6f34f92700 (LWP 7165))]
#0  0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f34f918e0, nfds=1, timeout=28) at ../sysdeps/unix/sysv/linux/poll.c:29
29	in ../sysdeps/unix/sysv/linux/poll.c
(gdb) bt
#0  0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f34f918e0, nfds=1, timeout=28) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f6f3d18ac7d in zmq_poll () from target:/usr/lib/x86_64-linux-gnu/libzmq.so.5
#2  0x00007f6f4048c337 in ignition::transport::v9::pollSockets(std::vector<int, std::allocator<int> > const&, int) () from target:/usr/lib/x86_64-linux-gnu/libignition-transport9.so.9
#3  0x00007f6f404beab5 in ignition::transport::v9::Discovery<ignition::transport::v9::MessagePublisher>::RecvMessages() () from target:/usr/lib/x86_64-linux-gnu/libignition-transport9.so.9
#4  0x00007f6f3f77d6df in ?? () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f6f406f46db in start_thread (arg=0x7f6f34f92700) at pthread_create.c:463
#6  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 10
[Switching to thread 10 (Thread 0x7f6f2f7d8700 (LWP 7166))]
#0  0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f2f7d78e0, nfds=1, timeout=37) at ../sysdeps/unix/sysv/linux/poll.c:29
29	in ../sysdeps/unix/sysv/linux/poll.c
(gdb) bt
#0  0x00007f6f3f1cbcb9 in __GI___poll (fds=0x7f6f2f7d78e0, nfds=1, timeout=37) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f6f3d18ac7d in zmq_poll () from target:/usr/lib/x86_64-linux-gnu/libzmq.so.5
#2  0x00007f6f4048c337 in ignition::transport::v9::pollSockets(std::vector<int, std::allocator<int> > const&, int) () from target:/usr/lib/x86_64-linux-gnu/libignition-transport9.so.9
#3  0x00007f6f404c1975 in ignition::transport::v9::Discovery<ignition::transport::v9::ServicePublisher>::RecvMessages() () from target:/usr/lib/x86_64-linux-gnu/libignition-transport9.so.9
#4  0x00007f6f3f77d6df in ?? () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f6f406f46db in start_thread (arg=0x7f6f2f7d8700) at pthread_create.c:463
#6  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 11
[Switching to thread 11 (Thread 0x7f6f2efd7700 (LWP 7167))]
#0  0x00007f6f406fb065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f6f2efd69a0, expected=0, futex_word=0x55fb7fd67178) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) bt
#0  0x00007f6f406fb065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f6f2efd69a0, expected=0, futex_word=0x55fb7fd67178) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  __pthread_cond_wait_common (abstime=0x7f6f2efd69a0, mutex=0x55fb7fd670d8, cond=0x55fb7fd67150) at pthread_cond_wait.c:539
#2  __pthread_cond_timedwait (cond=0x55fb7fd67150, mutex=0x55fb7fd670d8, abstime=0x7f6f2efd69a0) at pthread_cond_wait.c:667
#3  0x00007f6f404a4ec6 in ignition::transport::v9::NodeSharedPrivate::PublishThread() () from target:/usr/lib/x86_64-linux-gnu/libignition-transport9.so.9
#4  0x00007f6f3f77d6df in ?? () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f6f406f46db in start_thread (arg=0x7f6f2efd7700) at pthread_create.c:463
#6  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 12
[Switching to thread 12 (Thread 0x7f6f2e7d6700 (LWP 7169))]
#0  0x00007f6f406fafb9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f6f2e7d5650, expected=0, futex_word=0x55fb7fd58548) at ../sysdeps/unix/sysv/linux/futex-internal.h:142
142	in ../sysdeps/unix/sysv/linux/futex-internal.h
(gdb) bt
#0  0x00007f6f406fafb9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f6f2e7d5650, expected=0, futex_word=0x55fb7fd58548) at ../sysdeps/unix/sysv/linux/futex-internal.h:142
#1  __pthread_cond_wait_common (abstime=0x7f6f2e7d5940, mutex=0x55fb7fd584f8, cond=0x55fb7fd58520) at pthread_cond_wait.c:533
#2  __pthread_cond_timedwait (cond=0x55fb7fd58520, mutex=0x55fb7fd584f8, abstime=0x7f6f2e7d5940) at pthread_cond_wait.c:667
#3  0x00007f6f41051f6f in ros::internal::condition_variable_monotonic::do_wait_until(boost::unique_lock<boost::mutex>&, timespec const&) () from target:/opt/ros/melodic/lib/libroscpp.so
#4  0x00007f6f41050831 in ros::CallbackQueue::callAvailable(ros::WallDuration) () from target:/opt/ros/melodic/lib/libroscpp.so
#5  0x00007f6f410a6f75 in ros::AsyncSpinnerImpl::threadFunc() () from target:/opt/ros/melodic/lib/libroscpp.so
#6  0x00007f6f3e874bcd in ?? () from target:/usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1
#7  0x00007f6f406f46db in start_thread (arg=0x7f6f2e7d6700) at pthread_create.c:463
#8  0x00007f6f3f1d871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

@nkoenig self-assigned this Jul 19, 2021

@realdealneil
Contributor

We are sometimes seeing this same type of issue on cloudsim. In some of the cloudsim run logs, a vehicle does a tf lookup and that vehicle's frames don't exist at all. For example, I launched 6 vehicles, and one of them had this issue and was unable to initialize properly. Throughout the log file, I get the following error:

2217.564000000 WARN /A2/cartographer_node [/tmp/binarydeb/ros-melodic-cartographer-ros-1.0.0/cartographer_ros/ros_log_sink.cc:55(ScopedRosLogSink::send)] [topics: /rosout, /tf, /A2/submap_list, /A2/trajectory_node_list, /A2/landmark_poses_list, /A2/constraint_list, /A2/scan_matched_points2, /statistics] W0723 06:29:12.000000 93359 tf_bridge.cc:52] "A2" passed to lookupTransform argument target_frame does not exist.

"A2" is the vehicle's name, so it should be able to find itself in the tf tree. This vehicle never started moving because of this issue.

@wolfgangschwab

We also have problems with robots not moving at all. In our cloudsim run named "ver89FQ", copter A4 does not start. It received no IMU data at all.
[Image: cloudsim_no-imu]

Is there any chance to solve this issue?

@nkoenig added the bug (Something isn't working) label Oct 11, 2021
@nkoenig removed their assignment Oct 6, 2022