Nodelets respawn unexpectedly and fail reloading #1531

Open
furushchev opened this issue Jun 19, 2017 · 7 comments
@furushchev
Member

When running nodelets, the whole process occasionally dies for some reason.
When respawn="true" is set, the respawn itself also seems to fail with some probability.

process[restaurant_perception_nodelet_manager-1]: started with pid [29789]
process[people_detection/input_image_relay-2]: started with pid [29795]
process[people_detection/throttle-3]: started with pid [29803]
process[people_detection/face_detection-4]: started with pid [29817]
process[in_shelf_object_detection_nodelet_manager-5]: started with pid [29852]
process[in_shelf_object_detection/input_relay-6]: started with pid [29856]
process[in_shelf_object_detection/floor_removal-7]: started with pid [29875]
process[in_shelf_object_detection/multi_plane_segmentation-8]: started with pid [29913]
process[in_shelf_object_detection/plane_reasoner-9]: started with pid [29939]
process[in_shelf_object_detection/plane_reasoner_decomposer-10]: started with pid [29963]
process[in_shelf_object_detection/robot_workspace_tf_publisher-11]: started with pid [30006]
process[in_shelf_object_detection/plane_distance_likelihood-12]: started with pid [30009]
process[in_shelf_object_detection/plane_likelihood_filter-13]: started with pid [30018]
process[in_shelf_object_detection/plane_magnifier-14]: started with pid [30067]
process[in_shelf_object_detection/polygon_array_transformer-15]: started with pid [30094]
process[in_shelf_object_detection/bilateral_filter-16]: started with pid [30106]
process[in_shelf_object_detection/voxel_grid-17]: started with pid [30120]
process[in_shelf_object_detection/plane_extraction-18]: started with pid [30124]
process[in_shelf_object_detection/euclidean_clustering-19]: started with pid [30126]
process[in_shelf_object_detection/cluster_decomposer-20]: started with pid [30136]
process[take_from_table_nodelet_manager-21]: started with pid [30145]
process[tabletop_object_detector/input_relay-22]: started with pid [30170]
process[tabletop_object_detector/passthrough-23]: started with pid [30189]
process[tabletop_object_detector/multi_plane_estimate-24]: started with pid [30211]
process[tabletop_object_detector/table_extractor-25]: started with pid [30221]
process[tabletop_object_detector/table_extractor_decomposer-26]: started with pid [30233]
process[tabletop_object_detector/table_polygon_likelihood_filter-27]: started with pid [30257]
process[tabletop_object_detector/filtering_table_polygon-28]: started with pid [30311]
process[tabletop_object_detector/polygon_to_polygon_array-29]: started with pid [30315]
process[tabletop_object_detector/polygon_array_transformer-30]: started with pid [30353]
[ INFO] [1494833581.120339038]: Initializing nodelet with 8 worker threads.
[ INFO] [1494833581.179103479]: Initializing nodelet with 8 worker threads.
process[tabletop_object_detector/voxel_filter-31]: started with pid [30369]
process[tabletop_object_detector/table_surface_object_extraction-32]: started with pid [30459]
process[tabletop_object_detector/clustering-33]: started with pid [30495]
process[tabletop_object_detector/cluster_decomposer-34]: started with pid [30524]
process[tabletop_object_detector/bbox_array_to_bbox-35]: started with pid [30576]
process[tabletop_object_detector/publish_tf_bbox-36]: started with pid [30619]
[ INFO] [1494833581.839976191]: Initializing nodelet with 8 worker threads.
[INFO] [WallTime: 1494833584.410268] launch bbox tf publisher
[ INFO] [1494833584.777466170]: instantiating tf::TransformListener
[ INFO] [1494833586.425024214]: instantiating tf::TransformListener
[ WARN] [1494833587.943898020]: '/people_detection/face_detection' subscribes topics only with child subscribers.
[ WARN] [1494833588.781218196]: ~output%02d are not published before subscribed, you should subscribe ~debug_output in debuging.
[ WARN] [1494833589.290510048]: '/in_shelf_object_detection/plane_likelihood_filter' subscribes topics only with child subscribers.
[WARN] [WallTime: 1494833589.419369] [/tabletop_object_detector/publish_tf_bbox] subscribes topics only with child subscribers. Set '~always_subscribe' as True to have it subscribe always.
[ WARN] [1494833589.572431583]: '/in_shelf_object_detection/plane_extraction' subscribes topics only with child subscribers.
[ WARN] [1494833589.754981714]: '/tabletop_object_detector/multi_plane_estimate' subscribes topics only with child subscribers.
[ WARN] [1494833590.179708182]: '/in_shelf_object_detection/plane_reasoner_decomposer' subscribes topics only with child subscribers.
[ WARN] [1494833590.567020005]: '/in_shelf_object_detection/polygon_array_transformer' subscribes topics only with child subscribers.
[ WARN] [1494833590.629797604]: '/in_shelf_object_detection/plane_magnifier' subscribes topics only with child subscribers.
[ WARN] [1494833591.423378086]: '/tabletop_object_detector/polygon_array_transformer' subscribes topics only with child subscribers.
[ WARN] [1494833591.623547605]: '/in_shelf_object_detection/euclidean_clustering' subscribes topics only with child subscribers.
[ WARN] [1494833592.419575686]: '/tabletop_object_detector/cluster_decomposer' subscribes topics only with child subscribers.
[ WARN] [1494833592.742550809]: '/in_shelf_object_detection/cluster_decomposer' subscribes topics only with child subscribers.
[ WARN] [1494833593.052135433]: '/tabletop_object_detector/clustering' subscribes topics only with child subscribers.
[ WARN] [1494833593.466370398]: '/tabletop_object_detector/bbox_array_to_bbox' subscribes topics only with child subscribers.
[ WARN] [1494833593.595885797]: '/tabletop_object_detector/table_polygon_likelihood_filter' subscribes topics only with child subscribers.
[ WARN] [1494833593.669023446]: '/tabletop_object_detector/table_extractor_decomposer' subscribes topics only with child subscribers.
[ WARN] [1494833594.177390984]: '/tabletop_object_detector/table_surface_object_extraction' subscribes topics only with child subscribers.
[ WARN] [1494833594.661611797]: '/tabletop_object_detector/table_extractor' subscribes topics only with child subscribers.
[ WARN] [1494833594.946915592]: '/tabletop_object_detector/polygon_to_polygon_array' subscribes topics only with child subscribers.
[ WARN] [1494833595.049390388]: '/tabletop_object_detector/filtering_table_polygon' subscribes topics only with child subscribers.
[in_shelf_object_detection/floor_removal-7] process has finished cleanly
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-floor_removal-7*.log
[in_shelf_object_detection/floor_removal-7] restarting process
process[in_shelf_object_detection/floor_removal-7]: started with pid [6411]
[ERROR] [1494833780.963584485]: Cannot load nodelet /in_shelf_object_detection/floor_removal for one exists with that name already
[FATAL] [1494833780.964170440]: Failed to load nodelet '/in_shelf_object_detection/floor_removal` of type `pcl/PassThrough` to manager `/in_shelf_object_detection_nodelet_manager'
[in_shelf_object_detection/floor_removal-7] process has died [pid 6411, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load pcl/PassThrough /in_shelf_object_detection_nodelet_manager ~input:=input_relay/output __name:=floor_removal __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-floor_removal-7.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-floor_removal-7*.log
[in_shelf_object_detection/floor_removal-7] restarting process
process[in_shelf_object_detection/floor_removal-7]: started with pid [6440]
[ERROR] [1494833782.185400345]: Cannot load nodelet /in_shelf_object_detection/floor_removal for one exists with that name already
[FATAL] [1494833782.185789113]: Failed to load nodelet '/in_shelf_object_detection/floor_removal` of type `pcl/PassThrough` to manager `/in_shelf_object_detection_nodelet_manager'
[in_shelf_object_detection/floor_removal-7] process has died [pid 6440, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load pcl/PassThrough /in_shelf_object_detection_nodelet_manager ~input:=input_relay/output __name:=floor_removal __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-floor_removal-7.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-floor_removal-7*.log
[in_shelf_object_detection/floor_removal-7] restarting process
process[in_shelf_object_detection/floor_removal-7]: started with pid [6484]
[ERROR] [1494833782.565997253]: Cannot load nodelet /in_shelf_object_detection/floor_removal for one exists with that name already
[FATAL] [1494833782.566426413]: Failed to load nodelet '/in_shelf_object_detection/floor_removal` of type `pcl/PassThrough` to manager `/in_shelf_object_detection_nodelet_manager'
[in_shelf_object_detection/floor_removal-7] process has died [pid 6484, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load pcl/PassThrough /in_shelf_object_detection_nodelet_manager ~input:=input_relay/output __name:=floor_removal __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-floor_removal-7.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-floor_removal-7*.log
[in_shelf_object_detection/floor_removal-7] restarting process
process[in_shelf_object_detection/floor_removal-7]: started with pid [6532]
[ERROR] [1494833785.231884430]: Lookup would require extrapolation into the past.  Requested time 1494833783.202391739 but the earliest data is at time 1494833785.066103925, when looking up transform from frame [head_rgbd_sensor_rgb_frame] to frame [base_link]
[ERROR] [1494833785.245710413]: [/in_shelf_object_detection/floor_removal::input_indices_callback] Error converting input dataset from head_rgbd_sensor_rgb_frame to base_link.
[ERROR] [1494833785.719620659]: Lookup would require extrapolation into the past.  Requested time 1494833784.059441360 but the earliest data is at time 1494833785.066103925, when looking up transform from frame [head_rgbd_sensor_rgb_frame] to frame [base_link]
[ERROR] [1494833785.819443191]: [/in_shelf_object_detection/floor_removal::input_indices_callback] Error converting input dataset from head_rgbd_sensor_rgb_frame to base_link.
[ERROR] [1494833785.880720792]: Lookup would require extrapolation into the past.  Requested time 1494833784.775936540 but the earliest data is at time 1494833785.066103925, when looking up transform from frame [head_rgbd_sensor_rgb_frame] to frame [base_link]
[ERROR] [1494833785.880819505]: [/in_shelf_object_detection/floor_removal::input_indices_callback] Error converting input dataset from head_rgbd_sensor_rgb_frame to base_link.
[ERROR] [1494833786.059197347]: Lookup would require extrapolation into the past.  Requested time 1494833785.061303784 but the earliest data is at time 1494833785.066103925, when looking up transform from frame [head_rgbd_sensor_rgb_frame] to frame [base_link]
[ERROR] [1494833786.059992646]: [/in_shelf_object_detection/floor_removal::input_indices_callback] Error converting input dataset from head_rgbd_sensor_rgb_frame to base_link.
[in_shelf_object_detection/voxel_grid-17] process has finished cleanly
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-voxel_grid-17*.log
[in_shelf_object_detection/voxel_grid-17] restarting process
process[in_shelf_object_detection/voxel_grid-17]: started with pid [10372]
[ERROR] [1494833917.455049234]: Cannot load nodelet /in_shelf_object_detection/voxel_grid for one exists with that name already
[FATAL] [1494833917.456441748]: Failed to load nodelet '/in_shelf_object_detection/voxel_grid` of type `pcl/VoxelGrid` to manager `/in_shelf_object_detection_nodelet_manager'
[in_shelf_object_detection/voxel_grid-17] process has died [pid 10372, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load pcl/VoxelGrid /in_shelf_object_detection_nodelet_manager ~input:=bilateral_filter/output __name:=voxel_grid __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-voxel_grid-17.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-voxel_grid-17*.log
[in_shelf_object_detection/voxel_grid-17] restarting process
process[in_shelf_object_detection/voxel_grid-17]: started with pid [10433]
[ERROR] [1494833918.140655873]: Cannot load nodelet /in_shelf_object_detection/voxel_grid for one exists with that name already
[FATAL] [1494833918.141456785]: Failed to load nodelet '/in_shelf_object_detection/voxel_grid` of type `pcl/VoxelGrid` to manager `/in_shelf_object_detection_nodelet_manager'
[in_shelf_object_detection/voxel_grid-17] process has died [pid 10433, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load pcl/VoxelGrid /in_shelf_object_detection_nodelet_manager ~input:=bilateral_filter/output __name:=voxel_grid __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-voxel_grid-17.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-voxel_grid-17*.log
[in_shelf_object_detection/voxel_grid-17] restarting process
process[in_shelf_object_detection/voxel_grid-17]: started with pid [10468]
[ERROR] [1494833918.612126023]: Cannot load nodelet /in_shelf_object_detection/voxel_grid for one exists with that name already
[FATAL] [1494833918.612695311]: Failed to load nodelet '/in_shelf_object_detection/voxel_grid` of type `pcl/VoxelGrid` to manager `/in_shelf_object_detection_nodelet_manager'
[in_shelf_object_detection/voxel_grid-17] process has died [pid 10468, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load pcl/VoxelGrid /in_shelf_object_detection_nodelet_manager ~input:=bilateral_filter/output __name:=voxel_grid __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-voxel_grid-17.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-voxel_grid-17*.log
[in_shelf_object_detection/voxel_grid-17] restarting process
process[in_shelf_object_detection/voxel_grid-17]: started with pid [10502]
[in_shelf_object_detection/plane_reasoner-9] process has finished cleanly
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-plane_reasoner-9*.log
[in_shelf_object_detection/plane_reasoner-9] restarting process
process[in_shelf_object_detection/plane_reasoner-9]: started with pid [13954]
[ERROR] [1494834027.099067796]: Cannot load nodelet /in_shelf_object_detection/plane_reasoner for one exists with that name already
[FATAL] [1494834027.099525436]: Failed to load nodelet '/in_shelf_object_detection/plane_reasoner` of type `jsk_pcl_utils/PlaneReasoner` to manager `/in_shelf_object_detection_nodelet_manager'
[in_shelf_object_detection/plane_reasoner_decomposer-10] process has finished cleanly
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-plane_reasoner_decomposer-10*.log
[in_shelf_object_detection/plane_reasoner_decomposer-10] restarting process
process[in_shelf_object_detection/plane_reasoner_decomposer-10]: started with pid [13981]
[in_shelf_object_detection/plane_reasoner-9] process has died [pid 13954, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load jsk_pcl_utils/PlaneReasoner /in_shelf_object_detection_nodelet_manager ~input:=input_relay/output ~input_inliers:=multi_plane_segmentation/output_refined ~input_polygons:=multi_plane_segmentation/output_refined_polygon ~input_coefficients:=multi_plane_segmentation/output_refined_coefficients __name:=plane_reasoner __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-plane_reasoner-9.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-plane_reasoner-9*.log
[in_shelf_object_detection/plane_reasoner-9] restarting process
process[in_shelf_object_detection/plane_reasoner-9]: started with pid [14000]
[ERROR] [1494834027.415259800]: Cannot load nodelet /in_shelf_object_detection/plane_reasoner_decomposer for one exists with that name already
[FATAL] [1494834027.416571464]: Failed to load nodelet '/in_shelf_object_detection/plane_reasoner_decomposer` of type `jsk_pcl/ClusterPointIndicesDecomposer` to manager `/in_shelf_object_detection_nodelet_manager'
[ERROR] [1494834027.480746146]: Cannot load nodelet /in_shelf_object_detection/plane_reasoner for one exists with that name already
[FATAL] [1494834027.481500521]: Failed to load nodelet '/in_shelf_object_detection/plane_reasoner` of type `jsk_pcl_utils/PlaneReasoner` to manager `/in_shelf_object_detection_nodelet_manager'
[in_shelf_object_detection/plane_reasoner-9] process has died [pid 14000, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load jsk_pcl_utils/PlaneReasoner /in_shelf_object_detection_nodelet_manager ~input:=input_relay/output ~input_inliers:=multi_plane_segmentation/output_refined ~input_polygons:=multi_plane_segmentation/output_refined_polygon ~input_coefficients:=multi_plane_segmentation/output_refined_coefficients __name:=plane_reasoner __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-plane_reasoner-9.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-plane_reasoner-9*.log
[in_shelf_object_detection/plane_reasoner_decomposer-10] process has died [pid 13981, exit code 255, cmd /opt/ros/indigo/lib/nodelet/nodelet load jsk_pcl/ClusterPointIndicesDecomposer /in_shelf_object_detection_nodelet_manager ~input:=input_relay/output ~target:=plane_reasoner/output_inliers ~align_planes:=plane_reasoner/output_polygons ~align_planes_coefficients:=plane_reasoner/output_coefficients __name:=plane_reasoner_decomposer __log:=/home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-plane_reasoner_decomposer-10.log].
log file: /home/m-takeda/.ros/log/20170509-180013_e9c27bf2-3495-11e7-acce-00306444a934/in_shelf_object_detection-plane_reasoner_decomposer-10*.log
@furushchev
Member Author

furushchev commented Jun 19, 2017

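I do not know why the whole process dies, but looking through the ROS logs as a whole, it seems to happen with nodelets in general, not only with the JSK nodelets. (→ a problem in nodelet_core or bond?)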

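The gdb log from one such crash is shown below; it appears that unloading the nodelet fails.
The nodelet loader keeps a dictionary of the nodelets it has loaded and consults it whenever a load is requested.
My guess is that when an unload fails, the nodelet is never removed from that dictionary, so the subsequent respawn runs into the error above.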
https://github.com/ros/nodelet_core/blob/6c561224958a575b604a067e149a55feb07044dc/nodelet/src/loader.cpp#L269
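
The hypothesis above can be sketched as a toy model. This is not the real `nodelet::Loader` (which is C++); `FakeLoader`, `StaleEntryError`, `failing_teardown` and `simulate_respawn` are hypothetical names introduced only for illustration, and the model assumes that the dictionary entry is erased only after the nodelet's teardown has returned normally.

```python
class StaleEntryError(RuntimeError):
    """Raised when a load is requested for a name that is still registered."""


class FakeLoader:
    """Toy stand-in for the manager-side loader (assumption, not the real API)."""

    def __init__(self):
        self._nodelets = {}  # nodelet name -> teardown callable

    def load(self, name, teardown):
        # Mirrors the check that produces the error seen in the roslaunch log.
        if name in self._nodelets:
            raise StaleEntryError(
                "Cannot load nodelet %s for one exists with that name already" % name
            )
        self._nodelets[name] = teardown

    def unload(self, name):
        teardown = self._nodelets[name]
        teardown()                # may raise, like the lock_error in the backtrace
        del self._nodelets[name]  # never reached if teardown() raised


def failing_teardown():
    raise RuntimeError("teardown failed")


def simulate_respawn(name="/in_shelf_object_detection/floor_removal"):
    """Load, fail to unload, then try to load again as a respawn would."""
    loader = FakeLoader()
    loader.load(name, failing_teardown)
    try:
        loader.unload(name)
    except RuntimeError:
        pass  # the entry is still in the dictionary at this point
    try:
        loader.load(name, lambda: None)
        return "reloaded"
    except StaleEntryError as err:
        return str(err)
```

In this model the second `load` is rejected because the stale entry survives the failed `unload`. Whether the real loader behaves the same way would still need to be confirmed against the source linked above.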

(gdb) 
#0  0x00007ffff60bfc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff60c3028 in __GI_abort () at abort.c:89
#2  0x00007ffff66c7535 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff66c56d6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff66c5703 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff66c5922 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff7bb0601 in void boost::throw_exception<boost::lock_error>(boost::lock_error const&) () from /opt/ros/indigo/lib/libnodeletlib.so
#7  0x00007ffff7bb0705 in boost::unique_lock<boost::mutex>::lock() () from /opt/ros/indigo/lib/libnodeletlib.so
#8  0x00007fffd71e2205 in unique_lock (m_=..., this=0x7fffffffa0f0) at /usr/include/boost/thread/lock_types.hpp:124
#9  message_filters::Signal1<jsk_recognition_msgs::PolygonArray_<std::allocator<void> > >::removeCallback (this=0x1486608, helper=...)
    at /opt/ros/indigo/include/message_filters/signal1.h:102
#10 0x00007fffd72634ca in disconnectAll (this=0x148cc00) at /opt/ros/indigo/include/message_filters/synchronizer.h:351
#11 message_filters::Synchronizer<message_filters::sync_policies::ExactTime<jsk_recognition_msgs::PolygonArray_<std::allocator<void> >, jsk_recognition_msgs::ModelCoefficientsArray_<std::allocator<void> >, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType> >::~Synchronizer (this=0x148cc00, __in_chrg=<optimized out>) at /opt/ros/indigo/include/message_filters/synchronizer.h:228
#12 0x00007fffd7263689 in destroy (this=0x148cbf8) at /usr/include/boost/smart_ptr/make_shared_object.hpp:57
#13 operator() (this=0x148cbf8) at /usr/include/boost/smart_ptr/make_shared_object.hpp:87
#14 boost::detail::sp_counted_impl_pd<message_filters::Synchronizer<message_filters::sync_policies::ExactTime<jsk_recognition_msgs::PolygonArray_<std::allocator<void> >, jsk_recognition_msgs::ModelCoefficientsArray_<std::allocator<void> >, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType> >*, boost::detail::sp_ms_deleter<message_filters::Synchronizer<message_filters::sync_policies::ExactTime<jsk_recognition_msgs::PolygonArray_<std::allocator<void> >, jsk_recognition_msgs::ModelCoefficientsArray_<std::allocator<void> >, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType, message_filters::NullType> > > >::dispose (this=0x148cbe0)
    at /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:153
#15 0x000000000040624e in boost::detail::sp_counted_base::release() ()
#16 0x00007fffd725aaf8 in ~shared_count (this=0x14865f8, __in_chrg=<optimized out>) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:371
#17 ~shared_ptr (this=0x14865f0, __in_chrg=<optimized out>) at /usr/include/boost/smart_ptr/shared_ptr.hpp:328
#18 ~PolygonArrayLikelihoodFilter (this=0x1486470, __in_chrg=<optimized out>)
    at /home/m-takeda/catkin_ws/src/jsk-ros-pkg/jsk_recognition/jsk_pcl_ros_utils/include/jsk_pcl_ros_utils/polygon_array_likelihood_filter.h:53
#19 jsk_pcl_ros_utils::PolygonArrayLikelihoodFilter::~PolygonArrayLikelihoodFilter (this=0x1486470, __in_chrg=<optimized out>)
    at /home/m-takeda/catkin_ws/src/jsk-ros-pkg/jsk_recognition/jsk_pcl_ros_utils/include/jsk_pcl_ros_utils/polygon_array_likelihood_filter.h:53
#20 0x00007ffff7bb0831 in void class_loader::ClassLoader::onPluginDeletion<nodelet::Nodelet>(nodelet::Nodelet*) () from /opt/ros/indigo/lib/libnodeletlib.so
#21 0x000000000040624e in boost::detail::sp_counted_base::release() ()
#22 0x00007ffff7baac09 in nodelet::Loader::unload(std::string const&) () from /opt/ros/indigo/lib/libnodeletlib.so
#23 0x00007ffff7bb3efb in nodelet::LoaderROS::unload(std::string const&) () from /opt/ros/indigo/lib/libnodeletlib.so
#24 0x00007ffff77644ed in bond::Bond::flushPendingCallbacks() () from /opt/ros/indigo/lib/libbondcpp.so
#25 0x00007ffff776467b in bond::Bond::onHeartbeatTimeout() () from /opt/ros/indigo/lib/libbondcpp.so
#26 0x00007ffff748a0a0 in ros::TimerManager<ros::WallTime, ros::WallDuration, ros::WallTimerEvent>::TimerQueueCallback::call() () from /opt/ros/indigo/lib/libroscpp.so
#27 0x00007ffff74b3107 in ros::CallbackQueue::callOneCB(ros::CallbackQueue::TLS*) () from /opt/ros/indigo/lib/libroscpp.so
#28 0x00007ffff74b3c53 in ros::CallbackQueue::callAvailable(ros::WallDuration) () from /opt/ros/indigo/lib/libroscpp.so
#29 0x00007ffff74fc175 in ros::SingleThreadedSpinner::spin(ros::CallbackQueue*) () from /opt/ros/indigo/lib/libroscpp.so
#30 0x00007ffff74e3d9b in ros::spin() () from /opt/ros/indigo/lib/libroscpp.so
#31 0x000000000040494e in main ()

@furushchev
Member Author

@k-okada @YoheiKakiuchi @mmurooka
I believe nodelets were used heavily for recognition during the DRC. Did this problem occur back then? (I vaguely remember that it did.)

@mmurooka
Member

cc @wkentaro

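Do you mean that it sometimes dies suddenly during execution, after startup has already finished, rather than at startup?
We had this problem during the DRC as well.
We could not solve it no matter what we tried, so @garaemon created standalone_complexed_nodelet and we used that instead.
The launch file for the DRC valve recognition uses standalone_complexed_nodelet:
https://github.com/jsk-ros-pkg/jsk_demos/blob/master/jsk_2015_06_hrp_drc/drc_task_common/launch/fc/valve_recognition.launch
The reason ordinary nodelets die is explained, at some length, here:
http://jsk-docs.readthedocs.io/en/latest/jsk_common/doc/jsk_topic_tools/lib/standalone_complexed_nodelet.html

That is how we dealt with it for the DRC, but the approach has not really been carried forward since then,
and I think the best solution is to make ordinary nodelets stop dying.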

@YoheiKakiuchi
Member

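> That is how we dealt with it for the DRC, but the approach has not really been carried forward since then,
> and I think the best solution is to make ordinary nodelets stop dying.

It is not being maintained, but almost everything that uses nodelets for multisense-related point clouds and the like has been made standalone.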
https://github.com/jsk-ros-pkg/jsk_common/blob/master/jsk_tilt_laser/launch/multisense_laser_pipeline.launch
https://github.com/jsk-ros-pkg/jsk_robot/blob/master/jsk_robot_common/jsk_robot_startup/launch/multisense_local.launch

@YoheiKakiuchi
Member

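It seems to me that there are two problems here. Couldn't they be treated separately?

  1. An unload/load is triggered by bond
  2. The process dies during the unload/load

If 2. were solved, wouldn't things more or less keep running, with topics occasionally dropping out during an unload/load?
Also, couldn't the time after which the heartbeat is judged to be lost, as described in @garaemon's document, be made sufficiently large?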

@furushchev
Member Author

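@mmurooka @YoheiKakiuchi Thank you for your comments.
Based on what is written at the link @mmurooka posted, there do seem to be two problems, as you pointed out.

> couldn't the time after which the heartbeat is judged to be lost be made sufficiently large?

Judging from the source code, this looks possible by recompiling the nodelet package.
I will try that first and see how it goes.
(The default timeout is 1 second.)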
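
To illustrate why a larger timeout might help, here is a minimal sketch of a heartbeat watchdog. It is a toy model, not the bond library's API: `bond_breaks` and the timestamps are hypothetical, and the 1-second value is simply the default mentioned in this comment.

```python
def bond_breaks(heartbeat_times, timeout):
    """Return the time at which the heartbeat is first judged lost, or None.

    heartbeat_times: ascending arrival times of heartbeats, in seconds.
    timeout: allowed silence after a heartbeat before the bond is considered broken.
    """
    for prev, cur in zip(heartbeat_times, heartbeat_times[1:]):
        if cur - prev > timeout:
            # The watchdog fires `timeout` seconds after the last heartbeat seen.
            return prev + timeout
    return None


# Hypothetical trace: heartbeats every 0.5 s, with one 2.5 s stall
# (for example while the manager is busy processing point clouds).
trace = [0.0, 0.5, 1.0, 3.5, 4.0]

short = bond_breaks(trace, timeout=1.0)  # 2.0: the stall exceeds the timeout
long_ = bond_breaks(trace, timeout=4.0)  # None: the same stall is tolerated
```

With the short timeout the stall is treated as a broken bond, which in the real system would trigger an unload; with the longer one the same trace passes. This only addresses problem 1, since a crash during unload would remain possible whenever a bond does break.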

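> The process dies during the unload/load

For this one, I will first try to isolate whether an error somewhere in the program (a leak, perhaps?) made the unload impossible or whether the unload procedure itself is at fault, and then debug as far as I can.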

@k-okada
Member

k-okada commented Apr 15, 2019

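> For this one, I will first try to isolate whether an error somewhere in the program (a leak, perhaps?) made the unload impossible or whether the unload procedure itself is at fault

@furushchev Let's isolate the cause and write test code for it.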
