Nodelet runs out of memory and dies #295

Open
varun-dhar opened this issue Feb 13, 2024 · 11 comments
@varun-dhar

Describe the bug
The nodelet runs out of memory, consuming 14G (out of 16G) of RAM and 10G of swap before being killed by the OOM killer.
Exact message:

[FATAL] [1707787340.579826749]: Failed to load nodelet '/ouster/os_driver` of type `ouster_ros/OusterDriver` to manager `os_nodelet_mgr'                                        
================================================================================REQUIRED process [ouster/os_nodelet_mgr-2] has died!
process has died [pid 8738, exit code -9, cmd /opt/ros/noetic/lib/nodelet/nodelet manager __name:=os_nodelet_mgr __log:=/root/.ros/log/280f4d1c-ca0e-11ee-8e5b-743af4340b7f/ouster-os_nodelet_mgr-2.log].
log file: /root/.ros/log/280f4d1c-ca0e-11ee-8e5b-743af4340b7f/ouster-os_nodelet_mgr-2*.log
Initiating shutdown!                         
================================================================================

The log /root/.ros/log/280f4d1c-ca0e-11ee-8e5b-743af4340b7f/ouster-os_nodelet_mgr-2.log does not exist.

To Reproduce
Steps to reproduce the behavior (steps below are just an example):

  1. source ros environment
  2. set viz in launch/driver.launch to false
  3. compile the project workspace
  4. source the project workspace
  5. roslaunch ouster_ros sensor or replay
  6. open another terminal and source the project workspace
  7. observe the issue ...

Platform (please complete the following information):

  • Ouster Sensor? OS-0-32-U1
  • Ouster Firmware Version? v2.4.0
  • ROS version/distro? noetic
  • Operating System? Linux
  • Machine Architecture? x64
  • git commit hash bb2ab24ac6b0ea480bce1d371b0e5e06a3d17b87
@varun-dhar varun-dhar added the bug Something isn't working label Feb 13, 2024
@Samahu Samahu self-assigned this Feb 13, 2024
@Samahu
Contributor

Samahu commented Feb 13, 2024

@doggo4242 Thanks for reporting the problem. How long does it take before the nodelet runs out of memory?

@varun-dhar
Author

About 2 minutes.
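
One way to quantify how quickly memory grows over those two minutes is to sample the resident set size of the nodelet manager process. The following is a minimal sketch, assuming a Linux `/proc` filesystem; the function names `read_rss_kb` and `sample_rss` are hypothetical helpers and are not part of the driver.

```python
import os
import time


def read_rss_kb(pid):
    """Return VmRSS in kB from /proc/<pid>/status, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/status") as status_file:
            for line in status_file:
                if line.startswith("VmRSS:"):
                    # Line looks like: "VmRSS:     12345 kB"
                    return int(line.split()[1])
    except OSError:
        # Process has exited, or /proc is not available on this system.
        return None
    return None


def sample_rss(pid, interval_s=1.0, samples=5):
    """Collect up to `samples` RSS readings, stopping early if the process is gone."""
    readings = []
    for _ in range(samples):
        rss = read_rss_kb(pid)
        if rss is None:
            break
        readings.append(rss)
        time.sleep(interval_s)
    return readings


# Demonstration on the current process; substitute the nodelet manager's pid.
print(sample_rss(os.getpid(), interval_s=0.1, samples=3))
```

A steadily increasing series of readings for the nodelet manager's pid would point at a leak or an unbounded queue rather than a one-time allocation.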

@Samahu
Contributor

Samahu commented Feb 26, 2024

Do you have other nodes running besides Ouster's driver nodelets?
Could you try disabling point cloud generation by removing the PCL flag, and see if the problem still occurs?

@varun-dhar
Author

varun-dhar commented Feb 27, 2024

No. I disabled it, same issue. Output is as follows:

... logging to /root/.ros/log/87bdaa00-d50e-11ee-8c9c-743af4340b7f/roslaunch-Varun-Laptop-1571.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://localhost:33965/

SUMMARY
========

PARAMETERS
 * /ouster/os_driver/imu_frame: os_imu
 * /ouster/os_driver/imu_port: 0
 * /ouster/os_driver/lidar_frame: os_lidar
 * /ouster/os_driver/lidar_mode: 
 * /ouster/os_driver/lidar_port: 0
 * /ouster/os_driver/metadata: 
 * /ouster/os_driver/point_cloud_frame: 
 * /ouster/os_driver/point_type: original
 * /ouster/os_driver/proc_mask: IMG|IMU|SCAN
 * /ouster/os_driver/ptp_utc_tai_offset: -37.0
 * /ouster/os_driver/scan_ring: 0
 * /ouster/os_driver/sensor_frame: os_sensor
 * /ouster/os_driver/sensor_hostname: os-122219004213.l...
 * /ouster/os_driver/tf_prefix: 
 * /ouster/os_driver/timestamp_mode: 
 * /ouster/os_driver/udp_dest: 
 * /ouster/os_driver/udp_profile_lidar: 
 * /rosdistro: noetic
 * /rosversion: 1.16.0

NODES
  /ouster/
    os_driver (nodelet/nodelet)
    os_nodelet_mgr (nodelet/nodelet)

auto-starting new master
process[master]: started with pid [1583]
ROS_MASTER_URI=http://localhost:11311

setting /run_id to 87bdaa00-d50e-11ee-8c9c-743af4340b7f
process[rosout-1]: started with pid [1595]
started core service [/rosout]
process[ouster/os_nodelet_mgr-2]: started with pid [1600]
process[ouster/os_driver-3]: started with pid [1601]
[ INFO] [1708996912.827699460]: Loading nodelet /ouster/os_driver of type ouster_ros/OusterDriver to manager os_nodelet_mgr with the following remappings:
[ INFO] [1708996912.836409742]: waitForService: Service [/ouster/os_nodelet_mgr/load_nodelet] has not been advertised, waiting...
[ INFO] [1708996913.353790782]: Initializing nodelet with 16 worker threads.
[ INFO] [1708996913.380715792]: waitForService: Service [/ouster/os_nodelet_mgr/load_nodelet] is now available.
[ WARN] [1708996913.483988351]: lidar port set to zero, the client will assign a random port number!
[ WARN] [1708996913.484007926]: imu port set to zero, the client will assign a random port number!
[ INFO] [1708996913.484017874]: Will use automatic UDP destination
[FATAL] [1708996967.972998309]: Failed to load nodelet '/ouster/os_driver` of type `ouster_ros/OusterDriver` to manager `os_nodelet_mgr'
================================================================================REQUIRED process [ouster/os_nodelet_mgr-2] has died!
process has died [pid 1600, exit code -11, cmd /opt/ros/noetic/lib/nodelet/nodelet manager __name:=os_nodelet_mgr __log:=/root/.ros/log/87bdaa00-d50e-11ee-8c9c-743af4340b7f/ouster-os_nodelet_mgr-2.log].
log file: /root/.ros/log/87bdaa00-d50e-11ee-8c9c-743af4340b7f/ouster-os_nodelet_mgr-2*.log
Initiating shutdown!
================================================================================
[ouster/os_driver-3] killing on exit
[ouster/os_nodelet_mgr-2] killing on exit
[rosout-1] killing on exit
[master] killing on exit
shutting down processing monitor...
... shutting down processing monitor complete
done

@Samahu
Contributor

Samahu commented Feb 27, 2024

No, that's a different issue. I don't see an out-of-memory problem in the last log.
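
The two logs in this thread end with different exit codes (-9 in the original report, -11 in the latest one). Assuming roslaunch follows the convention of Python's `subprocess` module, where a negative exit code means the process was terminated by that signal number, the codes can be decoded with a short sketch like the one below. `describe_exit_code` is a hypothetical helper name.

```python
import signal


def describe_exit_code(code):
    """Translate a subprocess-style exit code into a readable description."""
    if code < 0:
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {-code} ({name})"
    return f"exited normally with status {code}"


for code in (-9, -11):
    print(code, "->", describe_exit_code(code))
```

On Linux, -9 maps to SIGKILL, which is what the OOM killer sends, and -11 maps to SIGSEGV, a segmentation fault. That is consistent with the two logs showing two different failure modes.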

@Samahu
Contributor

Samahu commented Feb 27, 2024

Where did you get the idea that the process is dying because of an out of memory problem? I don't see that in the logs you provided.

There are multiple cases in which the following error occurs:

[FATAL] [1708996967.972998309]: Failed to load nodelet '/ouster/os_driver` of type `ouster_ros/OusterDriver` to manager `os_nodelet_mgr'
================================================================================REQUIRED process [ouster/os_nodelet_mgr-2] has died!

I would try gdb or verbose logging and see what that yields.

@varun-dhar
Author

My kernel logs say as much.

@Samahu
Contributor

Samahu commented Feb 27, 2024

> My kernel logs say as much.

You need to build the driver with the Debug or RelWithDebInfo build type and then run it with GDB enabled for the nodelet / nodelet manager.

@varun-dhar
Author

Couldn't get the driver to run properly under debug mode, but here's a screenshot of the kernel logs showing the OOM killer killing nodelet.
[Screenshot: kernel log]

Mar 25 15:55:12 arch kernel: Out of memory: Killed process 4072 (nodelet) total-vm:8739116kB, anon-rss:8389832kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:16556kB oom_score_adj:0
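
To read the numbers in that kernel message more easily, the line can be parsed and the sizes converted from kB. This is a minimal sketch that assumes the message format shown above; `parse_oom_line` is a hypothetical helper name.

```python
import re

OOM_LINE = (
    "Mar 25 15:55:12 arch kernel: Out of memory: Killed process 4072 (nodelet) "
    "total-vm:8739116kB, anon-rss:8389832kB, file-rss:0kB, shmem-rss:0kB, "
    "UID:0 pgtables:16556kB oom_score_adj:0"
)


def parse_oom_line(line):
    """Extract pid, process name, and all kB-valued fields from an OOM kill line."""
    match = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if match is None:
        return None
    # Collect every "name:<number>kB" pair, e.g. "anon-rss:8389832kB".
    sizes_kb = {key: int(value) for key, value in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(match.group(1)), "name": match.group(2), "kB": sizes_kb}


info = parse_oom_line(OOM_LINE)
rss_gib = info["kB"]["anon-rss"] / 1024**2
print(f"{info['name']} (pid {info['pid']}): anon-rss = {rss_gib:.2f} GiB")
```

For the line above this reports roughly 8.00 GiB of anonymous resident memory held by the `nodelet` process at the moment it was killed. The figure covers resident memory only and does not include pages that had been swapped out.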

@varun-dhar
Author

varun-dhar commented Mar 26, 2024

Never mind, got it running under gdb. Same output, no logs:

================================================================================REQUIRED process [ouster/os_driver-3] has died!
process has died [pid 2493, exit code -9, cmd /opt/ros/noetic/lib/nodelet/nodelet load ouster_ros/OusterDriver os_nodelet_mgr __name:=os_driver __log:=/root/.ros/log/3db0b168-eaea-11ee-99de-047c1656c365/ouster-os_driver-3.log].
log file: /root/.ros/log/3db0b168-eaea-11ee-99de-047c1656c365/ouster-os_driver-3*.log
Initiating shutdown!
================================================================================

@Samahu
Contributor

Samahu commented Apr 16, 2024

@doggo4242 This problem seems to happen only in your environment. I can't reproduce it, and I don't see others reporting the same problem. Please consider opening a support ticket so that support might be able to assist you better with the issue.
