
OOME Notifier INFO log generated when custom script is provided for -XX:OnOutOfMemoryError #7738

Open
brentm5 opened this issue Oct 4, 2024 · 2 comments
Labels: comp: crash tracking

Comments

@brentm5
Contributor

brentm5 commented Oct 4, 2024

We recently started seeing Datadog agent logs regarding the OOME notifier script. Below is an example log:

OOME notifier script value (/service/upload_heap_dump.sh) does not follow the expected format: <path>/dd_ome_notifier.(sh|bat) %p. OOME tracking is disabled.

We define our own script for the -XX:OnOutOfMemoryError action, which uploads heap dumps and performs other cleanup tasks. The log has caused some confusion among our teams, who believe our own implementation is being turned off, even though this is simply an info log. I understand logging when something is not set up correctly, but this does not appear to account for a JVM that defines its own action. Additionally, I do not see an easy way to turn off this functionality to remove the log.
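
For anyone who wants to confirm that a custom handler is still configured despite the log, here is a minimal sketch that reads the flag back from the running JVM. It assumes a HotSpot JVM, since `HotSpotDiagnosticMXBean` is HotSpot-specific. The class name `OomeFlagCheck` and the regex are hypothetical: the regex only approximates the format quoted in the log message and is not the tracer's actual validation logic.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.regex.Pattern;

public class OomeFlagCheck {
    // Hypothetical approximation of "<path>/dd_ome_notifier.(sh|bat) %p"
    // as quoted in the log message; NOT the tracer's real check.
    private static final Pattern DD_FORMAT =
        Pattern.compile("^.+[/\\\\]dd_ome_notifier\\.(sh|bat) %p$");

    /** Reads the raw -XX:OnOutOfMemoryError value from the running JVM (HotSpot only). */
    static String configuredOomeCommand() {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("OnOutOfMemoryError").getValue();
    }

    /** True when the value resembles the format named in the log message. */
    static boolean looksLikeDatadogNotifier(String value) {
        return DD_FORMAT.matcher(value).matches();
    }

    public static void main(String[] args) {
        String value = configuredOomeCommand();
        System.out.println("OnOutOfMemoryError = '" + value + "'");
        System.out.println("Resembles Datadog format: " + looksLikeDatadogNotifier(value));
    }
}
```

Started with our flag in place, this should print the custom script path. That would show the JVM still has the handler registered, and that the log only concerns Datadog's own OOME tracking.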

@jbachorik
Contributor

Hi @brentm5 - this line should be emitted only on startup and prefixed with a Datadog-specific component.
However, if this is causing inconvenience, we will move the notification to 'debug' level. It is still useful to have a notification explaining why the Datadog OOME notification might not be working in case of misconfiguration.

jbachorik self-assigned this Oct 5, 2024
jbachorik added the comp: crash tracking label Oct 5, 2024
@brentm5
Contributor Author

brentm5 commented Oct 8, 2024

For us, it would ideally be a debug-level log, as its current log level makes it appear that something is wrong and that OOME handling has been disabled.

One thing that would be nice is if we could better hook our own OOME handling into this so we can get the best of both worlds. Not sure if that is something others have expressed an interest in.


2 participants