Support JFR emergency dumps on out of memory #10600

Open
roberttoyonaga opened this issue Jan 30, 2025 · 2 comments

roberttoyonaga (Collaborator) commented Jan 30, 2025

Is your feature request related to a problem? Please describe.
Native Image can currently produce a heap dump on out of memory (OOM), but it cannot yet produce a JFR emergency dump. OpenJDK has this feature, and we should implement it in Native Image as well. One of JFR's primary goals is to provide insight when the application crashes, for example with an OutOfMemoryError (OOME).

Describe the solution you'd like.
The JFR implementation in Native Image should support emergency dumping like in OpenJDK.

Describe who do you think will benefit the most.
GraalVM users would benefit the most. Heap dumps are probably the most important report in the event of OOM, but insights from JFR could also be very valuable. For example, JFR's CPU and allocation profiling can help locate problem areas, and its garbage collection events and thread data can help diagnose them.

Describe alternatives you've considered.
The alternative is just leaving this feature unimplemented.

Express whether you'd like to help contributing this feature
I can help contribute this.

Update #5410 when completed

Implementation details
Doing an emergency dump would require:

  1. Flushing in-flight data to the JFR chunk repository on disk.
  2. Opening a new file where we will write the emergency dump.
  3. Copying the contents of all the chunk files in the disk repository to the emergency dump.
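
Steps 2 and 3 can be sketched in ordinary Java as below. This is only an illustration of the file-level logic, not the Native Image implementation: it allocates freely, so it could not run on the real OOM path. The class name `EmergencyDumpSketch`, the method `copyChunks`, the `*.jfr` file pattern, and the assumption that chunk file names sort chronologically are all hypothetical. It also assumes that, as in OpenJDK, a recording file can be formed by concatenating chunk files.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class EmergencyDumpSketch {

    /**
     * Concatenates every "*.jfr" file found in the repository directory
     * into dumpFile and returns the number of bytes copied.
     * dumpFile is assumed to live outside the repository directory.
     */
    static long copyChunks(Path repository, Path dumpFile) throws IOException {
        // Collect the chunk files that were already flushed to disk (step 1).
        List<Path> chunks = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(repository, "*.jfr")) {
            for (Path p : stream) {
                chunks.add(p);
            }
        }
        // Assumption: chunk file names sort in chronological order.
        Collections.sort(chunks);

        long total = 0;
        // Step 2: open a new file for the emergency dump.
        try (FileChannel out = FileChannel.open(dumpFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // Step 3: append each chunk file to the dump.
            for (Path chunk : chunks) {
                try (FileChannel in = FileChannel.open(chunk, StandardOpenOption.READ)) {
                    long size = in.size();
                    long position = 0;
                    // transferTo may copy fewer bytes than requested, so loop.
                    while (position < size) {
                        position += in.transferTo(position, size - position, out);
                    }
                    total += size;
                }
            }
        }
        return total;
    }
}
```

A call such as `EmergencyDumpSketch.copyChunks(repositoryDir, dumpPath)` would then leave a single file containing all chunks in order. An actual emergency-dump path would need the same logic built from preallocated buffers and raw file operations.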

To do this, JFR flushing will have to be made fully allocation-free (most of it already is). A handful of places, such as JfrTypeRepository, will need to be reworked.
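
As a rough illustration of what "allocation free" means here, the sketch below reserves its memory once at startup and refuses to grow afterwards. It shows the general pattern only. The class `PreallocatedFlushBuffer` and its methods are hypothetical and are not part of the Native Image JFR code, which may manage its buffers quite differently.

```java
import java.nio.ByteBuffer;

final class PreallocatedFlushBuffer {
    // Size of one record in this sketch: two 8-byte longs.
    private static final int RECORD_BYTES = 2 * Long.BYTES;

    private final ByteBuffer buffer;

    PreallocatedFlushBuffer(int capacityBytes) {
        // The only allocation happens here, long before any OOM occurs.
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    /**
     * Appends one record without allocating. Returns false instead of
     * growing the buffer when there is not enough room left.
     */
    boolean tryWrite(long typeId, long value) {
        if (buffer.remaining() < RECORD_BYTES) {
            return false;
        }
        buffer.putLong(typeId);
        buffer.putLong(value);
        return true;
    }

    /** Number of bytes currently held in the buffer. */
    int bytesUsed() {
        return buffer.position();
    }

    /** Makes the whole buffer reusable after its contents were flushed. */
    void reset() {
        buffer.clear();
    }
}
```

The design choice to note is that a full buffer is reported to the caller rather than resized, because resizing would allocate at exactly the moment when no memory is available.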

roberttoyonaga self-assigned this Jan 30, 2025
roberttoyonaga changed the title from "Support JFR emergency dumps on OOM" to "Support JFR emergency dumps on out of memory" on Feb 3, 2025
roberttoyonaga (Collaborator, Author) commented

Hi @christianhaeubl, do you foresee any issues this could run into?

christianhaeubl (Member) commented Feb 4, 2025

I think you already summarized it nicely. The JFR implementation already avoids Java heap allocations most of the time, but I am sure you will encounter a few allocations that are problematic for the emergency dump (i.e., a bit more code needs to be @Uninterruptible).
