
Fix #1186: improve recycled buffer release logic #1187

Merged (1 commit) on Jan 14, 2024

Conversation

cowtowncoder
Member

No description provided.

@cowtowncoder cowtowncoder merged commit ec4ef5e into 2.17 Jan 14, 2024
5 checks passed
@cowtowncoder cowtowncoder deleted the tatu/2.17/1186-buffer-recycler-optimization branch January 14, 2024 00:41
@pjfanning
Member

pjfanning commented Jan 14, 2024

@cowtowncoder maybe I'm wrong but I still think we have a GC issue. FasterXML/jackson#204 mentions a GC issue. The new code still means that we need to GC an array - just possibly a different array (the smaller array). This still means GC. This code change may still be helpful generally but may not be enough.

I think something more is at play. Without a reproducible test case, this is all supposition. Should we not ask ourselves why this user is getting so many extra arrays created in the first place? Ideally, we should only create so many arrays and we should have free slots to add them back to buffer recycler. I suppose that we might need to create a new array if the buffer recycler has no free entries (they are all in use).

Maybe we need the ability to choose a recycler that blocks when all its arrays are in use. This would be optional - most users won't want this but users who are worried about object creation and GC might choose to block and wait for an array to be returned to the recycler.
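A blocking pool along these lines could be sketched roughly as below. This is a hypothetical illustration only, not Jackson's actual `RecyclerPool` API: the class name `BlockingBufferPool` and the `acquire`/`release` methods are invented for this sketch, and a plain `byte[]` stands in for a real `BufferRecycler`.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of an optional "blocking" pool: when all pooled
// buffers are checked out, acquire() waits for one to be released
// instead of allocating a fresh array (which would later become GC work).
public class BlockingBufferPool {
    private final BlockingQueue<byte[]> pool;

    public BlockingBufferPool(int capacity, int bufferSize) {
        pool = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) {
            pool.add(new byte[bufferSize]);
        }
    }

    // Blocks (up to the given timeout) until a pooled buffer is free;
    // returns null on timeout so the caller can decide whether to allocate.
    public byte[] acquire(long timeoutMs) throws InterruptedException {
        return pool.poll(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public void release(byte[] buffer) {
        pool.offer(buffer); // drop silently if the pool is somehow full
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingBufferPool p = new BlockingBufferPool(1, 4096);
        byte[] buf = p.acquire(100);
        assert buf != null : "first acquire should succeed";
        // Pool is now empty: a second acquire times out rather than allocating.
        assert p.acquire(10) == null : "second acquire should time out";
        p.release(buf);
        assert p.acquire(100) == buf : "released buffer is reused, not reallocated";
    }
}
```

As the comment above says, this behavior would have to be opt-in: most callers would rather allocate than block, but a caller worried about allocation/GC pressure could cap total buffers this way.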

@pjfanning
Member

@cowtowncoder I think we are stuck if the OP won't give us more detail or a reproducible case. If you are using jackson-databind to deserialize lots of input files then you will be creating lots of Java instances, not just byte and char arrays. Nothing the OP has shown suggests they have anything more than guesswork to point fingers specifically at buffer recycling. There are no stats from profiling tools. No heap dumps. Nada.

@cowtowncoder
Member Author

@pjfanning right, I added a similar note to #1186 (wrt this not avoiding possible GC for "extra" buffers).
And I agree that there is little evidence here. However, I do think that the change I made here is net positive.

But beside anything done in BufferRecycler container, there is much more important work to be done for #1117: changing RecyclerPool default implementation.

The issue at that level is that with ThreadLocal-based pooling, there will only be a single BufferRecycler instance per Thread. So if and when more than one instance of JsonParser / JsonGenerator is used within the same thread, recycling only avoids one of the allocations per buffer type.
But with all other (new) RecyclerPool implementations there will be a distinct BufferRecycler per parser/generator.
This may lead to more buffer retention, but it should make recycling itself much more effective.

(sidenote: this is where exposing metrics for hit/recycle rate would be useful, but that's much more work).
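The tradeoff described above, plus the hit-rate metric from the sidenote, can be shown with a toy sketch. None of the names below come from Jackson itself: `PoolSketch`, `Recycler`, `acquireShared`, and `releaseShared` are all invented for illustration, with a shared lock-free deque standing in for a non-ThreadLocal RecyclerPool implementation.

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicLong;

// Toy illustration of the tradeoff: a ThreadLocal pool holds exactly one
// recycler per thread (so a second parser on the same thread cannot get
// its own), while a shared lock-free deque hands out a distinct recycler
// per parser/generator and can track hit/miss counts.
public class PoolSketch {
    static class Recycler { final byte[] buf = new byte[8000]; }

    // ThreadLocal style: at most one recycler per thread.
    static final ThreadLocal<Recycler> PER_THREAD =
            ThreadLocal.withInitial(Recycler::new);

    // Shared-pool style: recyclers reused across parsers and threads.
    static final ConcurrentLinkedDeque<Recycler> SHARED = new ConcurrentLinkedDeque<>();
    static final AtomicLong hits = new AtomicLong();
    static final AtomicLong misses = new AtomicLong();

    static Recycler acquireShared() {
        Recycler r = SHARED.pollFirst();
        if (r != null) { hits.incrementAndGet(); return r; }
        misses.incrementAndGet();
        return new Recycler(); // pool empty: allocate (this is what GC later reclaims)
    }

    static void releaseShared(Recycler r) { SHARED.addFirst(r); }

    public static void main(String[] args) {
        // ThreadLocal pooling: two lookups on one thread yield the SAME
        // recycler, so nested parser+generator use can only recycle once.
        assert PER_THREAD.get() == PER_THREAD.get();

        // Shared pooling: first acquire misses (empty pool), the second,
        // after a release, hits and reuses the very same instance.
        Recycler a = acquireShared();
        releaseShared(a);
        Recycler b = acquireShared();
        assert a == b;
        System.out.println("hit rate: " + hits.get() + "/" + (hits.get() + misses.get()));
    }
}
```

The hit/miss counters are the kind of instrumentation the sidenote alludes to: a real pool would need to expose them without adding contention on the hot path.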
