Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ResteasyJackson2Provider: fix ConcurrentHashMap contention #1984

Closed

Conversation

jvmccarthy
Copy link

First post here, so let me thank you all for your hard work on an excellent project! 👍

Replace Jackson's AnnotationBundleKey with an implementation that uses .equals() instead of == to compare annotations. AnnotationBundleKey assumed annotation references were unique which breaks because the JRE doesn't cache parameter annotations, returning new references each time parameter annotations are queried. This was causing a severe performance degradation over time since matching ClassAnnotationKey instances had the same hashCode while failing equality checks, leading to a large and growing single bucket with identical objects and lock contention.

This issue was causing our APIs to eventually become CPU bound and unresponsive. I will file an issue with the Jackson team as well. We initially looked to fix AnnotationBundleKey but noticed in their tests that reference equality was intentional and could cause other issues if == was changed to .equals() in their code. Still, they need to be informed that there are some cases where equality fails in that class.

I'm not sure if this is the right branch for this code. We're currently using 3.6.3.Final so we needed a patch on the 3.6 branch. Let me know if there's anything I can do to help on this fix and if you need an issue filed as well. Thanks!

Replace Jackson's AnnotationBundleKey with an implementation that uses `.equals()` instead of `==` to compare annotations. AnnotationBundleKey assumed annotation references were unique which breaks because the JRE doesn't cache parameter annotations, returning new references each time parameter annotations are queried. This was causing a severe performance degradation over time since matching ClassAnnotationKey instances had the same `hashCode` while failing equality checks, leading to a large and growing single bucket with identical objects and lock contention.
@jvmccarthy
Copy link
Author

Jackson issue filed in jackson-jaxrs-providers#111 with a snippet showing the equality issue.

@asoldano
Copy link
Member

asoldano commented Apr 6, 2019

@jvmccarthy , very interesting report, thanks. Now, first of all we need:

  1. a PR against master, which I've just created cherrypicking your commit: ResteasyJackson2Provider: fix ConcurrentHashMap contention #1985
  2. you need to agree to the fact that all contributions to the project are considered to be under the terms of the ASL2.0 License: http://www.apache.org/licenses/LICENSE-2.0 . Is that ok?

We'll be reviewing your changes soon.

@asoldano asoldano added the 3.6 label Apr 6, 2019
@jvmccarthy
Copy link
Author

@asoldano thank you for the quick response and filling the PR against master!

I agree to the terms of contribution under the Apache 2.0 license. I have checked with my employer (HP Inc.) about contributions of this nature in the past and have received approval to make them.

I looked briefly at one of the build failures and don't believe they're related (the first failure I could find was having trouble unzipping a file, perhaps the build system disk is full). Let me know what else I can do to move this along. We're finding the issue very reproducible over here. If a minimally reproducible example is required, we can get one together for you.

@cowtowncoder
Copy link

A fix on Jackson side (FasterXML/jackson-jaxrs-providers#111) will go in 2.10.0. Not sure what release date would be, hope to get first pre-release/RC out by end of May.

@asoldano
Copy link
Member

asoldano commented Jun 5, 2019

@jvmccarthy , first of all, sorry for the late reply, I got pulled on other tasks. These days I eventually spent some time on this and figured out that I can't reproduce the problem on server side when running in WildFly. While I agree this should be fixed (as a matter of fact, Jackson 2.10.x will include the fix and RESTEasy will eventually consume that), I would be interested in knowing more about your scenario (the container on which RESTEasy is running, the resource methods signatures, whether you're suffering from the problem on client side, server side or both, etc.).
Can you share more info?
Btw I've also performed some preliminary perf tests with your patch and with a patch doing what @cowtowncoder has done in Jackson 2 (basically I copied his AnnotationBundleKey in RESTEasy), both seem to cause ~3% perf (throughput) regression in my testing scenario (which is not suffering from the flaw here)... this is probably because of the additional cost of equals() compared to ==

@cowtowncoder
Copy link

Hmmh. Measurable performance regression is bit unexpected since although I can see that equality is more expensive, it'd seem odd that in absolute terms it would be enough to show (assuming it's not just lookup code but with actual reads or writes).

@jvmccarthy
Copy link
Author

@asoldano thank you for reviewing this issue. I've been on extended vacation and only noticed your update today. Sorry for the delay.

I'm surprised you couldn't reproduce this issue. We saw this in Tomcat 8.5.40 with JRE 1.8.0_212. We were seeing this with resteasy's client API calling an API having this general form, basically a post request that accepts a cookie and uses a simple data object for the body. Stack traces were pointing to ResteasyJackson2Provider.writeTo() as a slowdown, and we tracked that down to the _writers map having hash collisions, turning the backing map into a growing and slowing array.

public interface SampleRestApi {
  @POST
  @Consumes(MediaType.APPLICATION_JSON)
  @Produces(MediaType.APPLICATION_JSON)
  public void someMethod(@CookieParam("Some-Cookie") String someCookie, SomePojo body);
}

The following snippet shows the underlying cause of the issue. Are you getting false output for a == b? We see that parameter annotations aren't cached, which leads to growing maps of _readers and _writers in ResteasyJackson2Provider. I'm also seeing this behavior of non-cached method parameters in JRE 11.0.3.

import javax.annotation.Nonnull;
import java.lang.annotation.Annotation;
import java.lang.reflect.Method;

public class AnnotationEqualitySnippet {
  @Nonnull
  public void annotatedMethod(@Nonnull String a) {}

  public static void main(String[] args) throws Exception {
    Method method = AnnotationEqualitySnippet.class.getMethod("annotatedMethod", String.class);
    
    // Observed method parameter annotations not cached and so don't have referential equality
    Annotation a = method.getParameterAnnotations()[0][0];
    Annotation b = method.getParameterAnnotations()[0][0];
    System.out.println(a == b); // false
    System.out.println(a.equals(b)); // true

    // Observed method annotations cached and have referential equality
    Annotation c = method.getAnnotations()[0];
    Annotation d = method.getAnnotations()[0];
    System.out.println(c == d); // true
    System.out.println(c.equals(d)); // true
  }
}

Thanks again for all the time and effort investigating this with me!

@asoldano
Copy link
Member

@jvmccarthy , apologise again for the late follow-up. I knew I had to write a specific test for this and that it would have taken a while... so I had to wait for the right time to do that. Anyway, I eventually managed to reproduce the issue in my environment, even if it's been hard. Few comments:

  • I believe that with a client built with the SampleRestApi mentioned above, the problem can't be reproduced, as the @CookieParam annotation is not on the entity parameter that's eventually processed by the ResteasyJackson2Provider (it would be passed a zero-length array as annotations parameter in the writeTo(..) invocation). So an annotation has to be applied on the entity parameter, I used @Valid for my testing.

  • In my tests I was originally performing multiple invocations using the same RESTEasy client; after the changes mentioned above, I was always getting the same annotation object instance. Eventually I thought you're probably re-building the client every time and the clients are all relying on the same provider instances (likely because the same ResteasyProviderFactory is resolved; note something has been changed here recently, but one way or the other you must be in the scenario where the inspection of the proxy interface is performed multiple times, while the same resteasy jackson provider is used). When I changed my test to create new clients, I eventually managed to reproduce the problem.

OK, having said this, I believe the fix should be merged. I'm sorry it has taken so much, but I really wanted to see this and understand. Later tonight I'll think if it's practical to write a test for this, then I'll merge. Thanks again, also for the patience.

@cowtowncoder
Copy link

On Jackson side: 2.10.0.pr1 still not out yet, but is in "Any Day Now" state, now that Java module info seems to be working for all components, dependencies.

@asoldano
Copy link
Member

Thanks for the update @cowtowncoder .
@jvmccarthy , I'm going to close this PR as the 3.6 branch is not maintaned anymore these days. The same fix has just been merged on master in #1985 and I'm going to merge it also on branch 3.9 (#2109).
As for the tests it looks like the only decent way for checking this would be have one of our performance tests cover this scenario too, I'm creating a jira about that soon. Thanks again.

@asoldano asoldano closed this Jul 18, 2019
@asoldano
Copy link
Member

Btw, the 3.9 series is meant to be fully backward compatible with 3.6, in case you want to upgrade as soon as it's available

@jvmccarthy
Copy link
Author

@asoldano and @cowtowncoder, thank you both for your efforts here. This was a subtle defect that was giving us grief, and I'm glad we could work together on getting fixes in place for it. Thanks again for the work in these projects!

@cowtowncoder
Copy link

@jvmccarthy +1 !

Also, 2.10.0.pr1 of Jackson is now available if anyone wants to check how behavior has been changed on Jackson side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants