Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a cache to LaunchedURLClassloader to improve startup performance #30016

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

zjulbj
Copy link

@zjulbj zjulbj commented Mar 1, 2022

Motivation

URLClassLoader shows awful performance when loading Classes/Resources which doesn't exist . URLClassPath takes linear time to find a resource and it will traverse all over the classpath even if the resource doesn't exist. This will obviously slow down the startup , which is even worse when running from an executable fat jar or if the application's class path is a long list.

 public Resource getResource(String name, boolean check) {
        if (DEBUG) {
            System.err.println("URLClassPath.getResource(\"" + name + "\")");
        }

        Loader loader;
        for (int i = 0; (loader = getLoader(i)) != null; i++) {
            Resource res = loader.getResource(name, check);
            if (res != null) {
                return res;
            }
        }
        return null;
    }

Scenarios which may lead to this issue are as follows.

1.Find a class repeatedly which apparently doesn't exist

  • java.beans.Introspector.findCustomizerClass
private static Class<?> findCustomizerClass(Class<?> type) {
      String name = type.getName() + "Customizer";
      try {
          type = ClassFinder.findClass(name, type.getClassLoader());
          // Each customizer should inherit java.awt.Component and implement java.beans.Customizer
          // according to the section 9.3 of JavaBeans specification
          if (Component.class.isAssignableFrom(type) && Customizer.class.isAssignableFrom(type)) {
              return type;
          }
      }
      catch (Exception exception) {
          // ignore any exceptions
      }
      return null;
  }

related issue:
spring-projects/spring-framework#13653

Here are the top 10 calls of loadClass in the startup of my application

Calls ClassName
480  java.lang.ObjectCustomizer
 249  org.springframework.beans.factory.AwareCustomizer
  99  org.springframework.beans.factory.InitializingBeanCustomizer
  93  org.springframework.core.OrderedCustomizer
  87  org.springframework.context.ApplicationContextAwareCustomizer
  61  org.springframework.beans.factory.BeanFactoryAwareCustomizer
  49  org.springframework.context.annotation.ConfigurationClassEnhancer$EnhancedConfigurationCustomizer
  41  org.springframework.context.ApplicationListenerCustomizer
  31  org.springframework.web.context.ServletContextAwareCustomizer
  29  org.springframework.beans.factory.BeanNameAwareCustomizer
  • org.apache.logging.log4j.core.lookup.Interpolator
 if (Constants.IS_WEB_APP) {
            try {
                strLookupMap.put(LOOKUP_KEY_WEB,
                    Loader.newCheckedInstanceOf("org.apache.logging.log4j.web.WebLookup", StrLookup.class));
            } catch (final Exception ignored) {
                handleError(LOOKUP_KEY_WEB, ignored);
            }
        } else {
            LOGGER.debug("Not in a ServletContext environment, thus not loading WebLookup plugin.");
        }
        try {
            strLookupMap.put(LOOKUP_KEY_DOCKER,
                Loader.newCheckedInstanceOf("org.apache.logging.log4j.docker.DockerLookup", StrLookup.class));
        } catch (final Exception ignored) {
            handleError(LOOKUP_KEY_DOCKER, ignored);
        }
        try {
            strLookupMap.put(LOOKUP_KEY_SPRING,
                    Loader.newCheckedInstanceOf("org.apache.logging.log4j.spring.boot.SpringLookup", StrLookup.class));
        } catch (final Exception ignored) {
            handleError(LOOKUP_KEY_SPRING, ignored);
        }
        try {
            strLookupMap.put(LOOKUP_KEY_KUBERNETES,
                    Loader.newCheckedInstanceOf("org.apache.logging.log4j.kubernetes.KubernetesLookup", StrLookup.class));
        } catch (final Exception | NoClassDefFoundError error) {
            handleError(LOOKUP_KEY_KUBERNETES, error);
        }

Here are the top 4 calls of loadClass in the startup of my application

Calls ClassName
225 org.apache.logging.log4j.docker.DockerLookup
225 org.apache.logging.log4j.kubernetes.KubernetesLookup
225 org.apache.logging.log4j.spring.cloud.config.client.SpringLookup
225 org.apache.logging.log4j.web.WebLookup
  • Many other techniques or frameworks which use loadClass to check class's existence
    For example,org.springframework.http.converter.json.Jackson2ObjectMapperBuilder
private void registerWellKnownModulesIfAvailable(MultiValueMap<Object, Module> modulesToRegister) {
        Class kotlinModuleClass;
        Module kotlinModule;
        try {
            kotlinModuleClass = ClassUtils.forName("com.fasterxml.jackson.datatype.jdk8.Jdk8Module", this.moduleClassLoader);
            kotlinModule = (Module)BeanUtils.instantiateClass(kotlinModuleClass);
            modulesToRegister.set(kotlinModule.getTypeId(), kotlinModule);
        } catch (ClassNotFoundException var6) {
        }

2.Find resource repeatedly which doesn't exist

Here are the top 4 calls of findResource/findResources in the startup of my application

Calls  Resource
106 META-INF/services/org.apache.logging.log4j.core.util.WatchEventService
  92 log4j2.system.properties
  64 META-INF/services/javax.xml.parsers.DocumentBuilderFactory
  29 META-INF/spring.components
  28 META-INF/services/javax.xml.transform.TransformerFactory
  18 java/lang/com.class
  18 Throwable.class
  18 Object.class
  18 Class.class
  17 String.class
   8 META-INF/services/javax.xml.datatype.DatatypeFactory
   5 config/application.yml
   5 config/application.yaml
   5 config/application.xml
   5 application.yml
   5 application.yaml
   5 application.xml

3.Find resource repeatedly

For example

Calls  Resource
  13 org/springframework/boot/autoconfigure/dao/PersistenceExceptionTranslationAutoConfiguration.class
  13 org/springframework/boot/actuate/autoconfigure/health/HealthEndpointConfiguration.class
  12 org/springframework/orm/ibatis/support/SqlMapClientDaoSupport.class
  12 org/springframework/dao/support/DaoSupport.class
  12 org/springframework/core/io/ByteArrayResource.class
  12 org/springframework/core/io/AbstractResource.class
  12 com/github/javaparser/ast/visitor/VoidVisitorAdapter.class
   8 config/application.properties
   7 application.properties
   5 config/application-local.properties
   3 org/apache/http/client/version.properties

4.Reflection repeatedly triggers sun.reflect.GeneratedMethodAccessor<N> class loading in runtime

All the cases listed above happen at startup time, while there are cases in runtime. This will make runtime performance worst. Here is a case which triggers sun.reflect.GeneratedMethodAccessor<N> class loading : https://blog.actorsfit.com/a?ID=00001-e5b50c75-c8a5-4cdb-9b0c-5558fb985a60

  at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
        - waiting to lock <0x0000000088104b60> (a java.lang.Object)
        at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:94)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.logging.log4j.core.impl.ThrowableProxy.loadClass(ThrowableProxy.java:539)
        at org.apache.logging.log4j.core.impl.ThrowableProxy.toExtendedStackTrace(ThrowableProxy.java:660)
        at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:137)
        at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:94)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.logging.log4j.core.impl.ThrowableProxy.loadClass(ThrowableProxy.java:539)
        at org.apache.logging.log4j.core.impl.ThrowableProxy.toExtendedStackTrace(ThrowableProxy.java:660)
        at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:137)
        at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:121)
        at org.apache.logging.log4j.core.impl.Log4jLogEvent.getThrownProxy(Log4jLogEvent.java:555)
        at org.apache.logging.log4j.core.pattern.ExtendedThrowablePatternConverter.format(ExtendedThrowablePatternConverter.java:61)

We generally solve these issues mentioned above case by case and solutions are normally tricky. Is there any final solution to solve this fundamentally?

Approach

This commit trying to fundamentally solve this issue by adding a ClassLoaderCache to LaunchedURLClassLoader .

  • Cache ClassNotFoundException and fast throw it for the next time
  • Cache getResource result directly
  • Cache getResources NOT-EXIST result
  • Set system property loader.cache.enable=true to enable cache

Benefit

Here are the startup performance of our applications after this enhancement, which shows a 50% acceleration on average

App No Cache Cache Improvement
app1 75s 39s 48%
app2 83s 32s 61%
app3 152s 87s 43%
app4 318s 137s 57%

Enviroment:2C4G , OpenJDK 1.8.0_172, springboot 2.7.0

This commit adds a ClassLoaderCache to LaunchedURLClassLoader  to improve startup performance. URLClassLoader shows awful performance when find Classes/Resources which do not exist.
@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Mar 1, 2022
尚之 added 2 commits March 1, 2022 19:23
This commit adds a ClassLoaderCache to LaunchedURLClassLoader  to improve startup performance. URLClassLoader shows awful performance when find Classes/Resources which do not exist.
This commit adds a ClassLoaderCache to LaunchedURLClassLoader  to improve startup performance. URLClassLoader shows awful performance when find Classes/Resources which do not exist.
@wilkinsona
Copy link
Member

This looks very promising. Thank you, @zjulbj. Have you done any comparisons of heap usage with and without the cache?

@wilkinsona wilkinsona added the status: waiting-for-feedback We need additional information before we can continue label Mar 2, 2022
@zjulbj
Copy link
Author

zjulbj commented Mar 4, 2022

This looks very promising. Thank you, @zjulbj. Have you done any comparisons of heap usage with and without the cache?

Hi wilkinsona. Thanks for your reply. Talking about heap usage , there are two ways to control it

  • All caches are instances of fixed size LRU cache, and cache size can be configured by loader.cache.size (default 3000)
protected <K, V> Map<K, V> createCache(int maxSize) {
    return Collections.synchronizedMap(new LinkedHashMap<K, V>(maxSize, 0.75f, true) {
	    @Override
	    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
		    return size() >= maxSize;
	    }
    });
}
  • All caches are cleared after the startup, which is invoked by org.springframework.boot.ClearCachesApplicationListener
/**
* Clear URL caches.
*/
public void clearCache() {
    this.loaderCache.clearCache();
    if (this.exploded) {
	    return;
    }
    for (URL url : getURLs()) {
	    try {
		    URLConnection connection = url.openConnection();
		    if (connection instanceof JarURLConnection) {
			    clearCache(connection);
		    }
	    }
	    catch (IOException ex) {
		    // Ignore
	    }
    }
}

Here is a comparison of memory usage, and I think it's acceptable
without cache

HEAP-MEMORY-USAGE                                  init : 268435456(256.0 MiB)
 [memory in bytes]                                  used : 1181133080(1.1 GiB)
                                                    committed : 1799880704(1.7 GiB)
                                                    max : 3817865216(3.6 GiB)
 NO-HEAP-MEMORY-USAGE                               init : 2555904(2.4 MiB)
 [memory in bytes]                                  used : 171428168(163.5 MiB)
                                                    committed : 176160768(168.0 MiB)
                                                    max : -1(-1 B)
 PENDING-FINALIZE-COUNT                             0

with cache

 HEAP-MEMORY-USAGE                                  init : 268435456(256.0 MiB)
 [memory in bytes]                                  used : 1036303928(988.3 MiB)
                                                    committed : 1913126912(1.8 GiB)
                                                    max : 3817865216(3.6 GiB)
 NO-HEAP-MEMORY-USAGE                               init : 2555904(2.4 MiB)
 [memory in bytes]                                  used : 171065096(163.1 MiB)
                                                    committed : 176160768(168.0 MiB)
                                                    max : -1(-1 B)
 PENDING-FINALIZE-COUNT                             0

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Mar 4, 2022
@mhalbritter
Copy link
Contributor

Thanks for the heap comparisons! I'm going to bring this up on our next team call.

@mhalbritter mhalbritter added for: team-attention An issue we'd like other members of the team to review for: team-meeting An issue we'd like to discuss as a team to make progress status: pending-design-work Needs design work before any code can be developed type: enhancement A general enhancement and removed for: team-attention An issue we'd like other members of the team to review labels Mar 7, 2022
@mhalbritter mhalbritter added this to the General Backlog milestone Mar 9, 2022
@mhalbritter mhalbritter removed status: waiting-for-triage An issue we've not yet triaged for: team-meeting An issue we'd like to discuss as a team to make progress labels Mar 9, 2022
@mhalbritter mhalbritter modified the milestones: General Backlog, 2.x Mar 9, 2022
@mhalbritter mhalbritter removed the status: feedback-provided Feedback has been provided label Mar 9, 2022
@mhalbritter
Copy link
Contributor

Hey, thanks for your contribution and your work. We want to run more benchmarks on it and want to explore other paths (like indexing the classpath etc.) and don't want to commit yet on the cache solution. I'll put it on hold until we have more information / ideas.

@zjulbj
Copy link
Author

zjulbj commented Mar 9, 2022

Hey, thanks for your contribution and your work. We want to run more benchmarks on it and want to explore other paths (like indexing the classpath etc.) and don't want to commit yet on the cache solution. I'll put it on hold until we have more information / ideas.

Many thanks for your reply. If I understand correctly,you want to explore other paths to solve this issue in a more widely view? That is also what I am going to do next. My rough idea is generating a index file mapping class/resource/packages to jar urls in the compile time, a little bit like INDEX.LIST,which I believe can imporve the time complexity to O(1). I not sure whether this is what 'indexing the classpath' meant with,and whether am I on a proper path

@mhalbritter
Copy link
Contributor

Yes, that was meant with "indexing the classpath" :)

@HzjNeverStop
Copy link

@mhalbritter Is there any progress in "indexing the classpath"?will it be realize in 2.7.x or only support in 3.x?

@wilkinsona wilkinsona modified the milestones: 2.x, 3.x Aug 1, 2022
@wilkinsona
Copy link
Member

This will be 3.0.x at the earliest, but more likely later in the 3.x cycle. I have updated the milestone accordingly.

@wuyupengwoaini
Copy link

@wilkinsona @mhalbritter
Is there any progress in "indexing the classpath"?I really want to know how this is going, because it seems to really help us shorten the startup time of the application. Because some of our big applications have thousands of dependencies

@wilkinsona
Copy link
Member

Not at the moment, unfortunately. We have other, higher-priority work at the moment. Perhaps you could build the changes in this PR and share some before and after numbers for your apps?

@wilkinsona
Copy link
Member

wilkinsona commented Jan 31, 2024

Those interested in this PR, it would be really useful to know if you've tried running your app using class data sharing (CDS), particularly with Java 21. We've seen some very promising results with Spring Boot apps and would like to know if you have a similar experience. You can learn more in this blog post from @sdeleuze that includes some information about a script that unpacks a Spring Boot executable jar into a format that's as CDS-friendly as possible.

@mmoayyed
Copy link
Contributor

I wrote about my experience with CAS here.

I did not try the script that does the unpacking, but in a nutshell, what I saw (with a WAR file) was:

jar -xf build/libs/cas.war
cd build/libs
java org.springframework.boot.loader.launch.JarLauncher
...
...
...
<Started CasWebApplication in 7.693 seconds (process running for 9.285)>

Then:

jar -xf build/libs/cas.war
cd build/libs
java -cp "WEB-INF/classes:WEB-INF/lib/*" org.apereo.cas.web.CasWebApplication
...
...
...
<Started CasWebApplication in 6.873 seconds (process running for 7.469)>

and then with CDS:

java -XX:+AutoCreateSharedArchive -XX:SharedArchiveFile=cas.jsa \
  org.springframework.boot.loader.launch.JarLauncher
...
...
...
<Started CasWebApplication in 6.766 seconds (process running for 7.582)>

Not that much difference, but I can see what I might have missed with the script.

@sdeleuze
Copy link
Contributor

sdeleuze commented Jan 31, 2024

Please check the guidelines printed when using the unpack-executable-jar.sh script.

It is expected they you get no significant gain when using the JarLauncher.

To start the application from that directory:

$ java -jar run-app.jar

TIP: To improve startup performance, you can create a CDS Archive with a single training run:

$ java -XX:ArchiveClassesAtExit=application.jsa -Dspring.context.exit=onRefresh -jar run-app.jar

Then use the generated cache:

$ java -XX:SharedArchiveFile=application.jsa -jar run-app.jar

See https://docs.spring.io/spring-framework/reference/integration/class-data-sharing.html for more details.

@mmoayyed
Copy link
Contributor

mmoayyed commented Jan 31, 2024

Thanks! I made small modifications to the script so it can run against a WAR file (replaced BOOT-INF with WEB-INF). With that in mind, I am not sure if this is still decent feedback and a good test, but here we go:

unpack-executable-jar.sh -d tmp build/libs/cas.war

Then:

$ java -jar tmp/run-app.jar
.
.
- <Started CasWebApplication in 6.637 seconds (process running for 8.178)>

Then:

java -XX:ArchiveClassesAtExit=application.jsa -Dspring.context.exit=onRefresh -jar tmp/run-app.jar
java -XX:SharedArchiveFile=application.jsa -jar tmp/run-app.jar
.
.
- <Started CasWebApplication in 6.284 seconds (process running for 6.693)>

@sdeleuze
Copy link
Contributor

sdeleuze commented Feb 7, 2024

I had a deeper look, this CAS application seems to perform a lot of processing at startup. To put things in perspective, on my powerful MacBook Pro M2, the CAS application takes 5.285 s to start (after disabling key generation at startup) while Petclinic JDBC takes 1.448 s to start. CDS is improving the loading of the classes, it can't optimize custom application processing at startup.

If we analyze the data points more in details (unpacked variant is done with the unpack-executable-jar.sh script which replicate #38276 behavior):

JARexe: Started CasWebApplication in 7.854 seconds (process running for 8.709)
Unpacked: Started CasWebApplication in 4.778 seconds (process running for 5.285)
Unpacked with CDS:Started CasWebApplication in 4.36 seconds (process running for 4.712)

JARexe: Started PetClinicApplication in 1.586 seconds (process running for 1.842)
Unpacked: Started PetClinicApplication in 1.299 seconds (process running for 1.448)
Unpacked with CDS: Started PetClinicApplication in 0.859 seconds (process running for 0.978)
Unpacked with CDS + AOT: Started PetClinicApplication in 0.678 seconds (process running for 0.81)
Project Leyden + AOT: even more

So CDS seems to optimize pretty consistently the loading of the classes on the 2 samples (0,573 s reduction versus 0,47 s). This half of a second on a powerful machine turns to be multiple seconds when deployed on a cheap Cloud server.

I think the layout used by the script and proposed by #38276 is immune even without CDS to the performance issue discussed in this PR, since it does not use LaunchedURLClassloader. So using both system classloader + CDS provide a significant performance boost.

AOT could be used to optimize even more the Spring Boot configuration processing of this application, but I was not able to make it work with CAS due to some double bean registration (feel free to raise a bug on https://github.com/spring-projects/spring-framework if you think there is something to fix on Spring Framework side).

Cache Data Store (Project Leyden premain) will be able to optimize with a wider scope than just class loading (optionally combined with AOT). It will for example include JVM AOT cache and maybe some classpath check caches. For the rest, that will be to the application itself to make careful use of the processing steps happening before startup.

@wilkinsona
Copy link
Member

Thanks very much, @sdeleuze.

@mmoayyed
Copy link
Contributor

mmoayyed commented Feb 9, 2024

Thank you very much @sdeleuze.

(At the moment AOT processing of the CAS application is blocked due to a bug in Spring Data. I will revisit this once the fix is out, but even at the time I was able to build native images, startup time of the native image was still around 2~3 seconds)

I hope this isn't hijacking the thread here, but does the team have any recommendations on how to analyze the startup data and performance beyond the likes of VisualVM, etc? One suspicion I have (at least in the case of this CAS app) is that given it's a Spring Cloud app and uses @RefreshScope, I wonder if all those proxied beans are slowing down the startup. I don't suppose PetClinicApplication uses Spring Cloud, or put differently, have you also done CDS analysis of Spring Cloud apps? Is that not in scope of this exercise?

@sdeleuze
Copy link
Contributor

sdeleuze commented Feb 9, 2024

@mmoayyed You can probably use FlightRecorderApplicationStartup. @RefreshScope can indeed have an impact but the 3 seconds in native tends to show this is more application specific. Now let's maybe indeed focus again on the original purpose of that PR ;-)

@mmoayyed
Copy link
Contributor

mmoayyed commented Feb 9, 2024

Sure thing, thank you very much!

@qixiaobo

This comment was marked as off-topic.

@qsLI
Copy link

qsLI commented Oct 18, 2024

Thanks for your pull request. With this cache, our app's start time has decreased by 50%.

@qsLI
Copy link

qsLI commented Oct 18, 2024

The AppCDS technology relies on generating an archive file every time the JVM exits.
How is this issue addressed during the initial startup? Moreover, applications are iterative, with different code each time, necessitating the regeneration of the archive file.

AppCDS for custom classloader has a more complex loading process: the JVM uniquely identifies a class using a (classloader, className) tuple.

  • For the BootClassloader and AppClassloader, they are unique in every run, thus maintaining a consistent identity across multiple runs.

  • For custom classloaders, apart from their type, there is no obvious unique identifier. Consequently, AppCDS cannot locate the required InstanceKlass entry in the shared archive during the class loading phase using just the classloader object and type.

To conclude, CDS is not the key to solve the startup time problem

@sdeleuze
Copy link
Contributor

sdeleuze commented Oct 18, 2024

@qsLI CDS indeed does not apply to custom classloaders, but I am wondering if you take the perspective of running your Spring Boot application directly from the executable JAR on production.

The AppCDS technology relies on generating an archive file every time the JVM exits. How is this issue addressed during the initial startup?

It is kind of addressed via a more integrated feature that automatically perform the training run without starting fully your application, see https://docs.spring.io/spring-boot/how-to/class-data-sharing.html for more details. There are still side effects like early database interaction but that could evolve and we have documented how to avoid that easily.

With this cache, our app's start time has decreased by 50%.

Could you please confirm this is compared to the startup time of an executable JAR. Can you measure against an application extracted with what is described in https://docs.spring.io/spring-boot/reference/packaging/efficient.html?

@qsLI
Copy link

qsLI commented Oct 23, 2024

@sdeleuze
First of all, thank you for your detailed response.

Due to historical reasons, we have been using the Spring Boot executable JAR mode for deployment, so most class loaders are LaunchedURLClassLoader, which makes it impossible to use the CDS feature. Switching to the extracted JAR mode should resolve this issue.

Additionally, our project is filled with various lazy implementations, which may require actual traffic to trigger the corresponding class loading. Generating the CDS file during the compile phase might not fully resolve the issues we are encountering.

The 50% performance improvement data is indeed based on the executable JAR. We will try the extracted mode when we have the time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: pending-design-work Needs design work before any code can be developed type: enhancement A general enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants