One of the most effective and important parts of the Java Virtual Machine is the Just-In-Time (JIT) compiler. However, many applications are not written to take full advantage of its high-performance capabilities, and many developers are not even aware of how effectively their applications are using the JIT.
In this article, we'll provide a boot camp guide to some simple techniques that you can use to identify issues that your application may have that make it unfriendly to JIT. We won't cover full details about how JIT compilation works, just some basic checks and ways to improve your code that can provide an easy way to get started with making your application JIT-friendly.
The key point about JIT compilation is that Hotspot automatically monitors which methods are being executed by the interpreter. Once a method has been called often enough it is marked for compilation into machine code. These "hot methods" are compiled by a JVM thread in the background. Until this compilation finishes, the JVM keeps running - using the original interpreted version of the method. Only once the method is fully compiled does Hotspot patch the method dispatch table to point to the new form.
Hotspot has a large number of different optimization techniques for JIT compilation - but one of the most important for our purposes is inlining. This is the process of removing method calls (including virtual calls) by effectively copying the body of the called method into the caller's scope. For example, consider this piece of code:
public int add(int x, int y) {
    return x + y;
}

int result = add(a, b);
When inlining occurs, this code is effectively transformed into:
int result = a + b;
The values a and b have been substituted for the method parameters, and the code comprising the body of the method has been copied into the caller's scope. There are a number of benefits of inlining for the programmer. For example:
- No performance penalty for good coding practices
- Reduced pointer indirections
- Elimination of virtual method lookups
Furthermore, by bringing more code into view, inlining opens the possibility of further optimisations and more inlining, as the JIT compiler now has more code to work with.
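To make the first benefit concrete, here is a minimal sketch of a small, well-factored class whose accessors are natural inlining candidates. The class and method names are invented for this illustration, and whether Hotspot actually inlines the calls depends on the JVM version and its flags.

```java
// Hypothetical example: all names here are illustrative only.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Tiny accessors like these compile to only a few bytes of bytecode,
    // so they fall well under the default inlining threshold.
    int getX() { return x; }
    int getY() { return y; }
}

final class InlineDemo {
    // Sums coordinates through the accessors. Once this loop is hot, the JIT
    // is free to replace each getX()/getY() call with a direct field read.
    static long sumCoordinates(Point[] points) {
        long total = 0;
        for (Point p : points) {
            total += p.getX() + p.getY();
        }
        return total;
    }

    public static void main(String[] args) {
        Point[] points = new Point[1_000];
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point(i, 2 * i);
        }
        long result = 0;
        // Repeat the call so the method is invoked often enough to become "hot".
        for (int round = 0; round < 20_000; round++) {
            result = sumCoordinates(points);
        }
        System.out.println(result);
    }
}
```

The encapsulated version costs nothing extra once the accessors are inlined, which is why small methods are not something to avoid for performance reasons.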
Inlining is dependent on the size of the method. By default, methods comprising 35 or fewer bytes of bytecode are eligible. For "hot" methods that are being called often, this threshold rises to 325 bytes. The main threshold can be controlled with the switch -XX:MaxInlineSize=# (and the hot threshold with -XX:FreqInlineSize=#), but these should not be changed without doing proper analysis, as blindly changing them can have an unexpected impact on the performance of applications.
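Before considering any change to these switches, it can help to confirm what values a given JVM is actually using. The following is a minimal sketch, assuming a Hotspot-based JVM that exposes the com.sun.management.HotSpotDiagnosticMXBean interface; the class name InlineThresholds is invented for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class InlineThresholds {
    // Returns the current value of a JVM flag as a string, or null if this
    // JVM does not expose a flag with that name (assumption: Hotspot JVM).
    static String readFlag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return null;
            }
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the flag (or the MXBean) is not available.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("MaxInlineSize  = " + readFlag("MaxInlineSize"));
        System.out.println("FreqInlineSize = " + readFlag("FreqInlineSize"));
    }
}
```

The values printed may differ from the defaults quoted above on some platforms or JVM versions, which is one more reason to measure before tuning.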
As inlining can have such a huge effect on the performance of code, it's important that as many methods as possible are eligible for inlining. Let's use a tool called Jarscan to check up on how many methods are inline-friendly.
The Jarscan tool is part of the JITWatch suite of open-source tools for analysing JIT compilation. Unlike the main tool, which analyses JIT logs generated from a running application, Jarscan is a static analysis tool that can work with jar files. It produces a CSV report of which methods are over the "hot" threshold. JITWatch and Jarscan are tools written as part of the AdoptOpenJDK project, under project lead Chris Newland.
To use the tool and generate the large methods report, download a binary from the AdoptOpenJDK Jenkins site (Java 7 binary or Java 8 binary).
Then it can be run simply by:
./jarScan.sh <jars to analyse>
More details about Jarscan can be found on the AdoptOpenJDK wiki.
The large methods report can be very useful: a development team can use it to check that their application does not have critical path methods that are too large to inline. This is still a manual process, however. In order to further automate it, we can use the -XX:+PrintCompilation switch. This will generate log lines like this:
37 1 java.lang.String::hashCode (67 bytes)
124 2 s! java.lang.ClassLoader::loadClass (58 bytes)
The first column shows the time in milliseconds since the process started when JIT compilation took place. The next column is the compile id, which indicates which method is being compiled (a method can be deoptimized and recompiled by Hotspot several times). Next, the output displays additional information in the form of flags (such as s for synchronized and ! for "has exception handlers"). The final two columns show the name and size (in number of bytes of bytecode) of the method being compiled.
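The column layout just described can be turned into a small parser. This is a sketch that assumes exactly the layout of the two sample lines above (timestamp, compile id, optional flags, class::method, size); other JVM versions print additional columns, so the pattern would need adjusting. The class name CompilationLine is invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompilationLine {
    // Assumed layout: <millis> <compile id> [flags] <class>::<method> (<n> bytes)
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(\\d+)\\s+(\\d+)\\s+(?:([a-z%!*]+)\\s+)?(\\S+)::(\\S+)\\s+\\((\\d+) bytes\\)\\s*$");

    final long timestampMillis;
    final int compileId;
    final String flags;       // empty when no flags were printed
    final String className;
    final String methodName;
    final int bytecodeSize;

    private CompilationLine(long timestampMillis, int compileId, String flags,
                            String className, String methodName, int bytecodeSize) {
        this.timestampMillis = timestampMillis;
        this.compileId = compileId;
        this.flags = flags;
        this.className = className;
        this.methodName = methodName;
        this.bytecodeSize = bytecodeSize;
    }

    // Returns the parsed line, or null for anything that does not match,
    // such as ordinary application output.
    static CompilationLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String flags = m.group(3) == null ? "" : m.group(3);
        return new CompilationLine(
            Long.parseLong(m.group(1)), Integer.parseInt(m.group(2)), flags,
            m.group(4), m.group(5), Integer.parseInt(m.group(6)));
    }

    public static void main(String[] args) {
        CompilationLine c = parse("124 2 s! java.lang.ClassLoader::loadClass (58 bytes)");
        System.out.println(c.className + "::" + c.methodName + " flags=" + c.flags);
    }
}
```

Returning null for non-matching lines makes the same method usable as a crude filter when compilation output is interleaved with other text.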
For further details on PrintCompilation output, Stephen Colebourne has written a blog post that goes into much more detail about the exact meaning of the various fields that appear in the output.
The PrintCompilation output provides useful dynamic information about the methods that are actually compiled on a given run, which can be combined with the static analysis Jarscan provides, for a clear picture of what is and isn't being compiled. PrintCompilation can be left on in production, as it doesn't meaningfully impact the performance of the JIT compiler.
However, there are two minor annoyances with PrintCompilation that make it less useful than it might be:
- The signatures for methods are not printed out in the output, making it hard to distinguish overloaded methods
- Hotspot does not currently provide a way to redirect the output for PrintCompilation to a separate file. As it stands, output from PrintCompilation can only be sent to stdout.
The second problem has the effect that compilation output ends up mixed in with regular application output. For most server applications this then necessitates a filtering step to separate the compilation output into its own log.
The simplest way to determine whether methods are JIT-friendly is to follow this process:
- Identify application methods that are on the critical path for transactions
- Check that these methods do not appear in the Jarscan output
- Check that the methods do appear in the PrintCompilation output
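The three checks above can be sketched as a small classification routine. This example assumes the method names from both tools have already been normalised to a common "Class::method" key (PrintCompilation prints no signatures, while Jarscan does), and every name used here is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class JitFriendlinessCheck {
    enum Verdict { OK, TOO_LARGE_TO_INLINE, NEVER_COMPILED }

    // criticalMethods:     methods identified as being on the critical path
    // jarscanLargeMethods: methods listed in the Jarscan large methods report
    // compiledMethods:     methods seen in the PrintCompilation output
    static Map<String, Verdict> check(Set<String> criticalMethods,
                                      Set<String> jarscanLargeMethods,
                                      Set<String> compiledMethods) {
        Map<String, Verdict> result = new LinkedHashMap<>();
        for (String method : criticalMethods) {
            if (jarscanLargeMethods.contains(method)) {
                result.put(method, Verdict.TOO_LARGE_TO_INLINE);
            } else if (!compiledMethods.contains(method)) {
                result.put(method, Verdict.NEVER_COMPILED);
            } else {
                result.put(method, Verdict.OK);
            }
        }
        return result;
    }
}
```

Any method that does not come back as OK is a candidate for closer inspection, not automatically a problem; a method may be missing from the compilation log simply because the run was too short.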
If methods are above the inlining threshold then in most cases the standard approach is to split the important method into smaller pieces that will inline. This will usually yield better performance, but as with all performance optimizations, the original system should be measured, and compared with the modified version; performance driven changes should never be applied blindly.
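As an illustration of splitting a method, the sketch below keeps the frequently executed path small and moves the rarely executed error handling into its own method. The class, the method names and the discount rule are all invented for this example; the real bytecode sizes should be checked with Jarscan, and any gain confirmed by measurement.

```java
final class OrderPricing {
    // Hot path: kept deliberately small so it has a better chance of
    // staying under the inlining threshold.
    static long priceInCents(int quantity, long unitPriceCents) {
        if (quantity <= 0 || unitPriceCents < 0) {
            return rejectInvalidOrder(quantity, unitPriceCents); // rare case
        }
        return applyBulkDiscount(quantity * unitPriceCents, quantity);
    }

    // Small helper: 10% off for orders of 100 items or more (illustrative rule).
    static long applyBulkDiscount(long totalCents, int quantity) {
        return quantity >= 100 ? totalCents - totalCents / 10 : totalCents;
    }

    // Cold path: building the error message involves string concatenation,
    // which adds bytecode that the hot method no longer has to carry.
    static long rejectInvalidOrder(int quantity, long unitPriceCents) {
        throw new IllegalArgumentException(
            "Invalid order: quantity=" + quantity + ", unitPriceCents=" + unitPriceCents);
    }

    public static void main(String[] args) {
        System.out.println(priceInCents(2, 500));
    }
}
```

The behaviour is unchanged from a single large method; only the distribution of bytecode between the hot and cold paths differs.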
Almost all Java applications rely upon a stack of libraries to provide key functionality. Jarscan can also help developers by reporting on which library or framework methods are above the hot inlining threshold. As an extreme example, let's examine rt.jar - the main runtime library of the JVM itself.
To make it more interesting, let's compare Java 7 and Java 8 side-by-side and see how things have changed. Let's suppose we have both a Java 7 and a Java 8 JDK installed. First of all, let's run Jarscan over their respective rt.jar files and generate reports, which we save for further analysis:
$ ./jarScan.sh /Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home/jre/lib/rt.jar > large_jre_methods_7u71.txt
$ ./jarScan.sh /Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home/jre/lib/rt.jar > large_jre_methods_8u25.txt
Now we have 2 CSV files, one for 7u71 and one for 8u25. Let's do some comparisons and see how inlining behaviour has changed between the versions. First off, a very simple metric - how many methods are inline-unfriendly in each JRE?
$ wc -l large_jre_methods_*
3684 large_jre_methods_7u71.txt
3576 large_jre_methods_8u25.txt
We can see that over 100 fewer methods are inline-unfriendly in Java 8, as compared to Java 7. Let's dig a bit deeper, and look at the stability (or otherwise) of some key packages. To understand how to do this, let's recall the format of the report that Jarscan outputs. The report consists of 3 fields:
"<package>","<method name and signature>",<num of bytes>
From this, we can use simple Unix text processing tools to investigate the reports. For example, let's see what changes to inline-friendly methods occurred in java.lang between Java 7 and Java 8:
$ cat large_jre_methods_7u71.txt large_jre_methods_8u25.txt | grep -i ^\"java.lang | sort | uniq -c
This uses the grep command to return just lines that start with "java.lang from either report, basically restricting the results to any inline-unfriendly methods present in any class in the java.lang package. The "sort | uniq -c" clause is an old Unix trick - it basically means: Sort the lines (so any identical lines will be next to each other) and then de-duplicate them, but maintain a count (inserted at start of line) of how many copies of each line were seen. Let's look at the actual output as it applies to our Jarscan results:
$ cat large_jre_methods_7u71.txt large_jre_methods_8u25.txt | grep -i ^\"java.lang | sort | uniq -c
2 "java.lang.CharacterData00","int getNumericValue(int)",835
2 "java.lang.CharacterData00","int toLowerCase(int)",1339
2 "java.lang.CharacterData00","int toUpperCase(int)",1307
// ... skipped output
2 "java.lang.invoke.DirectMethodHandle","private static java.lang.invoke.LambdaForm makePreparedLambdaForm(java.lang.invoke.MethodType,int)",613
1 "java.lang.invoke.InnerClassLambdaMetafactory","private java.lang.Class spinInnerClass()",497
// ... more output
The entries that start with 2 (remember that this is the count of identical lines as reported by "uniq -c") indicate that between Java 7 and Java 8, these methods did not change at all in bytecode size. This is, of course, not as strong a guarantee as knowing that the bytecodes are exactly the same, but is still a decent indicator of stability. The methods where we have a row count of 1 indicate that either:
a) The bytecode has definitely changed (if two rows with the same signature and row count of 1 appear) or
b) These are new methods.
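For readers who prefer to script this comparison, here is a rough Java equivalent of the grep, sort and uniq -c pipeline. It is a sketch under two assumptions: each report has already been read into a list of lines, and the filter is a literal, case-insensitive prefix instead of a grep regular expression. The class name ReportDiff is invented for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ReportDiff {
    // Counts how often each matching line occurs across both reports.
    // A TreeMap keeps the keys sorted, mirroring "sort | uniq -c".
    static Map<String, Integer> countLines(List<String> first, List<String> second,
                                           String prefix) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> report : Arrays.asList(first, second)) {
            for (String line : report) {
                // Case-insensitive prefix test, standing in for "grep -i ^prefix".
                if (line.regionMatches(true, 0, prefix, 0, prefix.length())) {
                    Integer previous = counts.get(line);
                    counts.put(line, previous == null ? 1 : previous + 1);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> java7 = Arrays.asList("\"java.lang.A\",\"int f()\",400");
        List<String> java8 = Arrays.asList("\"java.lang.A\",\"int f()\",400",
                                           "\"java.lang.C\",\"int h()\",350");
        for (Map.Entry<String, Integer> e : countLines(java7, java8, "\"java.lang").entrySet()) {
            System.out.println(e.getValue() + " " + e.getKey());
        }
    }
}
```

A count of 2 corresponds to a method whose reported size is identical in both reports, and a count of 1 to a row that appears in only one of them.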
Let's see what we have for our reports:
1 "java.lang.invoke.AbstractValidatingLambdaMetafactory","void validateMetafactoryArgs()",864
1 "java.lang.invoke.InnerClassLambdaMetafactory","private java.lang.Class spinInnerClass()",497
1 "java.lang.reflect.Executable","java.lang.String sharedToGenericString(int,boolean)",329
These 3 inline-unfriendly methods all come from Java 8, so they are new methods. The first two are related to the implementation of lambda expressions, and the third comes from an adjustment to the inheritance hierarchy in the reflection subsystem. In this case, the change is the introduction of a common base class (Executable) that both Method and Constructor inherit from in Java 8.
Finally, let's look at a surprising feature of the core JDK libraries:
$ grep -i ^\"java.lang.String large_jre_methods_8u25.txt
"java.lang.String","public java.lang.String[] split(java.lang.String,int)",326
"java.lang.String","public java.lang.String toLowerCase(java.util.Locale)",431
"java.lang.String","public java.lang.String toUpperCase(java.util.Locale)",439
This tells us that even in Java 8 several key methods of java.lang.String remain inline-unfriendly. In particular, it seems bizarre that toLowerCase() and toUpperCase() are both too large for inlining. However, these methods must deal with general Unicode data rather than just ASCII, which increases their size and complexity and unfortunately tips them over the threshold for being inline-friendly.
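The same lookup can be done programmatically. The sketch below parses rows in the three-field report format described earlier and keeps those belonging to one class. It assumes the first field holds a fully qualified class name, as in the sample rows, and matches it exactly (the grep above matches any row that merely starts with the given text). The class name JarscanReport is invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JarscanReport {
    // One row: "<class>","<method name and signature>",<num of bytes>
    // Signatures contain commas, so each quoted field is matched as a whole
    // instead of splitting the row on commas.
    private static final Pattern ROW =
        Pattern.compile("^\"([^\"]*)\",\"([^\"]*)\",(\\d+)$");

    static final class Entry {
        final String owner;
        final String method;
        final int bytecodeSize;

        Entry(String owner, String method, int bytecodeSize) {
            this.owner = owner;
            this.method = method;
            this.bytecodeSize = bytecodeSize;
        }
    }

    // Returns the parsed row, or null if the text is not a report row.
    static Entry parse(String row) {
        Matcher m = ROW.matcher(row.trim());
        if (!m.matches()) {
            return null;
        }
        return new Entry(m.group(1), m.group(2), Integer.parseInt(m.group(3)));
    }

    // Keeps only the rows whose first field equals the given class name.
    static List<Entry> forOwner(List<String> rows, String owner) {
        List<Entry> result = new ArrayList<>();
        for (String row : rows) {
            Entry entry = parse(row);
            if (entry != null && entry.owner.equals(owner)) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

Calling forOwner with the lines of a saved report and the name "java.lang.String" would return one Entry per large method of that class.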
For very high-performance applications that know they are restricted to ASCII data, it is quite common to implement a proprietary StringUtils class, that contains static methods that do a similar job to the inline-unfriendly methods mentioned, but which retain compactness, and the ability to be inlined.
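A minimal sketch of such a helper is shown below. The class and method names are invented for this example, and the code assumes the caller guarantees ASCII-only input; any non-ASCII characters are simply left unchanged. Whether the result is small enough to inline should be confirmed with Jarscan, not assumed.

```java
final class AsciiStringUtils {
    // Lower-cases the ASCII letters A-Z and leaves every other character as-is.
    // The scan loop is kept separate from the copying code so that the common
    // "already lower case" path stays short.
    static String toLowerCaseAscii(String s) {
        int length = s.length();
        for (int i = 0; i < length; i++) {
            char c = s.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                return lowerFrom(s, i);
            }
        }
        return s; // nothing to change, so no new String is allocated
    }

    // Copies the string and converts upper-case ASCII letters from the first
    // position known to need changing.
    private static String lowerFrom(String s, int firstUpper) {
        char[] chars = s.toCharArray();
        for (int i = firstUpper; i < chars.length; i++) {
            char c = chars[i];
            if (c >= 'A' && c <= 'Z') {
                chars[i] = (char) (c + ('a' - 'A'));
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(toLowerCaseAscii("Content-Type"));
    }
}
```

Unlike String.toLowerCase(Locale), this ignores locale rules entirely, which is exactly why it is only safe for data known to be ASCII.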
The improvements that we have discussed have mostly focused on static analysis. We can go further than this by using the full power of the main JITWatch tool. This requires compilation logs generated with the -XX:+LogCompilation flag. These logs are XML-based (rather than the simple text output of PrintCompilation) and they are very large, often reaching several hundred MB in size. They also affect the running application (through the cost of writing out the log, if nothing else), and so the switch is not usually suitable for use in a production environment.
The combination of PrintCompilation and Jarscan is not as sophisticated, but it can provide a useful first step for application teams who are just starting to understand the JIT behaviour of their applications. In many cases, a quick analysis will also yield some low-hanging fruit, in terms of improved application performance.
About the Author
Ben Evans is the CEO of jClarity, a Java/JVM performance analysis startup. In his spare time he is one of the leaders of the London Java Community and holds a seat on the Java Community Process Executive Committee. His previous projects include performance testing the Google IPO, financial trading systems, writing award-winning websites for some of the biggest films of the 90s, and others.