
Build Great Native CLI Apps in Java with Graalvm and Picocli


Key Takeaways

  • Developers want to distribute their command line applications as a single native executable.
  • GraalVM can compile your Java applications to single native images, but it has some limitations.
  • Picocli is a modern library for writing CLI apps on the JVM that can help tackle GraalVM’s limitations, including on Windows.
  • Setting up the GraalVM toolchain for creating native images on Windows is not well documented.

The Dream: Java Executables

The Go programming language has become popular for writing command line applications. There may be many reasons for this, but one aspect where Go shines is the ability to compile a program to a single native executable file. This makes the program much easier to distribute.

Java programs have traditionally been hard to distribute because they require a Java Virtual Machine to be installed on the target machine. It is possible to bundle a recent JVM with the application, but that adds roughly 200MB to the package size. Things are moving in the right direction: the Java Module System (JPMS), introduced in Java 9, includes the jlink utility, which allows an application to create a custom, minimized JRE that can be as small as 30-40MB, and Java 14 will include the jpackage utility, which can create an installer that includes this minimal JRE with your application.

Still, for command line applications an installer is not ideal. Ideally, we want to distribute our CLI utility as a "real" native executable without a packaged runtime. GraalVM allows us to do this with programs written in Java.

GraalVM

GraalVM is a universal virtual machine that can run applications written in JavaScript, Python, Ruby, R, JVM-based languages like Java, Scala, Clojure, Kotlin, and LLVM-based languages such as C and C++. One interesting aspect is that GraalVM allows you to mix programming languages: programs partially written in JavaScript, R, Python, or Ruby can be called from Java and can share data with each other. Another feature is the ability to create a native image, and this is what we will explore in this article.

GraalVM Native Images

GraalVM Native Image allows you to ahead-of-time compile Java code to a standalone executable, called a native image. This executable includes the application, the libraries, and the JDK. It does not run on the Java VM, but includes necessary components like memory management and thread scheduling from a different virtual machine, called "Substrate VM". Substrate VM is the name for the runtime components (like the deoptimizer, garbage collector, thread scheduling etc.). The resulting program has a faster startup time and lower runtime memory overhead compared to a Java VM.

Native Image Limitations

To keep the implementation small and concise, and also to allow aggressive ahead-of-time optimizations, Native Image does not support all features of Java. The full set of limitations is documented on the GitHub project.

Two limitations are of particular interest: reflection and resources. To create a self-contained binary, the native image compiler needs to know up-front all the classes of your application, their dependencies, and the resources they use, so reflection and resource bundles often require configuration. We will see an example of this later on.

Picocli

Picocli is a modern library and framework for building command line applications on the JVM. It supports Java, Groovy, Kotlin and Scala. It is less than 3 years old but has become quite popular with over 500,000 downloads per month. The Groovy language uses picocli to implement its CliBuilder DSL.

Picocli aims to be "the easiest way to create rich command line applications that can run on and off the JVM". It offers colored output, TAB autocompletion, subcommands, and some unique features compared to other JVM CLI libraries such as negatable options, repeating composite argument groups, repeating subcommands and sophisticated handling of quoted arguments. Its source code is in a single file so it can optionally be included as source to avoid adding a dependency. Picocli prides itself on its extensive and meticulous documentation.

Picocli uses reflection, so it is vulnerable to GraalVM’s Java native image limitations, but it offers an annotation processor that generates the configuration files that address this limitation at compile time.

A Concrete Use Case

Let’s take a concrete example of a command line utility that we will write in Java and compile to a single native executable. Along the way we will look at some features of the picocli library that help make our utility easy to use.

We will build a checksum CLI utility that takes a named option, -a or --algorithm, and a positional parameter: the file whose checksum to compute.

We want our users to be able to use our Java checksum utility just like they use applications written in C++ or other languages. Something like this:

$ echo hi > hi.txt

$ checksum -a md5 hi.txt
764efa883dda1e11db47671c4a3bbd9e

$ checksum -a sha1 hi.txt
55ca6286e3e4f4fba5d0448333fa99fc5a404a73

This is the minimum we expect from a command line application, but we are not going to be satisfied with a lowest common denominator app. We want to create a great CLI application that delights our users. What does that mean and how do we do it?

Great CLI Apps are Helpful

We made a trade-off: by choosing a command line interface (CLI) instead of a graphical user interface (GUI), we made our application harder for new users to learn. We can partially make up for that by providing good online help.

Our application should show a usage help message when the user requests help with the -h or --help option, or when invalid user input is specified. It should also show version information when requested with -V or --version. We will see how picocli makes it easy to do this.

User Experience

We can make our application more user-friendly by using colors on supported platforms. This doesn’t just look good, it also reduces the cognitive load on the user: the contrast makes the important information like commands, options, and parameters stand out from the surrounding text.

The usage help message generated by a picocli-based application uses colors by default, and our checksum example is no exception.

In general, applications should only output colors when they are used interactively; when executing a script we don’t want the log file cluttered with ANSI escape codes. Fortunately, picocli takes care of this automatically. This brings us to the next topic: good CLI apps are designed to be combined with other commands.
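To make the idea concrete, here is a minimal sketch of the "colors only when interactive" rule in plain Java. It is not picocli's implementation, whose heuristics are more elaborate. The class name AnsiSketch and its methods are hypothetical, and the check assumes that System.console() returns null when there is no interactive console, which is typically the case on the Java 8 and 11 runtimes used in this article but may differ on newer JDKs.

```java
// Illustrative sketch only; picocli's real detection logic is more elaborate.
class AnsiSketch {
    private static final String BOLD_YELLOW = "\u001B[1;33m";
    private static final String RESET = "\u001B[0m";

    /** Wraps the text in ANSI escape codes when enabled; returns it unchanged otherwise. */
    static String highlight(String text, boolean ansiEnabled) {
        return ansiEnabled ? BOLD_YELLOW + text + RESET : text;
    }

    /**
     * Simplified heuristic (an assumption of this sketch): treat the program
     * as interactive only when the JVM reports an attached console.
     */
    static boolean looksInteractive() {
        return System.console() != null;
    }

    public static void main(String[] args) {
        // Colored on an interactive terminal, plain text when piped or redirected.
        System.out.println(highlight("--algorithm", looksInteractive()));
    }
}
```

Keeping the decision (looksInteractive) separate from the rendering (highlight) means the rendering can be tested without a terminal.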

Great CLI Apps Work Well with Others

Stdout vs stderr

Many CLI utilities use the standard I/O streams so they can be combined with other utilities. The devil is often in the details. When the user requests help, the application should print the usage help message to standard output. This allows users to pipe the output to another tool like grep or less.

On the other hand, on invalid input, the error message and usage help message should be printed to the standard error stream: in case the output of our program is used as input for another program, we don’t want our error message to disrupt things.
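This convention can be sketched in a few lines of plain Java. The StreamsSketch class, its usage text, and its error message are made up for illustration. Picocli applies this routing for us, so the sketch only shows the behavior, not how the library implements it.

```java
import java.io.PrintStream;

// Illustrative sketch: requested help goes to stdout, error output goes to stderr.
class StreamsSketch {
    // Hypothetical usage text for this sketch.
    static final String USAGE = "Usage: checksum [-hV] [-a=<algorithm>] <file>";

    static void run(String[] args, PrintStream out, PrintStream err) {
        if (args.length == 1 && (args[0].equals("-h") || args[0].equals("--help"))) {
            // The user asked for help: this is normal output and may be piped to grep or less.
            out.println(USAGE);
        } else if (args.length == 0) {
            // Invalid input: keep the error and the usage help out of the data stream.
            err.println("Missing required parameter: <file>");
            err.println(USAGE);
        } else {
            out.println("(the checksum of " + args[args.length - 1] + " would be printed here)");
        }
    }

    public static void main(String[] args) {
        run(args, System.out, System.err);
    }
}
```

Because the streams are passed in as parameters, the routing is easy to verify in a unit test.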

Exit Code

When your program ends, it returns an exit status code. An exit code of zero is often used to indicate success, and a non-zero exit code often indicates a failure of some kind.

This allows users to chain together a number of commands using &&, knowing that if any command in the sequence fails, the whole sequence will stop.

By default, picocli returns 2 for invalid user input, 1 when an exception occurred in the business logic of the application, and zero otherwise (when everything went well). Of course it is easy to configure other exit codes in your application, but for our checksum example the default values are fine.

Note that the picocli library will not call System.exit; it just returns an integer and it is up to the application to call System.exit or not.
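The following plain-Java sketch mirrors the defaults described above: 2 for invalid input, 1 for a failure in the business logic, and 0 otherwise. Like the library, the execute method only returns an integer and leaves the decision to call System.exit to the caller. The ExitCodeSketch class and its InvalidInputException are hypothetical stand-ins, not picocli types.

```java
import java.util.concurrent.Callable;

// Illustrative sketch of the exit code convention; not picocli's implementation.
class ExitCodeSketch {
    static final int OK = 0, SOFTWARE = 1, USAGE = 2;

    /** Hypothetical stand-in for a parser's "invalid user input" exception. */
    static class InvalidInputException extends RuntimeException {
        InvalidInputException(String message) { super(message); }
    }

    /** Runs the task and maps the outcome to an exit code; never calls System.exit. */
    static int execute(Callable<Integer> task) {
        try {
            return task.call();
        } catch (InvalidInputException ex) {
            System.err.println(ex.getMessage());
            return USAGE;      // invalid user input
        } catch (Exception ex) {
            System.err.println(ex);
            return SOFTWARE;   // exception in the business logic
        }
    }

    public static void main(String[] args) {
        int exitCode = execute(() -> {
            if (args.length == 0) {
                throw new InvalidInputException("Missing required parameter: <file>");
            }
            return OK;
        });
        System.exit(exitCode); // the application, not the library, decides to exit
    }
}
```

Returning the code instead of exiting keeps the logic testable and lets an embedding application choose its own shutdown behavior.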

Compact Code

The above sections describe quite a bit of functionality. You would think that this would require a lot of code to accomplish, but most of the "standard CLI behavior" is provided by the picocli library. In our application, all we need to do is define our options and positional parameters, and implement the business logic by making our class Callable or Runnable. We can bootstrap the application in our main method in one line of code:

import picocli.CommandLine;
import picocli.CommandLine.Command;
import picocli.CommandLine.Option;
import picocli.CommandLine.Parameters;

import java.io.File;
import java.math.BigInteger;
import java.nio.file.Files;
import java.security.MessageDigest;
import java.util.concurrent.Callable;

@Command(name = "checksum", mixinStandardHelpOptions = true,
      version = "checksum 4.0",
  description = "Prints the checksum (MD5 by default) of a file to STDOUT.")
class CheckSum implements Callable<Integer> {

  @Parameters(index = "0", arity = "1",
        description = "The file whose checksum to calculate.")
  private File file;

  @Option(names = {"-a", "--algorithm"},
    description = "MD5, SHA-1, SHA-256, ...")
  private String algorithm = "MD5";

  // this example implements Callable, so parsing, error handling
  // and handling user requests for usage help or version help
  // can be done with one line of code.
  public static void main(String... args) {
    int exitCode = new CommandLine(new CheckSum()).execute(args);
    System.exit(exitCode);
  }

  @Override
  public Integer call() throws Exception { // the business logic...
    byte[] data = Files.readAllBytes(file.toPath());
    byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
    String format = "%0" + (digest.length*2) + "x%n";
    System.out.printf(format, new BigInteger(1, digest));
    return 0;
  }
}
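The business logic in call() can also be tried in isolation, without picocli on the classpath. The sketch below reuses the same MessageDigest and BigInteger calls; the DigestSketch class name and the choice to wrap the checked exception are assumptions made for this illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch of the digest-and-format step used in CheckSum.call().
class DigestSketch {
    /** Returns the digest of the data as zero-padded lowercase hex. */
    static String hex(String algorithm, byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
            // Two hex characters per byte; the %0Nx width preserves leading zeros
            // that BigInteger would otherwise drop.
            return String.format("%0" + (digest.length * 2) + "x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, ex);
        }
    }

    public static void main(String[] args) {
        // Same bytes as the hi.txt file created with "echo hi > hi.txt" on Linux.
        byte[] data = "hi\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(hex("MD5", data));
        System.out.println(hex("SHA-1", data));
    }
}
```

On Linux, the two printed values should match the checksums in the sample session shown earlier.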

We have an example of a realistic Java utility program. Next, let’s take a look at turning it into a native executable.

Native Image

Reflection Configuration

We mentioned earlier that the native image compiler has some limitations: reflection is supported but requires configuration.

This impacts picocli-based applications: at runtime, picocli uses reflection to discover any @Command-annotated subcommands, and the @Option and @Parameters-annotated command options and positional parameters.
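To illustrate the kind of reflective access involved, here is a simplified sketch in plain Java. It is not picocli code: the @Opt annotation, the Settings class, and the inject method are hypothetical stand-ins. The point is that the field is found and assigned purely by inspecting the class at runtime, which is exactly what an ahead-of-time compiler cannot see by analyzing the code alone.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Illustrative sketch of annotation-driven field injection via reflection.
class ReflectionSketch {
    /** Hypothetical stand-in for an option annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Opt { String name(); }

    static class Settings {
        @Opt(name = "--algorithm")
        private String algorithm = "MD5";

        String algorithm() { return algorithm; }
    }

    /** Finds the field annotated with the given option name and assigns the value. */
    static void inject(Object target, String optionName, String value) {
        for (Field field : target.getClass().getDeclaredFields()) {
            Opt opt = field.getAnnotation(Opt.class);
            if (opt != null && opt.name().equals(optionName)) {
                try {
                    field.setAccessible(true); // the field is private
                    field.set(target, value);
                } catch (IllegalAccessException ex) {
                    throw new IllegalStateException(ex);
                }
                return;
            }
        }
        throw new IllegalArgumentException("Unknown option: " + optionName);
    }

    public static void main(String[] args) {
        Settings settings = new Settings();
        inject(settings, "--algorithm", "SHA-256");
        System.out.println(settings.algorithm());
    }
}
```

On a regular JVM this works without any setup. In a native image, lookups like these only work for classes and fields that were registered for reflection when the image was built.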

Therefore, we need to provide GraalVM with a configuration file specifying all annotated classes, methods and fields. Such a configuration file looks something like this:

[
  {
    "name" : "CheckSum",
    "allDeclaredConstructors" : true,
    "allPublicConstructors" : true,
    "allDeclaredMethods" : true,
    "allPublicMethods" : true,
    "fields" : [
      { "name" : "algorithm" },
      { "name" : "file" }
    ]
  },
  {
    "name" : "picocli.CommandLine$AutoHelpMixin",
    "allDeclaredConstructors" : true,
    "allPublicConstructors" : true,
    "allDeclaredMethods" : true,
    "allPublicMethods" : true,
    "fields" : [
      { "name" : "helpRequested" },
      { "name" : "versionRequested" }
    ]
  }
]

This quickly becomes quite cumbersome for utilities with many options, but fortunately we don’t need to do this by hand.

Picocli Annotation Processor

The picocli-codegen module includes an annotation processor that can build a model from the picocli annotations at compile time rather than at runtime.

The annotation processor generates Graal configuration files under META-INF/native-image/picocli-generated/$project during compilation, to be included in the application jar. This includes configuration files for reflection, resources and dynamic proxies. By embedding these configuration files, your jar is instantly Graal-enabled. In most cases no further configuration is needed when generating a native image.

As a bonus, the annotation processor shows errors for invalid annotations or attributes immediately when you compile, instead of during testing at runtime, resulting in shorter feedback cycles.

So, all we need to do is compile our CheckSum.java source file with the picocli-codegen jar on the classpath:

Compiling CheckSum.java and creating a checksum.jar on Linux. Replace the : path separator with ; for these commands to work on Windows.

mkdir classes
javac -cp .:picocli-4.2.0.jar:picocli-codegen-4.2.0.jar -d classes CheckSum.java
cd classes && jar -cvef CheckSum ../checksum.jar * && cd ..

You can see the generated configuration files are in the META-INF/native-image/picocli-generated/ directory inside the jar:

jar -tf checksum.jar

META-INF/
META-INF/MANIFEST.MF
CheckSum.class
META-INF/native-image/
META-INF/native-image/picocli-generated/
META-INF/native-image/picocli-generated/proxy-config.json
META-INF/native-image/picocli-generated/reflect-config.json
META-INF/native-image/picocli-generated/resource-config.json
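The same check can be scripted, for example as part of a build verification step. The sketch below uses only java.util.jar from the JDK and prints the JSON files under META-INF/native-image/. The class name JarConfigSketch and the default path checksum.jar are assumptions for illustration.

```java
import java.io.IOException;
import java.util.Collection;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Illustrative sketch: list the native-image configuration files inside a jar.
class JarConfigSketch {
    /** Keeps only the JSON entries under META-INF/native-image/, sorted by name. */
    static List<String> configEntries(Collection<String> entryNames) {
        return entryNames.stream()
                .filter(name -> name.startsWith("META-INF/native-image/") && name.endsWith(".json"))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Assumed default; pass another jar path as the first argument.
        String jarPath = args.length > 0 ? args[0] : "checksum.jar";
        try (JarFile jar = new JarFile(jarPath)) {
            List<String> names = jar.stream().map(JarEntry::getName).collect(Collectors.toList());
            configEntries(names).forEach(System.out::println);
        }
    }
}
```

If the list comes back empty, the annotation processor most likely did not run during compilation.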

We are done with our application. Let’s make a native image as the next step!

GraalVM Native Image Toolchain

To create a native image, we need to install GraalVM, ensure the native-image utility is installed, and install the C/C++ compiler toolchain for the OS we are building on. I had some trouble doing this, so hopefully the steps below clarify things for other developers.

Development is 10% inspiration and 90% getting your environment set up.
— Unknown developer

Install GraalVM

First, install the latest version of GraalVM, 20.0 at the time of this writing. The GraalVM Getting Started page is the best place to get the most up-to-date instructions for installing GraalVM in various operating systems and containers.

Install the Native Image Utility

GraalVM comes with a native-image generator utility. In recent versions of GraalVM, this needs to be downloaded first and installed separately with the Graal Updater tool:

Installing the native-image generator utility for Java 11 on Linux

gu install -L /path/to/native-image-installable-svm-java11-linux-amd64-20.0.0.jar

Since 20.0, this step is also necessary with the Windows version of GraalVM.

For more details, see the Native Image section of the GraalVM Reference Manual.

Install Compiler Toolchain

Linux and macOS Compiler Toolchain

For compilation, native-image depends on the local toolchain, so on Linux and macOS we need glibc-devel, zlib-devel (header files for the C library and zlib) and gcc to be available on our system.

To accomplish this on Linux: sudo dnf install gcc glibc-devel zlib-devel or sudo apt-get install build-essential libz-dev.

On macOS, execute xcode-select --install.

Windows Compiler Toolchain for Java 8

GraalVM has offered experimental support for Windows native images since release 19.2.0.

Windows support is still experimental, and the official documentation is sparse on details regarding native images on Windows. From version 19.3, GraalVM supports both Java 8 and Java 11, and on Windows these require different toolchains.

To build native images using the Java 8 version of GraalVM, you need the Microsoft Windows SDK for Windows 7 and .NET Framework 4 as well as the C compilers from KB2519277. You can install these using chocolatey:

choco install windows-sdk-7.1 kb2519277

Then (from the cmd prompt), activate the sdk-7.1 environment:

call "C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin\SetEnv.cmd"

This starts a new Command Prompt, with the sdk-7.1 environment enabled. Run all subsequent commands in this Command Prompt window. This works for all Java 8 versions of GraalVM from 19.2.0 to 20.0.

Windows Compiler Toolchain for Java 11

To build native images using the Java 11 version of GraalVM (19.3.0 and greater), you can either install the Visual Studio 2017 IDE (making sure to include Visual C++ tools for CMake), or you can install the Visual C Build Tools Workload for Visual Studio 2017 Build Tools using chocolatey:

choco install visualstudio2017-workload-vctools

After installation, set up the environment from the cmd prompt with this command:

call "C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\VC\Auxiliary\Build\vcvars64.bat"

TIP If you installed the Visual Studio 2017 IDE, replace BuildTools in the above command with either Community or Enterprise, depending on your version of Visual Studio.

Then run native-image in that Command Prompt window.

Creating a Native Image

The native-image utility can take a Java application and compile it to a native image that can run as a native executable on the platform that it is compiled on. On Linux this can look like this:

Creating a native image on Linux

$ /usr/lib/jvm/graalvm/bin/native-image \
    -cp classes:picocli-4.2.0.jar --no-server \
    --static -H:Name=checksum  CheckSum

The native-image utility will take about a minute to complete on my laptop, and produces output like this:

[checksum:1073]    classlist:   3,124.74 ms,  1.14 GB
[checksum:1073]        (cap):   2,885.31 ms,  1.14 GB
[checksum:1073]        setup:   4,767.19 ms,  1.14 GB
[checksum:1073]   (typeflow):   8,733.59 ms,  1.94 GB
[checksum:1073]    (objects):   6,073.44 ms,  1.94 GB
[checksum:1073]   (features):     313.28 ms,  1.94 GB
[checksum:1073]     analysis:  15,384.41 ms,  1.94 GB
[checksum:1073]     (clinit):     322.84 ms,  1.94 GB
[checksum:1073]     universe:     793.02 ms,  1.94 GB
[checksum:1073]      (parse):   2,191.69 ms,  1.94 GB
[checksum:1073]     (inline):   2,064.62 ms,  2.13 GB
[checksum:1073]    (compile):  14,960.43 ms,  2.73 GB
[checksum:1073]      compile:  20,040.78 ms,  2.73 GB
[checksum:1073]        image:   1,272.17 ms,  2.73 GB
[checksum:1073]        write:     722.20 ms,  2.73 GB
[checksum:1073]      [total]:  46,743.28 ms,  2.73 GB

At the end, we have a native Linux executable. Interestingly, the native binary created with the Java 11 version of GraalVM is a bit bigger than the one created with the Java 8 version of GraalVM:

-rwxrwxrwx 1 remko remko 14744296 Feb 19 09:51 java11-20.0/checksum*
-rwxrwxrwx 1 remko remko 12393600 Feb 19 09:48 java8-20.0/checksum*

We can see the binary is 12.4 - 14.7 MB in size. We can consider that big or small, depending on what we compare it with. For me it is an acceptable size.

Let’s run the application to verify that it works. While we are at it we may as well compare the startup times of running the application on a normal JIT-based JVM to that of the native image:

$ time java -cp classes:picocli-4.2.0.jar CheckSum hi.txt
764efa883dda1e11db47671c4a3bbd9e

real    0m0.415s   ← startup is 415 millis with normal Java
user    0m0.609s
sys     0m0.313s

$ time ./checksum hi.txt
764efa883dda1e11db47671c4a3bbd9e

real    0m0.004s   ← native image starts up in 4 millis
user    0m0.002s
sys     0m0.002s

So, on Linux at least, we can now distribute our Java application as a single native executable. What is the story on Windows?

Native Image on Windows

Native image support on Windows has some gotchas, so we will look at this in more detail.

Creating Native Images on Windows

Creating the native image itself is not a problem. For example:

Creating a native image on Windows

C:\apps\graalvm-ce-java8-20.0.0\bin\native-image ^
    -cp picocli-4.2.0.jar --static -jar checksum.jar

We get similar output from the native-image.cmd utility on Windows as what we saw on Linux, taking a comparable amount of time, and resulting in a slightly smaller executable of 11.3 MB for the Java 8 version of GraalVM, and 14.2 MB for a binary created with the Java 11 version of GraalVM.

The binaries work fine, with one difference: we don’t see ANSI colors on the console. Let’s look at fixing that.

Windows Native Images with Colored Output

To get ANSI colors in the Windows command prompt, we need to use the Jansi library. Unfortunately, Jansi (as of version 1.18) has some problems that mean it fails to produce colored output in a GraalVM native image. To work around this, picocli offers a Jansi companion library, picocli-jansi-graalvm, that allows the Jansi library to work correctly in GraalVM native images on Windows.

We change the main method to tell Jansi to enable rendering ANSI escape codes on Windows, like this:

//...
import picocli.jansi.graalvm.AnsiConsole;
//...
public class CheckSum implements Callable<Integer> {

  // ...

  public static void main(String[] args) {
    int exitCode = 0;
    // enable colors on Windows
    try (AnsiConsole ansi = AnsiConsole.windowsInstall()) {
      exitCode = new CommandLine(new CheckSum()).execute(args);
    }
    System.exit(exitCode);
  }
}

And build a new native image with this command (note that from GraalVM 19.3, it became necessary to quote the jars on the classpath):

set GRAALVM_HOME=C:\apps\graalvm-ce-java11-20.0.0

%GRAALVM_HOME%\bin\native-image ^
  -cp "picocli-4.2.0.jar;jansi-1.18.jar;picocli-jansi-graalvm-1.1.0.jar;checksum.jar" ^
  CheckSum checksum

And we have colors in our DOS console application.

It takes a little extra effort, but now our native Windows CLI app can use color contrast to provide a similar user experience as on Linux.

The size of the resulting binaries did not change much with the addition of the Jansi libraries: building with Java 11 GraalVM gave a 14.3 MB binary, building with Java 8 GraalVM gave a 11.3 MB binary.

Running Native Images on Windows

We are almost done, but there is one more gotcha that is not immediately apparent.

The native binary we just created works fine on the machine where we built it, but when you run it on a different Windows machine, you may see an error about a missing DLL.

It turns out that our native image needs the msvcr100.dll from VS C++ Redistributable 2010. This dll can be placed in the same directory as the exe, or in C:\Windows\System32. There is work in progress to try to improve on this.

With GraalVM for Java 11, we get a similar error, except that it reports a different missing DLL, VCRUNTIME140.dll.

For now, we will have to distribute these DLLs with our application, or tell our users to download and install the Microsoft Visual C++ 2015 Redistributable Update 3 RC to get the VCRUNTIME140.dll for Java 11-based native images, or Microsoft Visual C++ 2010 SP1 Redistributable Package (x64) to get the msvcr100.dll for Java 8-based native images.

GraalVM does not support cross-compilation, although it may in the future. For now, we need to compile on Linux to get a Linux executable, compile on macOS to get a macOS executable, and compile on Windows to get a Windows executable.

Conclusion

Command line applications are the canonical use case for GraalVM native images: we can now develop in Java (or another JVM language) and distribute our CLI applications as a single, relatively small, native executable. (Except on Windows, where we may need to distribute an additional runtime DLL.) The fast startup and reduced memory footprint are nice bonuses.

GraalVM native images have some limitations, and applications may need to do some work before they can be turned into a native image.

Picocli makes it easy to write command line applications in many JVM-based languages, and provides several extras to painlessly turn your CLI applications into native images.

Give Picocli and GraalVM a try for your next command line application!

About the Author

Remko Popma does Algo Development at SMBC Nikko Securities by day and is the author of picocli by night. Popma is also a Log4j 2 performance hacker. He made Log4j 2 garbage-free, and contributed Async Loggers, which have 10x more throughput and orders of magnitude lower latency in multi-threaded scenarios compared to current market-leading logging packages log4j-1.x and logback. Popma is responsible for most of the performance improvements since Log4j 2.0-beta4. For fun, he likes to learn new stuff: performance, concurrency, scalability, low latency, AI, machine learning, cryptography, bitcoin and cryptocurrency technologies. Popma can be found at @RemkoPopma.
