BellSoft has released versions 17 and 21 of Liberica JDK, their downstream distribution of OpenJDK, with support for Coordinated Restore at Checkpoint (CRaC). This feature allows developers to create a snapshot (checkpoint) of a running application at any point in time. The snapshot is then used to start the application in milliseconds by restoring its state.
CRaC is based on the Linux feature Checkpoint and Restore in Userspace (CRIU), which means that builds are only available for the x86_64 and AArch64 CPU architectures running the Linux operating system. CRIU offers checkpoint and restore functionality and is used by various solutions such as Docker and Podman.
CRaC stores the state of a running application, including the Java heap, JIT-compiled code, native memory and settings. Developers should ensure that no sensitive data, such as passwords, is present in the stored state. The Java Random class generates its seed during initialization, so an instance captured in a snapshot carries that seed with it and the numbers produced after each restore are predictable. A new seed should be created in the afterRestore() method in order to achieve randomness after a restore. With the Java SecureRandom class, an even better solution is to clean the seed and lock the random operations before the snapshot, and subsequently remove the lock in the afterRestore() method.
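The reseeding advice can be sketched as follows. This is a minimal illustration under stated assumptions, not BellSoft's code: `CheckpointAware` is a hypothetical local stand-in that mirrors the two callbacks of the CRaC `Resource` interface so the example compiles without the CRaC library, and `ReseedingRandom` is an invented name. In a real application the class would implement `org.crac.Resource`, whose callbacks also take a `Context` argument, and would be registered with `Core.getGlobalContext().register(...)`.

```java
import java.util.Random;

// Hypothetical stand-in for the CRaC Resource callbacks (assumption:
// simplified, the real interface passes a Context argument to both methods).
interface CheckpointAware {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

// Wraps java.util.Random and replaces it with a freshly seeded instance
// after a restore, so restored processes do not share one number sequence.
final class ReseedingRandom implements CheckpointAware {
    private volatile Random random;

    ReseedingRandom(long seed) {
        this.random = new Random(seed);
    }

    long nextLong() {
        return random.nextLong();
    }

    @Override
    public void beforeCheckpoint() {
        // Nothing to release: the generator holds no external resources.
    }

    @Override
    public void afterRestore() {
        // new Random() picks a seed that is very likely distinct per call.
        random = new Random();
    }
}

final class ReseedDemo {
    public static void main(String[] args) {
        // Two objects with identical state simulate two processes
        // restored from the same snapshot.
        ReseedingRandom first = new ReseedingRandom(42L);
        ReseedingRandom second = new ReseedingRandom(42L);
        System.out.println("same before reseed: "
                + (first.nextLong() == second.nextLong()));

        first.afterRestore();
        second.afterRestore();
        System.out.println("same after reseed: "
                + (first.nextLong() == second.nextLong()));
    }
}
```

The fixed seed in the demo exists only to imitate two restores sharing one snapshot state; production code would not seed the generator with a constant.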
The coordinated checkpoint and restore ensures the application is aware that it's being paused and restarted. Network connections and open file descriptors are closed before the checkpoint, which makes the process more reliable. The application can also cancel a checkpoint when it's not yet ready, for example, because user data is being saved.
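A rough sketch of that coordination, under the same caveat as any illustration: `CheckpointCallbacks` is a hypothetical local stand-in for the CRaC `Resource` callbacks (the real methods take a `Context` argument), and `AuditLog` and its `saving` flag are invented for this example. The sketch closes its file handle before a checkpoint, reopens it after a restore, and throws from `beforeCheckpoint()` while a save is in progress, which is how a CRaC resource signals that the checkpoint should not proceed.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for the CRaC Resource callbacks (assumption:
// simplified, the real interface passes a Context argument to both methods).
interface CheckpointCallbacks {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

// A log file that releases its file descriptor around a checkpoint.
final class AuditLog implements CheckpointCallbacks {
    private final Path path;
    private Writer writer;   // null while closed for a checkpoint
    private boolean saving;  // true while user data is being written

    AuditLog(Path path) throws IOException {
        this.path = path;
        this.writer = open();
    }

    private Writer open() throws IOException {
        return Files.newBufferedWriter(path,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    synchronized void setSaving(boolean saving) {
        this.saving = saving;
    }

    synchronized boolean isOpen() {
        return writer != null;
    }

    synchronized void append(String line) throws IOException {
        if (writer == null) {
            throw new IllegalStateException("log is closed for a checkpoint");
        }
        writer.write(line);
        writer.write(System.lineSeparator());
        writer.flush();
    }

    @Override
    public synchronized void beforeCheckpoint() throws Exception {
        if (saving) {
            // Throwing here is the veto: the checkpoint should be abandoned.
            throw new IllegalStateException("user data is being saved");
        }
        if (writer != null) {
            writer.close(); // release the open file descriptor
            writer = null;
        }
    }

    @Override
    public synchronized void afterRestore() throws Exception {
        writer = open(); // reacquire the file in the restored process
    }
}

final class AuditLogDemo {
    public static void main(String[] args) throws Exception {
        AuditLog log = new AuditLog(Files.createTempFile("crac-sketch", ".log"));
        log.append("before checkpoint");
        log.beforeCheckpoint();
        System.out.println("open during checkpoint: " + log.isOpen());
        log.afterRestore();
        log.append("after restore");
        System.out.println("open after restore: " + log.isOpen());
    }
}
```

Keeping the close and reopen logic inside the object that owns the file means callers never need to know a checkpoint happened, provided they do not write between the two callbacks.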
The following command can be used to start MyApplication and specify the checkpoint-data directory, which will contain the JVM data once a snapshot is created:
$ java -XX:CRaCCheckpointTo=checkpoint-data MyApplication
Now, the jcmd command may be used to create a snapshot:
$ jcmd MyApplication JDK.checkpoint
Afterwards, the application can be started by restoring the state of the snapshot in the checkpoint-data directory:
$ java -XX:CRaCRestoreFrom=checkpoint-data
Other solutions, such as Ahead of Time (AOT) compilation, used by GraalVM, and Application Class Data Sharing (AppCDS), used by Quarkus, for example, also offer fast startups. However, those solutions don't support further optimizations with the JIT compiler during runtime.
BellSoft advises that CRaC be used mainly for applications with the following characteristics: short-running, low CPU limits, replicated and frequently restarted.
CRaC, originally developed by Azul, has become one of the OpenJDK projects. Azul has also included CRaC in Zulu, their own downstream distribution of OpenJDK. CRaC is becoming more mainstream and supported by tools such as Spring Boot, Quarkus, Micronaut and AWS Lambda SnapStart.
More information about using CRaC with a regular Java application and a Spring Boot application can be found in the "How to use CRaC with Java applications" blog post, written by Dmitry Chuyko, performance architect at BellSoft.