OpenJDK Project Valhalla Releases LW2 Prototype

The OpenJDK team at Oracle has announced the release of Early Access (EA) builds for the LW2 prototype of Project Valhalla (aka "inline classes", previously called "value types").

The prototype can be downloaded here and the intent is to periodically refresh the binaries with bug fixes and performance updates over the coming weeks.

The team is actively looking for feedback on the user model, but cautions that there are a number of major aspects of the implementation that are not ready for scrutiny yet.

The LW2 designation refers to the fact that the implementation for the inline classes feature has reached the second milestone in the so-called "L-World" design.

The current prototype incorporates inline types into the existing type system by making them as similar as possible to existing objects and interfaces, i.e. the L-Type system or "L-World".

This prototype was released on July 5th, 2019 and supersedes the previous LW1 milestone. It puts initial experimentation with Valhalla and inline classes within reach of more developers, although it is still extremely experimental.

It includes the first steps towards a new look at generics - with a syntax that allows nullable, reference projections of inline classes to be used as generic type arguments.

As this is an early prototype, there are a number of limitations, which include:

  • Only available on x64 Linux, x64 Mac OS X, and x64 Windows
  • No support for atomic fields containing indirect types
  • No support for @Contended inline type fields
  • Only supported in interpreter and C2. No C1, no tiered compilation and no Graal
  • No unsafe field and array accessor APIs for inline types
  • Interpreter is not yet optimized; the focus is on C2 JIT optimization

InfoQ spoke to Dan Heidinga (Eclipse OpenJ9 project lead at IBM) about the LW2 release.

InfoQ: What do you think the most important new features in LW2 are?

Dan Heidinga: The LW2 early access build brings a lot to the table, the most important being the user model for inline types. Previous prototypes were defined in terms of MethodHandles which had a high bar to entry.

Writing serious code with previous prototypes was too hard, even for MethodHandle experts, which made it difficult to provide feedback. LW2 is different. It's intended to be user friendly so that developers can test out the model for how inline types work and provide feedback to the expert group.  

One particularly important feature is the introduction of the "?" operator for inline types to allow users to explicitly choose whether an inlined type is a candidate to be inlined or whether an indirection should be used.

It's also an enabler for using inline types with Java's Collections libraries which again improves the developer experience with the prototype.  By only allowing generics to use the "?", or nullable version of the type in LW2, we leave the door open to future improvements like reified generics for inlined types.

InfoQ: Can you explain how inline classes connect with Escape Analysis and why this is important for performance?

Heidinga: Escape analysis is a JIT optimization that tries to prove that an object's lifetime is entirely contained in the current compilation unit and doesn't "escape" to the heap, another thread, or even out of the scope of the current set of inlined methods. 

If the JIT can prove that an object doesn't escape, it can split the object into a set of independent fields. These can then be put in registers for better optimization opportunities, or the object can be allocated on the stack rather than the heap. Both of these provide for better optimization opportunities and reduce garbage collection pressure as the object never ends up on the heap.

While this sounds great, it isn't guaranteed. The optimization might fail for any number of reasons - such as not having inlined enough of the calls in a method to see that the object doesn't really escape, or it may not run on a given compile operation. This could be because lower tier compilations may skip some expensive-to-perform optimizations to get to compiled code faster. Not only that, but small changes to the code may cause an object to escape, preventing the optimization from succeeding and resulting in worse performance without an obvious culprit.

Inline classes are both immutable and have no identity. This makes the types ideal candidates for escape analysis as the JIT no longer has to prove they don't escape. It's at liberty to split them apart, put them in registers, and optimize them however it needs to, as it can always reconstitute the inline type at any point where it might escape. The key to this is that because inline types don't have an identity, there's no way to tell if they were recreated or not. This removes a lot of the brittleness from traditional escape analysis.

In most programs, there are a lot of small classes that act as wrappers on other data that would benefit from this guaranteed escape analysis. Think of the places in your code where you've wrapped an int, a long, or a String to give them extra semantic meaning. Wouldn't it be nice if the JIT was able to stack allocate all those instances? That's part of the win of inline types.
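As a rough sketch of the kind of wrapper described here, the following uses an ordinary identity-bearing Java class rather than an inline class, so it compiles on a standard JDK. The names `Meters` and `totalDistance` are hypothetical and not taken from the interview. Whether the JIT actually removes the temporary allocations is exactly the uncertain part: escape analysis may do so, but nothing in the language guarantees it.

```java
final class EscapeAnalysisSketch {
    // Every iteration allocates short-lived Meters objects that never leave
    // this method. Escape analysis may replace them with plain longs in
    // registers, but only if the JIT inlines plus() and proves no escape.
    static long totalDistance(long[] legs) {
        Meters total = new Meters(0);
        for (long leg : legs) {
            total = total.plus(new Meters(leg));
        }
        return total.value();
    }

    public static void main(String[] args) {
        System.out.println(totalDistance(new long[] {5, 10, 20})); // prints 35
    }
}

// Hypothetical wrapper that gives a raw long extra semantic meaning.
final class Meters {
    private final long value;

    Meters(long value) { this.value = value; }

    long value() { return value; }

    // Returns a fresh instance; the class is immutable, as inline classes are.
    Meters plus(Meters other) { return new Meters(value + other.value); }
}
```

With an inline class in place of `Meters`, the JIT would not need to prove anything about escape before flattening the values, which is the guarantee Heidinga is pointing at.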

InfoQ: Instances of inline classes are expected to be immutable, so can you explain why this can cause issues with atomicity of updates? Why can't we just use an optimistic copy and a compare-and-swap (CAS) to swap the pointer?

Heidinga: Remember the slogan for the project is "codes like a class, works like an int". When writing a primitive int into a local variable, or a field, or even an array, we don't modify the int.  Instead, we overwrite the whole contents. LW2's inline classes work like primitive types in this way.

While this is a clean model conceptually, it re-introduces a problem Java developers have mostly been able to ignore since 64-bit systems became the norm: tearing, aka non-atomic updates.

Looking back at how early 32-bit Java implementations handled longs or doubles helps to understand the issue. CPUs guarantee that writes of their native word size happen atomically. Writing 32 bits on a 32-bit system can't tear. But a long is 64 bits which means it can't, without some special pleading to the hardware, be written atomically on a 32-bit system.

The Java Language Specification recognizes this in section 17.7, "Non-Atomic Treatment of double and long":

"For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write."
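The two-writes model in that passage can be replayed deterministically. The sketch below is an illustration only: it simulates a 32-bit store path by keeping the two halves in separate `int` fields and performing the "read" between the two half-writes on a single thread. The class and method names are hypothetical, and a real torn read would need a data race on a JVM that actually splits the write.

```java
final class TornLongSketch {
    // Simulated 32-bit memory: one 64-bit value kept as two 32-bit halves.
    private int high;
    private int low;

    void writeHigh(long v) { high = (int) (v >>> 32); }

    void writeLow(long v) { low = (int) v; }

    // Reassembles the 64-bit value from whatever the two halves hold now.
    long read() {
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    // Stores oldValue completely, then reads after only the high half of
    // newValue has been written.
    static long tornRead(long oldValue, long newValue) {
        TornLongSketch cell = new TornLongSketch();
        cell.writeHigh(oldValue);
        cell.writeLow(oldValue);

        cell.writeHigh(newValue);   // writer: first of the two separate writes
        long seen = cell.read();    // reader: runs before the second write
        cell.writeLow(newValue);    // writer: second write, too late for the reader
        return seen;
    }

    public static void main(String[] args) {
        long torn = tornRead(0x00000000FFFFFFFFL, 0x0000000100000000L);
        System.out.println(Long.toHexString(torn)); // prints 1ffffffff
    }
}
```

The value observed, `0x1ffffffff`, was never written by anyone: it is the high half of the new value glued to the low half of the old one.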

Inline classes will bring the problem of tearing back into the set of concerns users need to think about, as reads and writes of inline types must copy the entire contents of the type.

Here's an example to help see why tearing of inline classes will be a new problem for developers.  

Consider an inline class like Customer:

    inline class Customer {
        String firstName;
        String lastName;
        long customerID;
    }

and an array used to track the top three customers:

    Customer[] topCustomers = new Customer[3];

which is accessed by two threads concurrently.  The first thread attempts to write a new Customer into the array:

    Customer c = getTopCustomer();
    topCustomers[0] = c;

While the second thread is attempting to read from the array:

    Customer b = topCustomers[0];

If the read and write are happening at the same time, it may be possible to read a customer that is an amalgamation of both the old and new customers, resulting in an impossible Customer object. This is a data race but has previously been (mostly) benign as the runtime has been replacing one pointer with another atomically in the array.
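Because the `inline` keyword only exists in the LW2 prototype, the following sketch imitates a flattened array in standard Java: three parallel arrays stand in for the fields stored directly in the container. It then replays one unlucky interleaving on a single thread so the outcome is deterministic. The class name, the sample customers, and the field-by-field write order are assumptions made for illustration; they say nothing about how the LW2 runtime actually copies an inline value.

```java
final class TornCustomerSketch {
    // Hypothetical stand-in for a flattened Customer[]: the three fields of
    // each element are stored directly in the container, not behind a pointer.
    private final String[] firstNames;
    private final String[] lastNames;
    private final long[] customerIDs;

    TornCustomerSketch(int length) {
        firstNames = new String[length];
        lastNames = new String[length];
        customerIDs = new long[length];
    }

    // Writing an element copies its fields one at a time.
    void writeFirstName(int i, String v) { firstNames[i] = v; }
    void writeLastName(int i, String v) { lastNames[i] = v; }
    void writeCustomerID(int i, long v) { customerIDs[i] = v; }

    // Reading an element copies all three fields back out.
    String read(int i) {
        return firstNames[i] + " " + lastNames[i] + " #" + customerIDs[i];
    }

    // The reader runs after only the first field of the new customer has
    // been stored, so it sees a mix of old and new.
    static String interleavedRead() {
        TornCustomerSketch top = new TornCustomerSketch(3);
        top.writeFirstName(0, "Ada");
        top.writeLastName(0, "Lovelace");
        top.writeCustomerID(0, 1L);

        top.writeFirstName(0, "Alan");   // writer: first field of the new customer
        String seen = top.read(0);       // reader: copies the element mid-write
        top.writeLastName(0, "Turing");  // writer: finishes too late for the reader
        top.writeCustomerID(0, 2L);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(interleavedRead()); // prints Alan Lovelace #1
    }
}
```

"Alan Lovelace #1" is the impossible Customer: neither thread ever created it.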

Once an inline type is larger than the largest atomic update the processor provides (typically 2x the word size, so 128 bits on a 64-bit system), then tearing becomes a potential problem - if there are any data races on the inline type. The inline types will be too large for a CAS to succeed as it needs to update the entire contents, not just a pointer to them. As the entire contents of the type are inlined in the container, there is no pointer to conveniently update.
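For contrast, here is a minimal sketch of the optimistic-copy-plus-CAS approach the question refers to, using an ordinary identity class and the standard `java.util.concurrent.atomic.AtomicReference`. It works only because the slot holds a pointer, and a pointer fits in one atomic update. The `Customer` class here is a plain reference class and `publish` is a hypothetical helper; neither is LW2 code.

```java
import java.util.concurrent.atomic.AtomicReference;

final class PointerSwapSketch {
    // Ordinary identity class: instances are reached through a reference.
    static final class Customer {
        final String firstName;
        final String lastName;
        final long customerID;

        Customer(String firstName, String lastName, long customerID) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.customerID = customerID;
        }
    }

    // Optimistic update: the replacement is built off to the side and then
    // published by swapping one pointer. Returns false if the slot no longer
    // holds the expected reference.
    static boolean publish(AtomicReference<Customer> slot,
                           Customer expected, Customer replacement) {
        return slot.compareAndSet(expected, replacement);
    }

    public static void main(String[] args) {
        Customer old = new Customer("Ada", "Lovelace", 1L);
        Customer fresh = new Customer("Alan", "Turing", 2L);
        AtomicReference<Customer> slot = new AtomicReference<>(old);

        System.out.println(publish(slot, old, fresh)); // prints true
        System.out.println(publish(slot, old, fresh)); // prints false
        System.out.println(slot.get().lastName);       // prints Turing
    }
}
```

A reader of `slot` sees either the old customer or the new one, never a blend. A flattened inline value has no such single word to compare and swap.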

The Valhalla expert group is still looking at ways of allowing inline types to be marked as "atomic" so they can only be written in non-tearable ways which will address some of these concerns, but the proposal isn't currently in LW2.

Tearing is one more reason that the design for inline types suggests they should be small aggregates of data - with the key word here being "small".

InfoQ: What aspect of inline classes do you think is currently most misunderstood by the developer community?

Heidinga: I'm going to give two answers to this. The first common misconception is that inline types are mutable - like C structs are. While earlier proposals like IBM's PackedObjects or Azul's ObjectLayout supported mutable types, LW2's inline types are strictly immutable. This is part of the general trend in new Java features to favour immutability to make it easier to write correct concurrent applications.

The second common misconception is that inline types give the user explicit control of the layout of the object. Inline types allow users to ask the JVM to inline the data directly into the container - either object or array - but don't allow users to control the layout of the fields in the type. The layout algorithms continue to be completely under the control of the JVM which may reorder fields to group them in ways that make garbage collection more efficient.

InfoQ: Anything else you'd like to share with our readers?

Heidinga: There's a lot in the LW2 early access binaries and in the updated JVM specification. Check it out and provide us with feedback. There are a number of open questions in the design where we're looking for experience reports on how the design works for your use cases.

Some of the areas where we're looking for feedback are the behaviour of == for inline types, the introduction of array covariance between Object[] and inline type arrays, and general feedback on how experimenting with inline types in your code base has worked out for you.
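For readers unfamiliar with the term, the sketch below shows what array covariance already means for reference arrays in today's Java; it does not show LW2 behaviour, and how inline type arrays interact with `Object[]` is one of the open questions mentioned above. The class and helper names are hypothetical.

```java
final class ArrayCovarianceSketch {
    // Returns true if storing `value` into `array` is rejected at run time.
    static boolean storeRejected(Object[] array, Object value) {
        try {
            array[0] = value;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Legal at compile time: String[] is treated as a subtype of Object[].
        Object[] objects = new String[1];

        System.out.println(storeRejected(objects, "ok")); // prints false
        System.out.println(storeRejected(objects, 42));   // prints true
    }
}
```

The compiler accepts both stores, and the second one fails only at run time with an `ArrayStoreException`, because the array still knows it is really a `String[]`.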

The LW2 binaries for Project Valhalla are out now, and feedback from ordinary Java developers (in the areas that are ready, such as the user model) is actively being sought.
