In Part 1 of our interview with Gil Tene, we discussed Azul's new releases and products (including their Zulu build of OpenJDK) as well as the need for certified JVMs for application developers. In Part 2, we move on to look at some new features that Azul are prototyping with an eye to the future.
InfoQ: Let's talk about the new features you're planning for Java 9.
Gil: I gave a talk at the Melbourne Java User Group about Faster Java Objects (video), and a more in-depth version at the JVM Language Summit. For people who aren't aware of it, the JVM Language Summit is a small conference for JVM language implementers, so the audience has a very high common denominator and my talk jumps straight into the weeds. The subject of my talk is a project we've been working on for 18 months called ObjectLayout (or org.ObjectLayout), and it's focused on closing a speed gap between Java and C which comes from memory layout.
I should make it clear, though, that this is not value types, or IBM's PackedObjects. Both of those are very good ideas, and satisfy needs in their own right, but ObjectLayout is serving a different, and potentially more ambitious need. ObjectLayout is about improving access speed to regular on-heap Java objects. It doesn't deal with off-heap memory or the stack (which would be value types).
An example would be reducing the number of pointer indirections needed to access a specific field of an object stored in a data structure. The key question is how many times do you have to dereference to find the data you are actually looking for. The other great example is data structure striding (or streaming). If you have your low-level memory laid out with a regular stride, then the hardware memory pre-fetchers in modern CPUs can do a great job of getting that data for you in minimal CPU cycles. Ultimately, the goal is to replace memory latency bottlenecks with memory bandwidth bottlenecks.
The fundamental difference here is between Java's Object[] and an array of structs in C. Java will store references to objects, which means an additional pointer indirection, and that kills off the ability to stride (using pointer arithmetic to calculate index locations on fixed-size arrays - Editor). It's actually worse than that: Java's arrays are covariant, so all an array of objects really guarantees is that its elements share the same base type. The actual objects themselves may be of different types, which again destroys the possibility of striding. By contrast, with an array of structs, the structs are all guaranteed to be of the exact same size and layout, and because this is an array of structs rather than an array of pointers to structs, we have no additional indirections and the stride can be of a fixed size.
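The covariance point can be seen in plain Java today. The following is a minimal sketch; the class names (ShapeSketch, CircleSketch, SquareSketch, CovarianceDemo) are hypothetical and exist only for illustration. It shows that an array typed by a base class can hold elements of different concrete classes, and that the JVM has to check element types at runtime when storing.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical classes, used only to illustrate array covariance.
class ShapeSketch { }
class CircleSketch extends ShapeSketch { double radius = 1.0; }
class SquareSketch extends ShapeSketch { double side = 2.0; double rotation = 0.0; }

class CovarianceDemo {
    // Returns true if storing a SquareSketch into a CircleSketch[] (viewed as ShapeSketch[]) fails.
    static boolean storeIntoCovariantArrayFails() {
        ShapeSketch[] shapes = new CircleSketch[2]; // legal: Java arrays are covariant
        try {
            shapes[0] = new SquareSketch();         // compiles, but is checked at runtime
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Counts the concrete classes found in a single ShapeSketch[].
    static int distinctElementClasses() {
        ShapeSketch[] mixed = { new CircleSketch(), new SquareSketch(), new CircleSketch() };
        Set<Class<?>> classes = new HashSet<>();
        for (ShapeSketch s : mixed) {
            classes.add(s.getClass());
        }
        return classes.size();
    }

    public static void main(String[] args) {
        System.out.println("store fails: " + storeIntoCovariantArrayFails());
        System.out.println("distinct classes: " + distinctElementClasses());
    }
}
```

Because the two element classes declare different fields, there is no single element size the JVM could use as a stride for such an array.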
ObjectLayout declares a new collection class, called StructuredArray. This has the same semantics as an array of structs in C, but for regular Java objects. It's immutable, so there's no put(), only get() after creation. All elements in it are objects of the exact same type (so no type variance allowed).
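To make those semantics concrete, here is a rough sketch of a collection with a get()-only interface and an exact element type. It is not the org.ObjectLayout API: the class FixedObjectArraySketch, its newInstance signature, and the PointSketch classes are all invented for illustration. It also assumes that "immutable" refers to which objects the array contains, not to the fields of those objects.

```java
import java.util.function.Supplier;

// Hypothetical sketch, NOT the org.ObjectLayout API.
final class FixedObjectArraySketch<T> {
    private final Class<T> elementClass;
    private final Object[] elements; // on a vanilla JVM this is still an array of references

    private FixedObjectArraySketch(Class<T> elementClass, Object[] elements) {
        this.elementClass = elementClass;
        this.elements = elements;
    }

    // All elements are created up front; there is no put() afterwards.
    static <T> FixedObjectArraySketch<T> newInstance(
            Class<T> elementClass, int length, Supplier<? extends T> factory) {
        Object[] elements = new Object[length];
        for (int i = 0; i < length; i++) {
            T element = factory.get();
            // Exact type match only: subclasses are rejected (no type variance).
            if (element.getClass() != elementClass) {
                throw new IllegalArgumentException(
                        "element is not exactly " + elementClass.getName());
            }
            elements[i] = element;
        }
        return new FixedObjectArraySketch<>(elementClass, elements);
    }

    T get(int index) {
        return elementClass.cast(elements[index]);
    }

    int length() {
        return elements.length;
    }
}

// Hypothetical element types.
class PointSketch { long x; long y; }
class ColoredPointSketch extends PointSketch { int color; }

class FixedObjectArrayDemo {
    public static void main(String[] args) {
        FixedObjectArraySketch<PointSketch> points =
                FixedObjectArraySketch.newInstance(PointSketch.class, 4, PointSketch::new);
        points.get(2).x = 7; // element fields can change; the set of elements cannot
        System.out.println(points.get(2).x);
    }
}
```

Since every element has the same class and the set of elements never changes, a JVM that recognized such a class would be free to place the elements back to back in memory without changing what callers observe.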
There is a working vanilla implementation of this today, which means that it works on any JDK, passes tests, etc, with the correct semantics. We want future JVMs to be able to optimize this specific data structure, to provide an implementation that lays it out as a flat structure in memory, just like C does. Current JVMs can't do that, because they can't recognise this class for what it is. If we standardize this in a future JDK release, then the JVM could recognize this, and we can replace the implementation under the hood. This would give us the same interface, the same semantics, but much higher performance. That's what an intrinsified implementation would look like. We're building these optimizations into Zing, and a reference implementation for OpenJDK as well.
InfoQ: What's the timing for that? When do you expect to have some code?
Gil: The purpose of the OpenJDK code is to be able to show it to people, and to try to influence Java 9 development. I'm hoping to get it out in the next couple of months. We won't wait until it's complete, just demonstrable. The vanilla code is out there, and it's runnable. Its main purpose is to work out the semantics, and we're going for zero language changes (at least in this version). We think it's a very low footprint change to a JVM, and the layout aspect is actually very easy to deal with. We also believe it's very supportable in all garbage collectors, and we want to demonstrate that.
I started with StructuredArray, and the other two use cases we want to capture in ObjectLayout are IntrinsicObjects, which are essentially objects inlined in other objects. You can do something sort of similar in Java today with final fields, but the two objects are still laid out via indirection. An IntrinsicObject is an object that is created at the same time as the enclosing object, and which could benefit from being laid out within it. A struct within a struct is obviously the C pattern that we're trying to emulate.
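The final-field pattern mentioned above looks like this in today's Java. This is an illustrative sketch only: EndpointSketch and LineSketch are hypothetical classes, and nothing here uses ObjectLayout.

```java
// Hypothetical classes illustrating the "final field" pattern available today.
class EndpointSketch {
    long x;
    long y;
}

class LineSketch {
    // Both endpoints are created together with the enclosing LineSketch and never replaced.
    // On a current JVM each one is still a separate heap object reached through a reference.
    final EndpointSketch start = new EndpointSketch();
    final EndpointSketch end = new EndpointSketch();

    long lengthSquared() {
        long dx = end.x - start.x; // two dereferences: this -> end -> x
        long dy = end.y - start.y;
        return dx * dx + dy * dy;
    }
}

class LineSketchDemo {
    public static void main(String[] args) {
        LineSketch line = new LineSketch();
        line.end.x = 3;
        line.end.y = 4;
        System.out.println(line.lengthSquared());
    }
}
```

The inner objects share the lifetime of the outer one, which is the property that would let a JVM lay them out inside the enclosing object, as a C compiler does for a struct within a struct.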
The other use case that we want to tackle is the equivalent of a struct with an array at the end, specifically a variable length array. This is very common in messaging where you have a header and then a payload.
InfoQ: Presumably the intent would be that the array would be allocated inline, but zero-terminated?
Gil: Allocated inline, but not zero-terminated. The length will be stored somewhere, such as in metadata in the header.
InfoQ: So the terminating array still has the memory semantics of a Java array, not a String array?
Gil: That's right. You can think of String as being a prototypical example of the need for this. Logically, String is a struct that has an array of char at the end. So are packets, and messages, and all sorts of types. The way we're modeling this is very simple: as subclassable arrays. We have non-final classes that represent arrays of all the primitive types, as regular classes, not new language syntax. So we have a subclassable primitive long array, and so on. Subclassing is what allows you to create your struct, as any fields in the subclass are the fields that go into the struct, before the array.
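The header-plus-payload idea might be sketched as follows. The names LongArrayBaseSketch and MessageSketch are hypothetical stand-ins, not the ObjectLayout classes, and this vanilla version keeps the payload in a separate long[] rather than inline.

```java
// Hypothetical stand-in for a "subclassable long array"; not the ObjectLayout API.
class LongArrayBaseSketch {
    private final long[] body; // a separate object here; an optimizing JVM could inline it

    protected LongArrayBaseSketch(int length) {
        this.body = new long[length];
    }

    final int length() { return body.length; }
    final long get(int index) { return body[index]; }
    final void set(int index, long value) { body[index] = value; }
}

// The subclass fields play the role of the struct header that precedes the array.
class MessageSketch extends LongArrayBaseSketch {
    final int messageType;
    final long sequenceNumber;

    MessageSketch(int messageType, long sequenceNumber, int payloadLength) {
        super(payloadLength);
        this.messageType = messageType;
        this.sequenceNumber = sequenceNumber;
    }

    long payloadSum() {
        long sum = 0;
        for (int i = 0; i < length(); i++) {
            sum += get(i);
        }
        return sum;
    }
}

class MessageSketchDemo {
    public static void main(String[] args) {
        MessageSketch message = new MessageSketch(1, 42L, 3);
        message.set(0, 10);
        message.set(1, 20);
        message.set(2, 30);
        System.out.println(message.payloadSum());
    }
}
```

The array length is fixed when the object is constructed and is stored with the array rather than marked by a terminator, which matches the "allocated inline, but not zero-terminated" description.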
The layout is of course up to the JVM implementation, but we would expect, and want, to lay it out as a struct with an inline array at the end, as this allows the exact determining of stride that we want.
We focused on making this a very natural Java API which can still support the low-level memory semantics. The contents of a StructuredArray are no different from the contents of any other collection. No special behaviours are required to participate: these are ordinary Java objects, including with regard to things like liveness. This means that it fits into how Java already works, and so can fit into existing code.
On the GitHub page there's a great example, called Octagons, which shows how these concepts compose. You can go from an abstraction which describes a dated set of colored octagons, and get all the way down to the x and y coordinates of individual vertices, without the executing code having to perform any of the 4 separate indirection steps that would need to be performed for the same logic on a current JVM. At the same time, traversing through vertices in the set will stream through memory. It's this natural composability which makes me think that we have a chance to go one up on C, and to go further than just arrays of structs.
In addition to working on new features for the JVM, Gil is also participating in a new mini-book for InfoQ, all about garbage collection in Java, expected later in 2015.
About the Interviewee
Gil Tene is CTO and co-founder of Azul Systems. He has been involved with virtual machine technologies for the past 25 years and has been building Java technology-based products since 1995.