Java: The Missing Features

In this article, we look at some of the "missing features" of Java. Before we get fully underway, however, we should point out that there are some features that we deliberately don’t consider. This is usually because the feature has either been extensively discussed elsewhere, or would require too much work at VM level. For example:

  • No reified generics.

This has been discussed at great length elsewhere, and much of the commentary misunderstands type erasure. In reality, when they say "I don’t like type erasure", many Java developers actually mean "I want List<int>". The issue of primitive specialisation of generics is only tangentially related to type erasure, and run-time visible generics are actually much less useful than Java folk wisdom insists.

  • VM-level unsigned arithmetic.

Java’s lack of support for unsigned arithmetic types was a familiar complaint from developers in the earliest years of the platform, but it represented a deliberate design choice. Choosing to implement only signed arithmetic simplified Java considerably. Introducing unsigned integral types now would be a huge root-and-branch change that could introduce a lot of subtle and hard-to-find bugs, and the risk of destabilizing the platform is just too great.

  • Long indices for arrays.

Again, this feature is simply too deep a change to the JVM, with a broad set of consequences, not least for the behavior and semantics of the garbage collectors. However, it should be noted that Oracle are looking at providing related functionality via a project called VarHandles.
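To make the first two points concrete, here is a minimal sketch; the class and method names are hypothetical. It shows that two differently parameterized lists share a single runtime class (erasure), and that Java 8 and later offer unsigned arithmetic only as static helper methods on the existing signed types, not as new VM-level types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ErasureAndUnsigned {
    // Both lists have the same runtime class because type arguments are erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // Reinterprets the 32 bits of a signed int as an unsigned value (Java 8+ helper).
    static String asUnsigned(int bits) {
        return Integer.toUnsignedString(bits);
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());            // true
        System.out.println(asUnsigned(-1));                // 4294967295
        System.out.println(Integer.divideUnsigned(-1, 2)); // 2147483647
    }
}
```

The unsigned helpers operate on ordinary int and long values, so the type system still cannot tell a "signed" value from an "unsigned" one.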

It’s also worth pointing out that we are not too concerned with the precise details of how a Java syntax for a given feature would work. As Brian Goetz has pointed out on many occasions, discussions of Java’s features tend to over-focus on the syntax, at the expense of thinking deeply about the semantics that a given feature should enable.

Having set some frames of reference, let’s get started and take a look at the first of our missing features.

More Expressive Import Syntax

Java’s import syntax is quite limited. The only two options available to the developer are the import of a single class or of an entire package. This leads to cumbersome multi-line imports if we want just some, but not all, of a package, and necessitates IDE features such as import folding for most large Java source files.

Extended import syntax, allowing multiple classes to be imported from a single package with a single line, would make life a bit simpler:

import java.util.{List, Map};

The ability to locally rename (or alias) a type would improve readability and help reduce confusion between types with the same short class name:

import java.util.{Date : UDate};
import java.sql.{Date : SDate};
import java.util.concurrent.Future;
import scala.concurrent.{Future : SFuture};

Enhanced wildcards would also help, like this:

import java.util.{*Map};

This is a small, but useful, language change and has the benefit that it can be implemented entirely within javac.
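For comparison, this is roughly what the name clash looks like in today's Java, where no aliasing is available. The class name DateClash and the helper toSqlDate are hypothetical; the point is only that one of the two Date types has to be written out in full wherever it is used.

```java
import java.util.Date; // only one of the two Date classes can own the short name

// Hypothetical demo class; the names are illustrative only.
class DateClash {
    // java.sql.Date must be fully qualified because "Date" already means java.util.Date.
    static java.sql.Date toSqlDate(Date utilDate) {
        return new java.sql.Date(utilDate.getTime());
    }

    public static void main(String[] args) {
        Date now = new Date();
        java.sql.Date sqlNow = toSqlDate(now);
        System.out.println(sqlNow.getTime() == now.getTime());
    }
}
```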

Collection Literals

Java has some (albeit limited) syntax for declaring array literals. For example:

int[] i = {1, 2, 3};

This syntax has a number of drawbacks, such as the requirement that array literals only appear in initializers.

Java’s arrays aren’t collections, and the "bridge methods" provided in the helper class Arrays also have some major drawbacks. For example, the Arrays.asList() helper method returns an ArrayList, which seems entirely reasonable, until closer inspection reveals that it is not the usual java.util.ArrayList but rather a private nested class, Arrays.ArrayList. This class is a fixed-size view of the array: it does not implement the optional size-changing methods of List, so familiar methods such as add() and remove() will throw UnsupportedOperationException. The result is an ugly seam in the API, making it awkward to move between arrays and collections.

There’s no real reason the language has to omit a syntax for declaring collection literals; after all, many languages provide a simple syntax for doing so. For example, in Perl, we can write:
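The seam is easy to reproduce. The following sketch uses a hypothetical class AsListSeam to show that the list returned by Arrays.asList() rejects add(), while set() works and writes through to the backing array.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class AsListSeam {
    // Returns true if the list refuses to grow.
    static boolean addIsRejected(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] array = {"a", "b", "c"};
        List<String> view = Arrays.asList(array);

        System.out.println(view.getClass().getName()); // java.util.Arrays$ArrayList
        System.out.println(addIsRejected(view));       // true

        view.set(0, "z");             // set() is supported and writes through
        System.out.println(array[0]); // z
    }
}
```

Wrapping the result in new ArrayList<>(...) gives an independent, resizable copy, at the cost of an extra allocation.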

my $primes = [2, 3, 5, 7, 11, 13];
my $capitals = {'UK' => 'London', 'France' => 'Paris'};

and in Scala:

val primes = Array(2, 3, 5, 7, 11, 13)
val m = Map("UK" -> "London", "France" -> "Paris")

Java, unfortunately, does not offer useful collection literals. They have been repeatedly talked about, both for Java 7 and 8, but have never materialised. The case of object literals is also interesting, but much harder to achieve within Java’s type system.
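The closest approximation in Java itself arrived later as library methods, not as syntax. The sketch below assumes Java 9 or later, where the static factory methods List.of and Map.of are available; the class name FactoryMethods is hypothetical. The resulting collections are unmodifiable, so this is a convenience for constants, not a general literal syntax.

```java
import java.util.List;
import java.util.Map;

// Hypothetical demo class; assumes Java 9+ for List.of and Map.of.
class FactoryMethods {
    static List<Integer> primes() {
        return List.of(2, 3, 5, 7, 11, 13);
    }

    static Map<String, String> capitals() {
        // Arguments alternate key, value, key, value.
        return Map.of("UK", "London", "France", "Paris");
    }

    public static void main(String[] args) {
        System.out.println(primes());
        System.out.println(capitals().get("France"));
    }
}
```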

Structural Typing

Java’s type system is famously nominal, to the point of being described as "name obsessed". As all variables are required to be of named types, there is no possibility of having a type that can only be expressed via a definition of its structure. In other languages, such as Scala, it is possible to express a type not by declaring it as implementing an interface (or a Scala trait), but instead simply by asserting that the type must have a particular method. For example:

import scala.language.reflectiveCalls // structural calls are resolved reflectively

def whoLetTheDucksOut(d: {def quack(): String}): Unit = {
  println(d.quack())
}

This will accept any type that possesses a quack() method, regardless of whether there is any inheritance or shared interface relationship between the types.

The use of quack() as an example is not an accident - structural typing can be thought of as related to "duck typing" in languages like Python, but of course in Scala the typing is happening at compile time, due to the flexibility of Scala’s type system in representing types that would be difficult or impossible to express in Java.

As first noted by James Iry, Java’s type system does actually allow a very limited form of structural typing. It is possible to define a local anonymous type that has additional methods, and provided that one of the new methods is called immediately, Java will allow the code to compile.

Unfortunately, the fun stops there, and one "structural" method is all we can call, because there is no type that can be returned from a "structural" method that encodes the additional information that we want. As Iry notes, the structural methods are all valid methods, and are present in the bytecode and for reflective access; they just can’t be represented in Java’s type system. This probably shouldn’t be that surprising, as under the hood, this mechanism is actually implemented by producing an additional class file that corresponds to the anonymous local type.
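A minimal sketch of the trick follows; the class and method names are hypothetical. The first method calls quack() directly on the anonymous class instance expression, which compiles. The second stores the same kind of object in a variable of type Object, at which point quack() is reachable only through reflection.

```java
// Hypothetical demo class; the names are illustrative only.
class OneStructuralCall {
    // Compiles: the static type of the "new Object() {...}" expression is the
    // anonymous class itself, so its extra method is visible for one call.
    static String quackOnce() {
        return new Object() {
            String quack() { return "Quack!"; }
        }.quack();
    }

    // Once assigned to a named type, the extra method is hidden from the
    // compiler, although it is still present on the generated class.
    static boolean quackVisibleReflectively() {
        Object duck = new Object() {
            String quack() { return "Quack!"; }
        };
        // duck.quack(); // would not compile: Object has no quack()
        try {
            return duck.getClass().getDeclaredMethod("quack") != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(quackOnce());
        System.out.println(quackVisibleReflectively());
    }
}
```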

Algebraic Data Types

Java’s generics provide the language with parameterized types, which are reference types that have type parameters. Concrete types can then be created by substituting some actual type for each type parameter. Such types can be thought of as being composed of their "container" type (the generic type) and the "payload" types (the values of the type parameters).

However, some languages support types that are composite, but in a strikingly different manner from Java’s generics (or simple composition to create a new data type). One common example is tuples, but a more interesting case is that of the “sum type”, sometimes referred to as a “disjoint union of types” or a “tagged union”.

A sum type is a single-valued type (variables can only hold one value at a time), but the value can be any valid value of a specified range of distinct types. This holds true even if the disjoint types that can provide values have no inheritance relationship between them. For example, in Microsoft’s F# language we can define a Shape type, instances of which can be either rectangles or circles:

type Shape =
| Circle of int
| Rectangle of int * int

F# is a very different language from Java, but closer to home, Scala has a limited form of these types. The mechanism used is Scala’s sealed types as applied to case classes. A sealed type in Scala is not extensible outside of the current compilation unit. In Java, this would basically be the same as a final class, but Scala takes the file as the basic compilation unit, and multiple top-level public classes can be declared in a single file.

This leads to a pattern where a sealed abstract base class is declared, along with some subclasses, which correspond to the possible disjoint types of the sum type. Scala’s standard library contains many examples of this pattern, including Option[A], which is Scala’s equivalent of Java 8’s Optional<T> type.

In Scala, an Option is either Some or None and the Option type is a disjoint union of the two possibilities.

If we were to implement a similar mechanism in Java, then the restriction that the compilation unit is fundamentally the class would make this feature much less convenient than in Scala, but we can still conceive of ways to make it work. For example, we could extend javac to process a new piece of syntax on any classes we wanted to seal:

final package example.algebraic;

This syntax would indicate that the compiler was to only allow extension of the class bearing a final package declaration within the current directory, and to reject any attempt to extend it otherwise. This change could be implemented within javac, but it would obviously not be completely safe from reflective code without runtime checks. It would also, unfortunately, be somewhat less useful than in Scala, as Java lacks the rich match expressions that Scala provides.
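Without any new syntax, a closed hierarchy can be approximated with an existing idiom: an abstract class with a private constructor, whose only possible subclasses are nested inside it. The sketch below mirrors the F# Shape example; it is an illustration of the idiom under that assumption, not the proposal described above, and the field and method names are hypothetical.

```java
// A closed "sum type" approximated with a private constructor.
abstract class Shape {
    // Private: only classes nested inside Shape can call it, so no outside
    // class can extend Shape.
    private Shape() {}

    abstract double area();

    static final class Circle extends Shape {
        final int radius;
        Circle(int radius) { this.radius = radius; }
        @Override double area() { return Math.PI * radius * radius; }
    }

    static final class Rectangle extends Shape {
        final int width;
        final int height;
        Rectangle(int width, int height) {
            this.width = width;
            this.height = height;
        }
        @Override double area() { return (double) width * height; }
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1), new Rectangle(2, 3) };
        for (Shape s : shapes) {
            System.out.println(s.getClass().getSimpleName() + " area = " + s.area());
        }
    }
}
```

The compiler still does not know that Circle and Rectangle are the only cases, so behavior is expressed here through an abstract method in place of an exhaustive match.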

Dynamic call sites

With version 7, the Java platform added a feature that turns out to be surprisingly useful. The new invokedynamic bytecode was designed to be a general purpose invocation mechanism.

Not only does it allow dynamic languages to run on top of the JVM, but it also allows aspects of the Java language to be extended in previously impossible ways, allowing features such as lambda expressions to be added. The price for this generality is a certain amount of unavoidable complexity, but once this has been understood, invokedynamic is a powerful mechanism.

One limitation of dynamic invocation is quite surprising. Despite introducing this support with Java 7, the Java language does not provide any way to directly access dynamic invocation of methods. The whole point of dynamic dispatch is to allow developers to defer until execution time, and participate in, decisions about which method to call from a given call site.

(Note: The developer should not confuse this type of dynamic binding with the C# keyword dynamic. This introduces an object that dynamically resolves its bindings at runtime, and will fail if the object cannot actually support the requested method calls. Instances of these dynamic objects are indistinguishable from objects at runtime and the mechanism is accordingly rather unsafe.)

While Java does use invokedynamic under the hood to implement lambda expressions, there is no direct access to allow application developers to do execution time dispatch. Put another way, the Java language does not have a keyword or other construct to create general-purpose invokedynamic call sites. The javac compiler simply will not emit an invokedynamic instruction outside of the language infrastructure use cases.

Adding this feature to the Java language would be relatively straightforward. Some sort of keyword, or possibly an annotation, would be needed to indicate it, and it would require additional library and linkage support.
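What Java code can do today is use the java.lang.invoke library API that underpins invokedynamic. The sketch below does not emit an invokedynamic instruction; it only illustrates the idea of a call site whose target is chosen, and later changed, at execution time. The class RetargetableCall and its helper methods are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical demo class; the names are illustrative only.
class RetargetableCall {
    // Looks up a no-argument String method that returns String, e.g. toUpperCase.
    static MethodHandle stringMethod(String name) {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, name, MethodType.methodType(String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes whatever target the call site currently points at.
    static String call(MethodHandle invoker, String arg) {
        try {
            return (String) invoker.invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        MutableCallSite site = new MutableCallSite(stringMethod("toUpperCase"));
        MethodHandle invoker = site.dynamicInvoker();

        System.out.println(call(invoker, "Duck")); // DUCK

        // Relink the same call site to a different method at run time.
        site.setTarget(stringMethod("toLowerCase"));
        System.out.println(call(invoker, "Duck")); // duck
    }
}
```

A language-level feature would let the compiler generate the equivalent linkage directly, without the explicit handle plumbing shown here.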

Glimmers of hope?

The evolution of language design and implementation is the art of the possible, and there are plenty of examples of major upheavals that took aeons to be fully adopted across languages. For example, it was only with C++11 that lambda expressions finally arrived in C++.

Java’s pace of change is often criticised, but one of James Gosling’s guiding principles was that if a feature was not fully understood, it should not be implemented. Java’s conservative design philosophy has arguably been one of the reasons for its success, but it has also attracted a lot of criticism from younger developers impatient for faster change within the language. Is there ongoing work that might deliver some of the missing features discussed here? The answer is, perhaps, a cautious maybe.

The mechanism by which some of these ideas may be realized is one that we have referred to before - invokedynamic. Recall that the idea behind it is to provide a generalised invocation mechanism that is deferred until execution time. The recent enhancement proposal JEP 276 offers the possibility of standardising a library called Dynalink. This library, created by Attila Szegedi during his tenure at Twitter, was originally proposed as a way to implement “meta-object protocols” in the JVM. It was adopted by Oracle when Szegedi joined the firm and was used extensively in the internals of the Nashorn implementation of JavaScript on the JVM. JEP 276 now proposes to standardise this and make it available as an official API for all languages on the JVM. An overview of Dynalink is available on GitHub, but the library has moved on significantly since those resources were written.

Essentially, Dynalink provides a general way to talk about object-oriented operations, such as “get value of property”, “set value of property”, “create new object”, “call method” without requiring that the semantics of those operations be fulfilled by the corresponding, static-typed, low-level operations of the JVM.

This opens the door to using this linking technology to implement dynamic linkers with different behavior from that of the standard Java linker. It can also be used as a sketch of how new type system features could be implemented in Java.

In fact, this mechanism has already been evaluated by some Scala core developers, as a possible replacement mechanism for implementing structural types in Scala. The current implementation is forced to rely upon reflection, but the arrival of Dynalink could change all that.
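To give a feel for what a reflection-based structural call involves, here is a hypothetical Java sketch. It is not the code that the Scala compiler generates; it only shows the general shape of looking up a method by name on whatever object arrives and invoking it, with no shared interface between the types.

```java
import java.lang.reflect.Method;

// Hypothetical demo class; the names are illustrative only.
class ReflectiveQuack {
    // Two unrelated types that happen to share a quack() method.
    public static class Duck {
        public String quack() { return "Quack"; }
    }

    public static class Robot {
        public String quack() { return "Beep"; }
    }

    // Accepts any object with a public, no-argument quack() returning String.
    static String whoLetTheDucksOut(Object d) {
        try {
            Method quack = d.getClass().getMethod("quack");
            return (String) quack.invoke(d);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no usable quack() method", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(whoLetTheDucksOut(new Duck()));
        System.out.println(whoLetTheDucksOut(new Robot()));
    }
}
```

Unlike the Scala version, a missing method is discovered here only at run time, and every call pays for a reflective lookup unless the Method object is cached.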

About the Author

Ben Evans is the CEO of jClarity, a Java/JVM performance analysis startup. In his spare time he is one of the leaders of the London Java Community and holds a seat on the Java Community Process Executive Committee. His previous projects include performance testing the Google IPO, financial trading systems, writing award-winning websites for some of the biggest films of the 90s, and others.