Computer scientists emphasize the value of conciseness of expression in problem solving. Unix pioneer Ken Thompson once famously stated, “One of my most productive days was throwing away 1000 lines of code.” This is a worthy goal on any software project requiring ongoing support and maintenance, yet can be lost by a focus on software development metrics like lines-of-code. Early Lisp contributor Paul Graham went as far as to equate succinctness in a programming language with its power. This notion of power has made the ability to write compact, simple code a primary criterion for language selection in many modern software projects.
Any program can be made shorter by refactoring it to remove superfluous code or extraneous filler like whitespace, but certain languages are inherently expressive and particularly well suited for writing short programs. With this quality in mind, Perl programmers popularized code golf competitions, in which the goal is to use the least amount of code possible to solve a particular problem or implement a specific algorithm. The APL language was designed to use special graphical symbols that allowed programmers to write powerful programs with tiny amounts of code. Such programs, when properly implemented, map well to standard mathematical representations. Terse languages can be very effective for quickly creating small scripts, particularly when used in clearly delineated problem domains where their brevity does not obscure their purpose.
Java has a reputation for being verbose relative to other programming languages. This is partially due to established practices in the programming community, which in many instances allow for a greater degree of descriptiveness and control when performing a task. For example, long variable names can make a large codebase more readable and maintainable over the long term. Descriptive class names generally map to file names, which immediately clarify where new functionality should be added to an existing system. When used consistently, descriptive names can greatly simplify searching for text indicating a particular functionality within an application. These practices have contributed to Java’s great success in large-scale implementations with large, complex code bases.
Conciseness is preferred in smaller projects, and some languages are well-suited for writing short scripts or for interactive exploratory programming at a prompt. Java is extremely useful as a general-purpose language for writing cross-platform utilities. In such situations, the use of “verbose Java” doesn’t necessarily provide additional value. Although code style can be altered in areas such as the naming of variables, certain fundamental aspects of the Java language have historically required the use of more characters to accomplish a task than comparable code in other programming languages. In response to such limitations, the language has been updated over time to include features typically classified as “syntactic sugar.” These idioms allow the same functionality to be expressed with fewer characters. Such idioms are preferable to their more verbose counterparts and have generally been quickly adopted into common usage by the programming community.
This article will highlight practices for writing concise Java code, with a special focus on the new functionality available in JDK 8. Shorter, more elegant code is possible due to the inclusion of Lambda Expressions in the language. This is especially evident when processing collections using the new Java Streaming API.
Verbose Java
Java’s reputation for verbosity is partially due to its implementation style of object-orientation. The classic example of a “Hello World” program can be implemented in many languages in a single line of code containing fewer than 20 characters. In Java this requires a main method within a class definition, which contains a method call to write the string using System.out.println(). Even with only the requisite method qualifiers, brackets and semicolons, the minimal “Hello World” program with all whitespace removed tops out at 86 characters. With spacing and a bit of indentation added for readability, the “Hello World” program provides an inarguably wordy first impression.
Java’s verbosity is partially due to community standards that opt for descriptiveness over brevity. It is trivial to opt for different standards related to code format aesthetics in this regard. In addition, repeated statements and sections of boilerplate code can be wrapped in methods that can be incorporated into APIs. Refactoring a program with an eye towards brevity can greatly simplify it without sacrificing accuracy or clarity.
Java’s reputation for verbosity is at times skewed by a plethora of old code examples. Many books have been written about Java over the years. Since Java has been around since the beginnings of the world wide web, many online resources provide snippets from the earliest versions of the language. But Java has matured over the years in response to perceived deficiencies, and so even accurate and well implemented examples might not take advantage of later language idioms and APIs.
Java’s design goals specified that it be object-oriented, familiar (which at that time meant using C++ style syntax), robust, secure, portable, threaded and highly performant. Brevity was not a goal. Functional languages provide terse alternatives to comparable tasks implemented using an object-oriented syntax. Lambda Expressions added in Java 8 open the door to functional programming idioms which alter the appearance of Java and reduce the amount of code needed to perform many common tasks.
Functional Programming
Functional Programming makes the function the central construct for programmers. This allows functions to be used in a very flexible manner such as passing them as arguments. Based on this capability Java lambda expressions enable you to treat functionality as method arguments or code as data. A lambda expression can be thought of as an unnamed method independent of any specific class association. There is a rich and fascinating mathematical basis for these ideas.
Functional programming and lambda expressions can be perceived as abstract, esoteric concepts. A programmer chiefly concerned with tackling a task in industry might have no interest in catching up on the latest computational trends. With the introduction of lambdas into Java, however, developers need to understand these new features at least well enough to read programs written by other developers. There are also practical benefits: a functional style can influence the design of concurrent systems in ways that result in better performance. Of immediate interest in this article is how these mechanisms can be used to craft short yet clear code.
There are several reasons lambda expressions produce briefer code. Fewer local variables are used, reducing the clutter required to declare and set them. Loops can be replaced with method calls, reducing three or more lines to a single line of code. Code traditionally expressed in nested loops and conditional statements can be expressed in a single method. Implemented as fluent interfaces, methods can be chained together in a manner analogous to Unix piping. The net effect of writing code in a functional style is not limited to readability. Such code can avoid maintaining state and be free of side effects. Such code has the added benefit of being easily parallelized for more efficient processing.
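As a rough illustration of these points, the sketch below writes the same small task twice: once with a loop, a conditional and a result variable, and once as a chained stream pipeline. The class name, method names and sample words are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LoopVersusStream {

    // Imperative style: a result variable, a loop, and a conditional.
    static List<String> shortWordsLoop(List<String> words) {
        List<String> result = new ArrayList<>();
        for (String w : words) {
            if (w.length() < 4) {
                result.add(w.toUpperCase());
            }
        }
        return result;
    }

    // Functional style: the same steps chained together as a pipeline.
    static List<String> shortWordsStream(List<String> words) {
        return words.stream()
                .filter(w -> w.length() < 4)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("the", "quick", "brown", "fox");
        System.out.println(shortWordsLoop(words));   // [THE, FOX]
        System.out.println(shortWordsStream(words)); // [THE, FOX]
    }
}
```

Both methods return the same list. The stream version has no mutable local state, which is what makes such pipelines candidates for parallel execution.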
Lambda Expressions
The syntax related to lambda expressions is straightforward, but is unlike idioms seen in previous versions of Java. A lambda expression is made up of three parts: an argument list, an arrow, and a body. The parentheses around the argument list may be omitted when there is exactly one argument. A related operator consisting of a double colon has also been added that can further reduce the amount of code required for certain lambda expressions. This is known as a method reference.
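The following sketch collects the syntactic forms just described in one place. The class and field names are hypothetical, and the functional interfaces from java.util.function are used only as convenient targets for the lambdas.

```java
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;

class LambdaForms {

    // Empty argument list: the parentheses are required.
    static final Supplier<String> GREETING = () -> "Hi!";

    // Single argument: the parentheses may be omitted.
    static final Function<String, Integer> LENGTH = s -> s.length();

    // The same function written as a method reference.
    static final Function<String, Integer> LENGTH_REF = String::length;

    // Two arguments, with a block body and an explicit return.
    static final BinaryOperator<Integer> LARGER = (a, b) -> {
        return a >= b ? a : b;
    };

    public static void main(String[] args) {
        System.out.println(GREETING.get());           // Hi!
        System.out.println(LENGTH.apply("cccc"));     // 4
        System.out.println(LENGTH_REF.apply("cccc")); // 4
        System.out.println(LARGER.apply(3, 7));       // 7
    }
}
```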
Thread Creation
In this example, a Runnable (the task that a thread executes) is created and run. The lambda expression appears on the right side of the assignment operator and specifies an empty argument list; its body simply writes a message to standard out when run() is called. Calling run() directly executes the task on the current thread.
Runnable r1 = () -> System.out.print("Hi!");
r1.run();
Argument List | Arrow | Body |
() | -> | System.out.print("Hi!") |
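For comparison, here is a sketch of how the same Runnable would be written before Java 8 as an anonymous inner class, next to the lambda form, with the lambda handed to an actual Thread. The class name, method names and the StringBuilder used to collect output are assumptions made for this illustration.

```java
class ThreadExample {

    // Pre-Java 8 form: an anonymous inner class implementing Runnable.
    static Runnable anonymous(StringBuilder out) {
        return new Runnable() {
            @Override
            public void run() {
                out.append("Hi!");
            }
        };
    }

    // Java 8 form: the same behavior expressed as a lambda.
    static Runnable lambda(StringBuilder out) {
        return () -> out.append("Hi!");
    }

    public static void main(String[] args) throws InterruptedException {
        StringBuilder out = new StringBuilder();
        anonymous(out).run();               // runs on the calling thread
        Thread t = new Thread(lambda(out));
        t.start();                          // runs on a new thread
        t.join();                           // wait for the new thread to finish
        System.out.println(out);            // Hi!Hi!
    }
}
```

The anonymous class needs six lines to say what the lambda says in one, which is the kind of reduction the article is concerned with.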
Processing Collections
One of the primary places where the presence of Lambdas will be noticed by many developers is in relation to the Collections API. Consider a list of Strings that we wish to sort by their length.
java.util.List<String> l;
l= java.util.Arrays.asList(new String[]{"aaa", "b", "cccc", "DD"});
A lambda expression can be created to implement this functionality.
java.util.Collections.sort(l, (s1, s2) ->
    new Integer(s1.length()).
        compareTo(s2.length()));
This example includes two arguments which are passed to the body of the lambda so that their lengths can be compared.
Argument List | Arrow | Body |
(s1, s2) | -> | new Integer(s1.length()).compareTo(s2.length()) |
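The comparison can be written more compactly still. The sketch below shows two alternatives that are not part of the original listing: Integer.compare, which avoids creating an Integer object, and Comparator.comparingInt combined with a method reference. The class and method names are hypothetical, and the input list is copied so the caller's list is left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortByLength {

    // Lambda using Integer.compare on the two lengths.
    static List<String> withLambda(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort((s1, s2) -> Integer.compare(s1.length(), s2.length()));
        return copy;
    }

    // Comparator built from a method reference that extracts the sort key.
    static List<String> withMethodReference(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        List<String> l = Arrays.asList("aaa", "b", "cccc", "DD");
        System.out.println(withLambda(l));          // [b, DD, aaa, cccc]
        System.out.println(withMethodReference(l)); // [b, DD, aaa, cccc]
    }
}
```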
There are several alternatives available to operate on each element in a list without resorting to standard “for” or “while” loops. Comparable semantics can be achieved by passing a lambda to the collection’s “forEach” method. In that case, no parentheses are needed around the single argument passed.
Argument List | Arrow | Body |
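A minimal sketch of such a forEach call follows. The argument name e, the class name and the helper method are assumptions made for illustration; only the list contents come from the earlier sorting example.

```java
import java.util.Arrays;
import java.util.List;

class ForEachExample {

    // Builds one line per element using forEach instead of a for loop.
    static String linesOf(List<String> items) {
        StringBuilder sb = new StringBuilder();
        // Single argument, so no parentheses are needed around e.
        items.forEach(e -> sb.append(e).append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> l = Arrays.asList("aaa", "b", "cccc", "DD");
        // Prints each element on its own line.
        l.forEach(e -> System.out.println(e));
    }
}
```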
This particular example can be further shortened using a method reference, in which the double colon separates the target object (System.out) from the name of the method to call (println). Each element is passed to the println method in turn.
l.forEach(System.out::println);
The java.util.stream package is new to Java 8 and uses syntax familiar to functional programmers to process collections. Its summary explains its contents as
“Classes to support functional-style operations on streams of elements, such as map-reduce transformations on collections.”
The class diagram that follows provides an overview of the package with an emphasis on functionality that will be exercised in a subsequent example. The package structure lists a number of Builder classes. Such classes are common with fluent interfaces that allow methods to be chained together into a pipelined set of operations.
Although string parsing and collection manipulation is simple, it has many practical real-world applications. Sentences need to be segmented into separate words when doing Natural Language Processing (NLP). Bioinformatics represents macromolecules like DNA and RNA as Nucleobases consisting of letters such as C, G, A, T, or U. In each problem domain, Strings are broken down and constituent parts are manipulated, filtered, counted and sorted. So although the example contains very simple use cases, the concepts are generalizable to a wide variety of meaningful tasks.
The example code parses a String containing a sentence and counts the number of words and letters of interest. The complete listing is just under 70 lines of code including whitespace.
1. import java.util.*;
2.
3. import static java.util.Arrays.asList;
4. import static java.util.function.Function.identity;
5. import static java.util.stream.Collectors.*;
6.
7. public class Main {
8.
9.     public static void p(String s) {
10.         System.out.println(s.replaceAll("[\\]\\[]", ""));
11.     }
12.
13.     private static List<String> uniq(List<String> letters) {
14.         return new ArrayList<String>(new HashSet<String>(letters));
15.     }
16.
17.     private static List<String> sort(List<String> letters) {
18.         return letters.stream().sorted().collect(toList());
19.     }
20.
21.     private static Map<String, Long> uniqueCount(List<String> letters) {
22.         return letters.stream().
23.             collect(groupingBy(identity(), counting()));
24.     }
25.
26.     private static String getWordsLongerThan(int length, List<String> words) {
27.         return String.join(" | ", words
28.             .stream().filter(w -> w.length() > length)
29.             .collect(toList())
30.         );
31.     }
32.
33.     private static String getWordLengthsLongerThan(int length, List<String> words)
34.     {
35.         return String.join(" | ", words
36.             .stream().filter(w -> w.length() > length)
37.             .mapToInt(String::length)
38.             .mapToObj(n -> String.format("%" + n + "s", n))
39.             .collect(toList()));
40.     }
41.
42.     public static void main(String[] args) {
43.
44.         String s = "The quick brown fox jumped over the lazy dog.";
45.         String sentence = s.toLowerCase().replaceAll("[^a-z ]", "");
46.
47.         List<String> words = asList(sentence.split(" "));
48.         List<String> letters = asList(sentence.split(""));
49.
50.         p("Sentence : " + sentence);
51.         p("Words : " + words.size());
52.         p("Letters : " + letters.size());
53.
54.         p("\nLetters : " + letters);
55.         p("Sorted : " + sort(letters));
56.         p("Unique : " + uniq(letters));
57.
58.         Map<String, Long> m = uniqueCount(letters);
59.         p("\nCounts");
60.
61.         p("letters");
62.         p(m.keySet().toString().replace(",", ""));
63.         p(m.values().toString().replace(",", ""));
64.
65.         p("\nwords");
66.         p(getWordsLongerThan(3, words));
67.         p(getWordLengthsLongerThan(3, words));
68.     }
69. }
Sample output from running the program:
Sentence : the quick brown fox jumped over the lazy dog
Words : 9
Letters : 44
Letters : t, h, e, , q, u, i, c, k, , b, r, o, w, n, , f, o, x, , j, u, m, p, e, d, , o, v, e, r, , t, h, e, , l, a, z, y, , d, o, g
Sorted : , , , , , , , , a, b, c, d, d, e, e, e, e, f, g, h, h, i, j, k, l, m, n, o, o, o, o, p, q, r, r, t, t, u, u, v, w, x, y, z
Unique : , a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, t, u, v, w, x, y, z
Counts
letters
a b c d e f g h i j k l m n o p q r t u v w x y z
8 1 1 1 2 4 1 1 2 1 1 1 1 1 1 4 1 1 2 2 2 1 1 1 1 1
words
quick | brown | jumped | over | lazy
5 | 5 | 6 | 4 | 4
The code has been shortened in several different ways. Not all are possible in every version of Java, and not all are consistent with generally accepted style guides. Consider how this output would be obtained in earlier versions of Java. Several local variables would have been created to temporarily store data or serve as indexes. Numerous conditional statements and loops would be required to tell Java how to process the data. The newer functional approach focuses on what data is needed, and does not require attention to temporary variables, nested loops, index management or conditional statements.
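To make the contrast concrete, the sketch below counts letters twice: once in a style available before Java 8, and once with a stream collector. It is an illustration rather than part of the listing above. The class and method names are hypothetical, and a TreeMap is used here (an assumption, unlike the listing's default map) so that both versions print their keys in the same sorted order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class LetterCounts {

    // Earlier style: an explicit loop, a temporary variable and a conditional.
    static Map<String, Long> countWithLoop(List<String> letters) {
        Map<String, Long> counts = new TreeMap<>();
        for (String letter : letters) {
            Long current = counts.get(letter);
            if (current == null) {
                counts.put(letter, 1L);
            } else {
                counts.put(letter, current + 1);
            }
        }
        return counts;
    }

    // Java 8 style: state what is wanted (groups and their sizes).
    static Map<String, Long> countWithStream(List<String> letters) {
        return letters.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("hello".split(""));
        System.out.println(countWithLoop(letters));   // {e=1, h=1, l=2, o=1}
        System.out.println(countWithStream(letters)); // {e=1, h=1, l=2, o=1}
    }
}
```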
In some instances, standard Java syntax available since the earliest versions of the language was used to shorten the code at the expense of clarity. For instance, the import statement in line 1 uses a wildcard to reference all classes in java.util rather than naming each class individually. The call to System.out.println is wrapped in a method named p so that each invocation uses a shorter name (lines 9-11). These changes are controversial because they violate some Java coding standards, but programmers from other backgrounds would not necessarily view them with any concern.
In other cases, we take advantage of features that were not available in the earliest versions of the language, but have been available since before JDK 8. Static imports (lines 3-5) are used to reduce the number of class references needed inline. Regular expressions (lines 10 and 45) effectively hide looping and conditionals in a manner unrelated to functional programming per se. These idioms, particularly regular expressions, are often criticized for being difficult to read and interpret. Used judiciously, they reduce noise and limit the amount of code that a developer needs to read and interpret.
Finally, the code takes advantage of the new JDK 8 streaming API. A number of methods available in the streaming API are used to filter, group and process the lists (lines 17-40). Though their association with enclosing classes is clear within an IDE, it is less obvious unless you are already conversant with the API. This list explains where each of the method calls that appear in the code originates.
Method | Fully Qualified Method Reference |
stream() | java.util.Collection.stream() |
sorted() | java.util.stream.Stream.sorted() |
collect() | java.util.stream.Stream.collect() |
toList() | java.util.stream.Collectors.toList() |
groupingBy() | java.util.stream.Collectors.groupingBy() |
identity() | java.util.function.Function.identity() |
counting() | java.util.stream.Collectors.counting() |
filter() | java.util.stream.Stream.filter() |
mapToInt() | java.util.stream.Stream.mapToInt() |
mapToObj() | java.util.stream.IntStream.mapToObj() |
The uniq() (line 13) and sort() (line 17) methods mirror the Unix utilities of the same name. sort() introduces the first call to a stream, which is sorted and then collected into a List. uniqueCount() (line 21) is analogous to uniq -c and returns a map in which each key is a character and each value is the number of times that character appears. The two “getWords” methods (lines 26 and 33) keep only the words longer than a given length. In getWordLengthsLongerThan(), additional method calls format the lengths and convert the results into a final String.
The code does not introduce any new concepts related to lambda expressions. The syntax introduced earlier is simply applied to specific use with the Java streams API.
Conclusion
The idea of writing less code to accomplish a given task is consistent with Einstein’s idea to “make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.” This has been more popularly quoted as “Make things as simple as possible, but not simpler.” Lambda expressions and the new streams API are often highlighted because they open new possibilities for writing simplified code that scales well. They contribute to the programmer’s ability to properly simplify code to its best possible representation.
Functional programming idioms are shorter by design, and with a little thought, there are many cases where Java code can be profitably made more succinct. The new syntax is unfamiliar but not overly complex. These new features clearly demonstrate that Java has moved far beyond its original goals as a language. It is now embracing some of the best functionality available to other programming languages and integrating them as its own.
About the Author
Casimir Saternos has worked as a Software Developer, Database Administrator and Software Architect over the past 15 years. He has recently written and created a screencast on the R programming language. His articles on Java and Oracle Technologies have appeared in Java Magazine and on the Oracle Technology Network. He is the author of Client-Server Web Apps with JavaScript and Java available from O'Reilly Media.