Fedora Core 4 is the first Fedora release to include a substantial amount of code written in the Java programming language. These additions were made possible by improvements in GNU Classpath and GNU gcj.
GCJ Basics
First, GNU gcj is not Java.
Nevertheless, gcj aims to implement a complete system, compatible with Java, centered around an ahead-of-time compiler. It has a cleanroom class library based on GNU Classpath, and a built-in interpreter. The compiler can compile Java source files, class files, or even entire jar files to object code.
Historically gcj has taken a "radically traditional" approach of treating Java as if it were a somewhat unusual dialect of C++. This yields many nice results, but unfortunately the runtime linking models are too different -- and thus this approach breaks when faced with large, complex Java applications, particularly those that come with sophisticated class loading infrastructures.
For the GCC 4.0 release we implemented a new compilation mode for gcj, called the Binary Compatibility ABI. This compilation approach lets us defer all linking to runtime and fully implement Java's binary compatibility specification -- exactly what is needed to let precompiled code interact properly with class loading.
We also added a class mapping database. At runtime, whenever a class is defined, the gcj runtime (called "libgcj") will look for this class in this database. If the class is found (note that the *contents* of the class are used -- not merely the name), then libgcj will map in a shared library containing the compiled version of the class.
These two changes, taken together, let us do something quite powerful: we can compile Java programs ahead-of-time without requiring any application-level changes. Furthermore, due to a novel approach to bytecode verification, we're also able to ensure runtime type safety of the compiled code.
Building an RPM
Building your Java program on Fedora Core is straightforward -- your existing build infrastructure should work as-is. Fedora Core ships 'ant' and uses the Java compiler from Eclipse to convert your Java programs to bytecode.
Describing how to write a source RPM is outside the scope of this article, but the Fedora RPM Guide has useful general information and JPackage has some Java-specific guidelines.
Once your program is compiled to bytecode, you can compile it to native code. Because gcj does not yet include a JIT, this is the way to get reasonable performance. In some cases the result may outperform existing JITs, due to gcj's use of shared libraries... you can expect to see the biggest difference when simultaneously running many instances of your application.
Fedora provides two programs which make it very simple to natively compile your package. These are used when building an RPM.
The first program is 'aot-compile-rpm'. This searches for jar files and compiles each to a shared library, using gcj. aot-compile-rpm knows a few useful gcj-specific tricks, such as splitting large jar files into pieces before compilation (compiling a large jar at one go can use an enormous amount of memory), and using -Bsymbolic when linking the resulting shared library (this results in a runtime performance improvement).
A substitute for this program, in case you are not building an RPM, is to simply compile each jar file in your program to a shared library. Here we show the simplest approach (as mentioned earlier, for a large jar this might be quite slow):
gcj -fjni -findirect-dispatch -fPIC -shared \
-Wl,-Bsymbolic -o foo.jar.so
foo.jar
Breaking it down:
- -fjni tells gcj that native methods in your Java code are implemented using JNI.
- -findirect-dispatch tells gcj to use the binary compatibility ABI.
- -fPIC and -shared are required when building a shared library
- -Wl,-Bsymbolic tells the linker to bind references within the shared library, when possible.
The second useful program supplied with Fedora Core is rebuild-gcj-db. This program is used when installing or uninstalling an RPM to update the global class mapping database, and should be run in your RPM's %post and %postun sections.
rebuild-gcj-db operates by convention -- it assumes that each package installs its own class mapping database somewhere beneath /usr/lib/gcj (/usr/lib64/gcj for a 64 bit package on a multi-arch OS -- there are RPM macros to abstract out this oddity). Then it simply loops over all these individual databases and merges them into the global database which is used at runtime.
Note that it isn't always possible to compile a Java program using gcj. Gcj's class library is not quite complete, and on occasion a program will use some APIs which have not yet been implemented. For instance, Swing is currently under active development. Also, despite Sun's warning, some packages use private com.sun.* or sun.* APIs -- and generally speaking gcj does not implement these.
What Uses This?
Fedora Core 4 uses gcj compilation for a number of programs.
- First, Fedora Core includes ant and ant's many dependencies, in order to compile and run other Java programs. It also includes the Eclipse java compiler.
- Tomcat and its dependencies.
- The Java code in OpenOffice.
- The Eclipse IDE and several Eclipse plugins, such as the CDT.
Debuting in Fedora Core 5:
- RSSOwl, an RSS reader.
- Azureus, an awesome BitTorrent client.
- Frysk, an execution analysis technology for monitoring running processes or threads.
The JOnAS application server, a J2EE implementation, is also working but has not yet gone through the Fedora Extras review process.
Future Work
This past year the GNU Classpath community has made great strides toward completion, and we expect to continue this through 2006. We have an API comparison page which is updated regularly; you can track our API status here.
We're also working on some core improvements to gcj: updating the compiler, runtime, and class libraries to Java 5.
Finally, we're implementing the Java security infrastructure in libgcj. This will let us ship a Mozilla plugin and netx , a Java Web Start implementation.
Bio
Tom Tromey graduated from Caltech in 1990. He works on the GNU Java compiler and runtime for Red Hat. He wrote GNU Automake.