In this interview, Bernhard Rosenkränzer, a Linaro engineer, explains how they improved the performance of Android 4.0.4 by 20% to 500%.
Linaro has improved the performance of stock AOSP Android 4.0.4 by anywhere from 20-30% to 500%, depending on the type of benchmark used. The greatest improvements came from optimizing string operations in the libc library and from using a different compiler and different flags than those Google uses to generate the Android binaries. At Linaro Connect Q2.12 in Hong Kong, Linaro demonstrated its build of Android running against the stock version on a Pandaboard-based device with a TI OMAP4430. This specific board was used because it supports both versions of Android, but the tests could be repeated on other devices if necessary. In the benchmark shown, Linaro Android ran 3D graphics at 60 FPS while stock Android ran at 30 FPS.
Following is an interview with Bernhard Rosenkränzer, a Linaro engineer working on Android, explaining how they improved Android.
InfoQ: What performance improvement numbers are we talking about?
BR: As usual, that depends strongly on the exact thing you're trying to do. Obviously there won't be much of a performance improvement in running hand-crafted assembly code that doesn't make calls to shared libraries - but on the other hand, if you created a benchmark that relies heavily on memchr() and strlen(), you could easily show the Linaro build being 5+ times faster than stock Android. In real world applications, speedups of 20%-30% are realistic.
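Editor’s Note: A string-heavy micro-benchmark of the kind described in this answer could look roughly like the C sketch below. It is not Linaro's benchmark. The helper names (make_buffer, bench_strlen, bench_memchr) and any buffer size or iteration count passed to them are assumptions made for illustration. The sketch times repeated strlen() and memchr() calls with the standard clock() function, and calls them through volatile function pointers so that the compiler cannot fold or hoist the libc calls out of the loops.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: allocate a NUL-terminated buffer holding `len` 'a' bytes. */
static char *make_buffer(size_t len)
{
    char *buf = malloc(len + 1);
    if (buf == NULL)
        return NULL;
    memset(buf, 'a', len);
    buf[len] = '\0';
    return buf;
}

/* Calling through volatile pointers forces a real libc call on every iteration. */
static size_t (*volatile strlen_fn)(const char *) = strlen;
static void *(*volatile memchr_fn)(const void *, int, size_t) = memchr;

/* Run `iterations` strlen() calls on `buf`. Returns the CPU time in seconds
 * and stores the sum of all returned lengths in *total. */
static double bench_strlen(const char *buf, int iterations, size_t *total)
{
    size_t sum = 0;
    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        sum += strlen_fn(buf);
    clock_t end = clock();
    *total = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Run `iterations` memchr() searches for 'z', which make_buffer() never
 * writes, so each call scans all `len` bytes. Returns the CPU time in
 * seconds and stores the number of successful searches in *hits. */
static double bench_memchr(const char *buf, size_t len, int iterations,
                           size_t *hits)
{
    size_t found = 0;
    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        if (memchr_fn(buf, 'z', len) != NULL)
            found++;
    clock_t end = clock();
    *hits = found;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Running the same calls on the same device against two different libc builds would give a comparison of the kind mentioned above. The absolute timings depend on the hardware, the buffer size, and the resolution of clock().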
InfoQ: I saw 60 fps for some tests against 30 fps for stock Android. What benchmarks were those?
BR: That was the 3D part of 0xbench running on a Pandaboard on a 720p screen. This is admittedly not the best benchmark out there - but it is one that shows the improvement, and that looks fancy while doing it. With lots of non-technical people in the audience, we obviously had to show something other than result numbers.
InfoQ: What exactly have you enhanced?
BR: A couple of different things - in terms of actual code modifications, Bionic (Android's libc) is what we've changed most - we've replaced its core string handling functions with equivalent functions optimized specifically for ARMv7 CPUs, and in particular for the Cortex A9.
We've rebuilt the entire system using the Linaro toolchain (based on gcc 4.7), and with different compiler flags (the stock Android build system doesn't seem to have a provision to list per-device compiler flags, so it doesn't make use of -mcpu=, -mtune= etc.). Android also uses -fno-strict-aliasing by default (turning off optimizations that rely on code following C/C++ aliasing rules) – we turned that flag off, and fixed code to comply with aliasing rules all over the Android source tree. (In a couple of subdirectories, we decided it wasn't worth the hassle and just put -fno-strict-aliasing back for those directories only, without penalizing the rest of the OS for some noncompliant code.)
We've also updated the kernel, and tweaked its options. There's still some room for improvement there, though: our kernels are built with full debugging support, performance counters etc. Some of those options come at a performance price tag, and should obviously be turned off when truly optimizing for speed instead of building a development platform.
One thing that may be interesting to note is also what we did not do (some people were speculating that the improvement in fps was related to turning off vsync in the driver): We didn't modify any drivers. The improvements are not tied to a particular hardware platform (we used a Pandaboard in the demo because it happens to be the only board that is supported by both stock AOSP and Linaro Android - for anything else, there's nothing to compare against) and will have a similar effect on all other boards Linaro Android runs on, such as the Origen, Snowball and iMX6.
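Editor’s Note: To illustrate the aliasing cleanup mentioned in this answer, the C sketch below shows one common kind of fix. It is a generic example, not code taken from the Android tree, and the function name float_bits is an assumption. The non-compliant pattern appears only in a comment; the compliant version copies the bytes with memcpy(), which is well-defined under the C aliasing rules and which compilers typically reduce to a plain register move.

```c
#include <stdint.h>
#include <string.h>

/* This sketch assumes a 32-bit float, as on ARM. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/*
 * Non-compliant pattern (shown only as a comment): reading a float object
 * through a uint32_t lvalue breaks the aliasing rules, so it is only
 * dependable when the file is built with -fno-strict-aliasing.
 *
 *     uint32_t bits = *(uint32_t *)&f;
 */

/* Compliant version: copy the object representation instead of casting
 * the pointer. Returns the raw bit pattern of `f`. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Code written this way stays correct whether or not -fno-strict-aliasing is passed, which is what allows the flag to be dropped for the rest of a source tree.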
InfoQ: Are these enhancements going to be pushed into the stock Android?
BR: We are going to submit them to AOSP when they've had a bit more testing. Of course, whether or not they end up being used in the next Android version is not our decision. Some Android modders [CyanogenMod] have already started picking up our changes into their trees.
InfoQ: Will the enhancements be used only by Linaro founding companies for their own Android-based devices?
BR: That, too, is ultimately not under our control - we're making the code available to them, whether or not (or when) they start using it is their decision.
InfoQ: Have you heard anything from Google on this?
BR: Outside of a Google engineer giving a +1 to a related post on Google Plus, no.
InfoQ: Do you cooperate with Google at all?
BR: Yes, we've submitted patches to make Android compile with gcc 4.6 and gcc 4.7 before. Some were accepted, some were dropped because they had already fixed the problem differently in their internal tree.
InfoQ: Are there other Android-related projects like this that Linaro is working on?
BR: We release Android builds for our member companies' development boards every month (and snapshots every day) - they can be found at https://android-build.linaro.org/ and will generally contain all improvements we're making.
Other than general performance tweaks, we also do platform enablement (integrating drivers etc.), and make more development tools available - Linaro Android includes, for example, busybox, perf, lrzsz and gator (needed to run ARM's DS-5, a graphical debug/performance measuring tool).
Editor’s Note: Linaro is a non-profit organization founded by ARM, Freescale, IBM, Samsung, Ericsson and TI to consolidate and optimize open source software for the ARM platform.