Apple has improved the speed of Nitro, Safari’s JavaScript engine, by 35% by converting JavaScript into LLVM IR code that is then subjected to heavy optimization.
According to a blog post on webkit.org, WebKit has until now had three tiers of optimization for its internal JavaScript bytecode, each chosen at run time by striking a balance between the time needed to optimize a section of code and the benefit gained from doing so:
- LLInt (Low Level Interpreter) – a bytecode interpreter rather than a compiler, so it performs very little optimization. Every function starts out in LLInt; if a statement in it is executed more than 100 times, or the function itself is called more than 6 times, the function is handed to the next tier, the Baseline JIT.
- Baseline JIT – a simple JIT whose generated code runs faster than LLInt, but without heavy optimizations. Again, if a statement is executed more than 1,000 times or the function is called more than 66 times, compilation is passed up to the next tier, the DFG JIT. The switch can happen right after executing a statement, in the middle of a running function, using On-Stack Replacement (OSR). A sketch of this counter-driven tiering follows the list.
- DFG JIT (Data Flow Graph JIT) – until now this compiler has been responsible for Safari’s peak performance, but because its optimizations take time it is applied only to the sections of code that consume the most CPU.
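To make the tier-up thresholds above concrete, here is a minimal sketch in C++ of how per-function and per-statement execution counters could drive those decisions. The names (FunctionProfile, tierFor) are invented for illustration; this is not WebKit’s actual implementation, only the counting logic implied by the numbers quoted above.

```cpp
// Hypothetical sketch of counter-driven tier-up, using the thresholds
// quoted in the article (not WebKit source code).
#include <cstdint>

enum class Tier { LLInt, BaselineJIT, DFGJIT };

struct FunctionProfile {
    Tier tier = Tier::LLInt;
    uint32_t callCount = 0;             // times the function has been invoked
    uint32_t hottestStatementCount = 0; // executions of its most-run statement
};

// Consulted on function entry and loop back-edges by the profiling tiers.
Tier tierFor(FunctionProfile& p) {
    switch (p.tier) {
    case Tier::LLInt:
        // LLInt hands off once a statement runs more than 100 times
        // or the function is called more than 6 times.
        if (p.hottestStatementCount > 100 || p.callCount > 6)
            p.tier = Tier::BaselineJIT;
        break;
    case Tier::BaselineJIT:
        // Baseline JIT hands off at more than 1,000 statement executions
        // or more than 66 calls; the switch can happen mid-function via OSR.
        if (p.hottestStatementCount > 1000 || p.callCount > 66)
            p.tier = Tier::DFGJIT;
        break;
    case Tier::DFGJIT:
        break; // top tier before the FTL was introduced
    }
    return p.tier;
}
```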
To improve Nitro’s performance further, Apple decided to introduce LLVM into the optimization chain. Chris Lattner, the original author and lead of the LLVM Compiler Infrastructure, works for Apple heading the Developer Tools department, and is perhaps behind this move. The fourth tier, called FTL JIT (Fourth Tier LLVM), is a C++ module that ultimately relies on LLVM’s low-level optimizations. To get there, a function’s bytecode is converted into LLVM IR through two intermediate phases, Continuation-Passing Style (CPS) and Static Single Assignment (SSA), which transform and optimize originally dynamic code into a static form that LLVM can then compile. This is the first time LLVM has been used for profile-directed compilation of a dynamic language, and it required some deep changes in LLVM, according to Filip Pizlo:
The WebKit FTL JIT is the first major project to use the LLVM JIT infrastructure for profile-directed compilation of a dynamic language. To make this work, we needed to make some big changes – in WebKit and LLVM. LLVM needs significantly more time to compile code compared to our existing JITs. WebKit uses a sophisticated generational garbage collector, but LLVM does not support intrusive GC algorithms. Profile-driven compilation implies that we might invoke an optimizing compiler while the function is running and we may want to transfer the function’s execution into optimized code in the middle of a loop; to our knowledge the FTL is the first compiler to do on-stack-replacement for hot-loop transfer into LLVM-compiled code. Finally, LLVM previously had no support for the self-modifying code and deoptimization tricks that we rely on to handle dynamic code.
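To make the “dynamic code into static code” and deoptimization ideas concrete, here is a minimal sketch of the general technique of speculative, profile-directed compilation. It is written in C++ with invented names (JSValue, speculativeAdd) and is not FTL or WebKit code: profiling has only ever observed int32 operands for a JavaScript `a + b`, so the optimized code assumes int32 values and bails out to a lower tier whenever that assumption fails.

```cpp
// Hypothetical sketch of a type-speculating fast path with a
// deoptimization guard (not FTL source code; overflow checks omitted).
#include <cstdint>
#include <optional>

// Simplified stand-in for an engine's boxed JavaScript value.
struct JSValue {
    bool isInt32;
    int32_t int32;
    // other representations (double, object, string, ...) elided
};

// Returns the sum when the speculation holds; an empty optional models the
// deoptimization exit that transfers control back to unoptimized code.
std::optional<int32_t> speculativeAdd(JSValue a, JSValue b) {
    if (!a.isInt32 || !b.isInt32)
        return std::nullopt;      // guard failed: deoptimize
    return a.int32 + b.int32;     // statically typed fast path
}
```

Once the guard has ruled out non-int32 values, the remaining code is statically typed, which is the kind of shape a static compiler such as LLVM can optimize aggressively.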
What did Apple achieve after working on this for a year? According to Pizlo, the FTL is 38% faster than the DFG JIT on the Richards benchmark and 35% faster on average across a number of asm.js benchmarks. It remains to be seen how Safari now compares with Chrome or Firefox on Octane. And “work on the FTL is just beginning – we still need to increase the set of JavaScript operations that the FTL can compile and we still have unexplored performance opportunities,” concluded Pizlo.
FTL JIT has been committed to the WebKit trunk as r167958 and can be tested with WebKit Nightly.