I’ve found it a well known fact in the programming world that the java (double-check) singleton pattern is not thread safe and can’t be fixed. Here is a triplet of articles ([1], [2], and [3]) that explain why. These articles offer particular implementations of singleton pattern and very fairly explicate how these implementations can be broken, and then conclude with a generalized statement.
Worth mentioning is that in the work “The ‘Double-Checked Locking is Broken’ Declaration” [4], signed amongst others by Jeremy Manson and Bill Pugh (authors of the java 1.5 memory model), the authors suggested a fix for the double-checked singleton pattern in java 1.5 (the part “Fixing Double-Checked Locking using Volatile”).
Any discussion about java synchronization needs to consider an extensive and not very well defined java optimization. The Java Language Specification Third Edition (JLS3) [5] comments:
The specification provides a great deal of freedom for the implementor to perform a myriad of code transformations, including the reordering of actions and removal of unnecessary synchronization
The main purpose of this article is to test the “boundary conditions” of the programming language optimization. This task requires a non-standard approach, which I try to develop here.
Any offering of a programming language synchronization construct stands on a very shaky ground as long as it does not use general principles: what is allowed and what is not allowed with optimization. My idea is to show that if optimization can break code only by violation well accepted general principles, then such optimization is impossible (or at least, incorrect). And if we stand on this position, we can even make a statement about any programming language, not necessary about java only. Really, java uses the synchronized primitive, C# uses lock, C used POSIX primitives with Mutex, C++ also uses Mutex (on Win32 and Solaris), Python - lock, Ruby Mutex again. The question is then, can we try to generalize by deduction without jumping to a conclusion from specific examples?
The JLS3 uses the following definition of the intra-thread semantics:
The semantics for single threaded programs that allow the complete prediction of the behavior of a thread based on the values seen by read actions within the thread. The actions of each thread in isolation must behave as governed by the semantics of that thread.
We will be using the following weaker form of this definition:
The programming language implementation provides a minimum intra-thread semantics (minimum ITS) if any compile time / runtime optimization keeps the semantics of non-optimized code in a single-thread execution up to the level of any method. (We don’t include the inlined methods here).
It’s easy to show that if minimum ITS is broken on the method level, that it also can be broken on the program level.
Definition: The programming language implementation provides a minimum synchronization if the language synchronization primitive satisfy the following two conditions:
CS1: only one thread can enter a synchronized block
CS2: all shared variables for any thread are updated from the main memory on the entry and saved to the main memory on the exit of a synchronized block
Definition: The method returning a reference to an object produces a phantom-object effect if it is possible with this method to obtain a reference to not a fully constructed object
In this article I investigate the relative consistency of one singleton implementation (taking a leaf out of the Mathematical Logic book). More precisely, I will try to illustrate the following statement:
A: If a programming language implementation provides a minimum ITS and minimum synchronization, the usage of the gOracle pattern (*), is consistent
For this purpose I will try to prove the following statement:
B: If a programming language implementation provides a minimum ITS and minimum synchronization the gOracle pattern (*) can’t produce the phantom-object effect
From now I move to the context of java programming language, but I believe all the following can be adapted to any of the languages listed above.
So what, then, is the gOracle pattern? As the name hints, I will introduce an oracle, something similar to ancient Greeks’ oracle. My oracle will be a method with the signature static boolean oracle(int h) and one very special quality – there is no way for java compiler or java runtime to predict the value produced by this method. But I’ll divulge to you a secret – it always produces true. I will later return to the question of how to build such a method, but for now let me assure you that it is easy enough.
Then my gOracle singleton implementation is as follows:
class Singleton {
static private Singleton instance = null;
static private boolean bInited = false;
private Singleton() {…}
public static Singleton getInstance() {
if (!bInited || instance==null) { // FIRST CHECK
synchronized (Singleton.class) {
if (!bInited) { // INNER IF-BLOCK, SECOND CHECK
instance = new Singleton();
int hash = instance.hashCode();
bInited = oracle(hash);
}
}
}
return instance;
}
static boolean oracle(int h) {…}
}
Now I start my defense of gOracle.
The key idea is to show that if java optimization results in the phantom-object effect for getInstance() method, than the minimum ITS would be broken. In a sense, that the execution of such an optimized code in a single-thread mode would potentially produce a different results, than the execution of non-optimized code.
1. Let’s call i-thread the thread that first enter the synchronized block. This thread will eventually change the value of bInited variable, even if initially only in its local memory 2. Though compiler has ability to transform getInstance() code, and then runtime has ability change the execution order of operations, no such optimization is allowed to break the minimum ITS for the i-thread executed in isolation 3. The result of the call hash = instance.hashCode() is unpredictable before the constructing of the instance object. NOTE 1: According to the java API, the default hashCode() is typically implemented by converting the internal address of the object into an integer. But if that is not good enough for unpredictability, we can consider implementing hashCode() for Singleton as
and then let compiler or runtime try to predict inside or before the constructor the value of Date.getTime() placed in the original code order after the constructor! 4. The call bInited = oracle(hash) uses the results of the call instance.hashCode()and unpredictable, as we assumed 5. Therefore the compiler or runtime optimization can’t assign the value (true) of the bInited variable before the complete constructing the Singleton object and assigning to the instance variable in the local memory of the i-thread 6. After the i–thread exits the synchronized block and before any other thread enter that block, the values of the instance and bInited variables in the main memory will be set to the value from i-thread local memory. Both set operations are atomic. 7. Now consider the behavior of any other thread, let’s call it a w-thread, in minimum ITS semantics. That means, we can assume the order of executions for the w-thread as it specified in the Singleton code, but the shared variables can have “unexpected” values. That is what we will look at now. Let’s use the following notations:
8.a. The w-thread waits on the synchronized block for i-thread to leave the block – piece of cake. First i-thread leaves the synchronized block, then java synchronizes i-thread local variables to the main memory, I(bInited) => M(bInited) I(instance) => M(instance) then on w-thread entry to the synchronized block java synchronizes the w-thread local variables from the main memory, M(bInited) => W(bInited) M(instance) => W(instance) and now the w-thread has a correct view of shared variables, same view as the i-thread has. 8.b. The w-thread is inside or before the FIRST CHECK operation when i-thread leaves the synchronized block. Note, that W(bInited) and W(instance) are initialized from the main memory ether before or after the I(bInited)and I(instance) were synchronized to M(bInited)and M(instance), which happens after t-thread left the synchronized block. So each of W(bInited) and W(instance) can only have either initial value (as defined in the class Singleton) or the value produced by the i-thread, in any possible combination. Then w-thread first checks the value of the variable W(bInited). It value is false w-thread jumps to the synchronized block, essentially as in the case 7.a. If W(bInited)==true, then i-thread already has assigned I(bInited)=true and java already synchronized along the chain I(bInited) => M(bInited) => W(bInited) Therefore the i-thread already has initialized the Singleton object and at the very least the I(instance) is references that object. Now next the w-thread checks the value of the W(instance) variable. If W(instance)==null then java not yet synchronizes along the chain I(instance) => M(instance) => W(instance) but no worry, w-thread than only jumps on the entry of the synchronized block and discover the new value of the instance variable there, as in the 7.a case. But if W(instance)=!null, then the only value that the W(instance)can only have is the value of the I(instance). Therefore the w-thread will return the fully constructed instance of the Singleton. 9. So for any thread which obtains the instance variable through the getInstance() method it is guaranteed that the Singleton object has already been created and fully initialized by i-thread. |
In other words, the only way how compiler/runtime optimization can break the gOracle pattern is by also potentially breaking the single thread execution of its code (assuming that we have minimum synchronization). This proves (for java) the statement B and illustrates the statement A. The defense rests its case.
Corollary: In any JDK 1.4 implementation provides the minimum ITS requirement, the gOracle pattern can’t produce there the phantom-object effect.
Let’s return to the question of how to find such an oracle for gOracle. What about some non-trivial mathematical statement, like:
- the sum of every two sides of a triangle must be greater than or equal to the third side
- any integer can be presented as a product of primes
Taking to the second statement, let’s implement oracle like this:
static boolean oracle(int hash) {
Random rand = new Random(Date.getTime());
int iNum = rand.nextInt() + hash;
iNum = iNum % 100; // optional line
if( found iNum presentation as product of primes )
return true
else
return false;
}
I hope you agree that the compiler designers are not Gods or Oracles yet and can’t teach compiler or runtime to recognize the usage of the Fundamental Theorem of Arithmetic [6] or Euclid Geometry Axioms in java code. And then there is the Big Ferma Theorem.
(*) I introduce the name “gOracle” for my implementation of the singleton pattern to express its Greek origin, the idea of “predicting the future” and how it was used by ancient Greeks, not to be confused with some product from the Oracle Corp.
References
1. Double-checked locking and the Singleton pattern, by peter Haggar
2. Double-checked locking: Clever, but broken, by Brian Goetz
3. Java theory and practice: Fixing the Java Memory Model by Brian Goetz
4. The "Double-Checked Locking is Broken" Declaration, by Jeremy Manson, Bill Pugh …
5. Java Language Specification, Third Edition