Wednesday, December 24, 2014

Tidex Mutex

For our fourth Christmas gift, we have a new mutual exclusion lock!
That's right, it's not every day that you see a new algorithm for mutual exclusion, but today may just be one of those days  :)
It all depends on whether any of you already knows if such an algorithm exists, so if you do, then please leave a note on the Comments with a reference/name of the original paper with this approach. Actually, Andreia is a bit reluctant about the "new" part because of its (conceptual) similarities to the Ticket Lock, but you'll see that in the mechanism shown below there is no atomic_fetch_add() and there are no counters, and each thread needs two identifiers. Perhaps a more accurate description would be that this new algorithm is an oversimplification of the CLH lock, but the similarities may be hard to see. I think this approach is so simple that it's strange no one has thought of it first (which usually means someone has), but Google search doesn't seem to know about it, and it doesn't show up on the usual CS class slides and here or here.

We call this the Tidex Mutex because it uses Thread-Ids and an atomic_exchange() (getAndAdd() in Java). It has two atomic variables, which we call ingress and egress and they hold thread-ids. The ingress holds the thread id of the last thread waiting on the lock, and the egress contains the thread id of the thread holding (or has just released) the lock.

The main idea is that a thread attempting to acquire the lock will do atomic_exchange() on the ingress (once and only once) and this way put itself on a kind of virtual queue and gets the thread id of the previous waiting thread. When a thread does unlock() it sets the egress to its own thread id, and the next waiter (that is waiting in the lock() function) will see a thread id that matches the one returned by atomic_exchange() and knows that now it's its turn to enter the critical section.
There is also a third variable (non atomic) which we called nextEgress which is used to pass state from the lock() to the unlock(). Alternatively, this can be achieved with a thread-local variable or returning the value in lock() and passing it to unlock(), but that is not as versatile.

private final AtomicLong ingress = new AtomicLong(INVALID_TID);
private volatile long egress = INVALID_TID;
private long nextEgress = INVALID_TID;
So here's what lock() looks like:

public void lock() {
    long mytid = Thread.currentThread().getId();
    if (egress == mytid) mytid = -mytid;
    final long prevtid = ingress.getAndSet(mytid);
    while (egress != prevtid) {    
    // Lock has been acquired
    nextEgress = mytid;

The first trick about lock() is that we can't use the thread id as is. Imagine for a moment that a thread does lock() then unlock() then lock() again and then unlock(). If there is another thread jumping in between, there is no way to determining whether it's waiting for the first lock()/unlock() or for the second lock()/unlock() because it is the same thread id, and thus we break mutual exclusion.
To work around this, before doing atomic_exchange() we must first check that our thread id is not currently in the egress. If it is, then we need to use an alternate thread id, which in this case we choose to be the negative of the current thread's thread id. This means that each thread needs two distinct states to implement this algorithm, which we chose to be the thread id and its negative, but other approaches can be chosen, like for example, an array of pairs of distinct values which is "reserved" every time a thread starts, or enters lock() for the first time, or in the constructor of a thread-local. As long as each thread has two distinct values that no other thread has, we can use them as values for ingress and egress in the Tidex Mutex algorithm.
The usage of the two states creates a second problem which is, how does the unlock() method knows which of these two states was chosen by the lock() to pass to atomic_exchange()? We solve this by adding a third variable named nextEgress which is just a regular variable (non atomic) and once lock() acquires the lock it can save the thread id in nextEgress for unlock() to later write to egress. The code for unlock() is thus extremely simple:

public void unlock() {
    egress = nextEgress;

There are a few things interesting about the Tidex Lock.
The first one, is that it is starvation-free on x86, just like the Ticket Lock or the CLH Lock. This comes from the fact that atomic_exchange() in x86 generates a single XCHG instruction.
The second one, is that it uses just two variables, like the Ticket Lock, and the lock is acquired when the second variable (egress) is seen to have some pre-defined value, which in the Ticket Lock case is the ticket that came out of doing atomic_fetch_add() on ingress, and on the Tidex Lock it is the thread id returned by atomic_exchange().
A third one, is that it uses a kind of virtual queue, which in a way is similar to the CLH Lock, but it doesn't need to allocate nodes for the queue.
A fourth one, is that it can be implemented to use the minimum number of acquire and release fences, i.e. one acquire and one release for lock(), and one release for unlock().

In conclusion, the Tidex Lock is: simple to understand and implement, starvation-free, strict FIFO (fair), low memory usage (just two or three variables), doesn't allocate memory on lock() or unlock(), and uses the minimum number of fences (in the C11/C++1x memory model) needed to achieve pessimistic mutual exclusion.

Here is the Java code on Github:

Merry Christmas!