Saturday, January 3, 2015

Ticket Lock - Array of Waiting Nodes (AWN)

Let's start the year with more mutual exclusion locks. In this post we're going to talk about a variant of the Ticket Lock which we called the Ticket Lock with an Array of Waiting Nodes or Ticket AWN for short.
As far as we could tell, this is a new mechanism, but it's not necessarily a new mutual exclusion lock, because it is based on the Ticket Lock, so let's call it a variant of the Ticket Lock.
In fact, we are going to talk about three slightly different variants, but they have very similar algorithms and properties.

The classic Ticket Lock algorithm has each thread spin on the same variable (which we call egress) until it arrives at the ticket taken from ingress. The design goal of the Ticket AWN Lock is to have each thread spinning on a cache line that itself has allocated, like on the CLH Lock algorithm, and this way reduce traffic on the cache-coherence system, which is supposed to be one of the main advantages of the CLH Lock.
To achieve this goal, we are going to add an array of WaitingNodes of fixed size which we call waitersArray. The size should be about the number of cores in the system, but if it is the maximum size of the number of threads it is ideal (well, not ideal in terms of memory usage, but ideal in terms of performance). This array is made up of references to WaitingNodes and not actual WaitingNodes instances (not that you could do that on Java, but you could do it on C/C++, we just don't want to).
Each WaitingNode contains only one boolean variable named lockIsMine, set to false by default. Each thread has only one instance of WaitingNode regardless of the number of instances of Ticket AWN, and it accesses the WaitingNode instances through a static thread-local variable named tlNode.



static class WaitingNode {
    volatile boolean lockIsMine = false;
}
   
private static final ThreadLocal<WaitingNode> tlNode = new ThreadLocal<WaitingNode>();  
private final int maxArrayWaiters;
private final AtomicLong ingress = new AtomicLong(0);
private volatile long egress = 0;
private final AtomicReferenceArray<WaitingNode> waitersArray;


This algorithm does more work than CLH so it's not that better, but in a  machine where there is no getAndSet() or it is much slower than  getAndAdd(), this technique could be better than CLH.
There is however an advantage over CLH: In Ticket AWN there is a single WaitingNode instance per thread that is shared among all instances of Ticket AWN, while on the CLH there must be one node per instance per thread. This means that memory wise, if an array of MAX_THREADS is used for the Ticket AWN, the Ticket AWN should consume slightly less memory overall:
CLH: sizeof(node) x Number of instances x MAX_THREADS
Ticket AWN: (sizeof(node) x MAX_THREADS) + (sizeof(ptr) x Number of instances x maxArrayWaiters)

Moreover, we can easily transform the part of the algorithm that sets the waiting node to true and the one that spins on the waiting node, to a message sending and receiving mechanism, which some CPUs like the Tilera provide, or even use a message passing API like MPI. On Java, we can use stuff like Unsafe.park() and unpark().

All three variants have the same basic behavior: if there is no thread waiting or only one thread waiting, they function like a ticket lock. To be more precise:

  • If ticket == egress:   Works like a Ticket Lock
  • If ticket-1 == egress: Works like a ticket Lock
  • If ticket-2 >= egress: Uses the new mechanism

One thing to notice about this algorithm is that the only atomic operation that is not a simple load or store is the getAndIncrement() done at the beginning of the lock() method. No other CAS, EXHG, or XADD are done on this algorithm, and several of the atomic loads and stores can be relaxed.


On the next posts we will provide some brief explanations on each of the variants. For the moment, here is the source code in Java for the three variants:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/Java/com/concurrencyfreaks/locks/experimental/TicketAWNEndsEgressLock.java
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/Java/com/concurrencyfreaks/locks/experimental/TicketAWNSpinsBothLock.java
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/Java/com/concurrencyfreaks/locks/experimental/TicketAWNNegativeEgressLock.java
and C11 implementations of them using relaxed atomics (where possible):
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/C11/locks/ticketawn/ticket_awnee_mutex.c
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/C11/locks/ticketawn/ticket_awnee_mutex.h
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/C11/locks/ticketawn/ticket_awnne_mutex.c
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/C11/locks/ticketawn/ticket_awnne_mutex.h
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/C11/locks/ticketawn/ticket_awnsb_mutex.c
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/C11/locks/ticketawn/ticket_awnsb_mutex.h
 

Happy 2015!

2 comments:

  1. Hi guys,
    Interesting post :-)
    Ticket AWN reminds me of the partitioned ticket lock (http://dl.acm.org/citation.cfm?id=1989543), but my guess would be that you are already aware of that.

    ReplyDelete
  2. Yep, we mentioned it on yesterday's post ;)

    ReplyDelete