Thursday, January 15, 2015

Ticket Lock AWN - Spins on Both variant

On this post we are going to cover the third and last of the three variants of Ticket AWN. We call this variant the Spins on Both because the at the end of the lock() method we have to spin on both egress and lockIsMine if and only if egress is ticket-1.
Here's what the lock() and unlock() methods look like in this variant:

void ticket_awnsb_mutex_lock(ticket_awnsb_mutex_t * self)
    const long long ticket = atomic_fetch_add(&self->ingress, 1);
    if (atomic_load(&self->egress) == ticket) return;
    while (atomic_load_explicit(&self->egress, memory_order_relaxed) >= ticket-1) {
        if (atomic_load(&self->egress) == ticket) return;
    // If there is no slot to wait, spin until there is
    while (ticket-atomic_load(&self->egress) >= (self->maxArrayWaiters-1)) sched_yield();

    // There is a spot for us on the array, so place our node there
    awnsb_node_t * wnode = &tlNode;
    // Reset lockIsMine from previous usages
    atomic_store_explicit(&wnode->lockIsMine, false, memory_order_relaxed);
    atomic_store(&self->waitersArray[(int)(ticket % self->maxArrayWaiters)], wnode);

    if (atomic_load(&self->egress) < ticket-1) {
        // Spin on lockIsMine
        while (!atomic_load(&wnode->lockIsMine)) sched_yield();
        atomic_store_explicit(&self->egress, ticket, memory_order_relaxed);
    } else {
        // Spin on both lockIsMine and egress
        while (atomic_load(&self->egress) != ticket) {
            if (atomic_load(&wnode->lockIsMine)) {
                atomic_store_explicit(&self->egress, ticket, memory_order_relaxed);
                return; // Lock acquired
    // Lock acquired

void ticket_awnsb_mutex_unlock(ticket_awnsb_mutex_t * self)
    long long ticket = atomic_load_explicit(&self->egress, memory_order_relaxed);
    // Clear up our entry in the array before releasing the lock.
    atomic_store_explicit(&self->waitersArray[(int)(ticket % self->maxArrayWaiters)], NULL, memory_order_relaxed);
    // We could do this load as relaxed per se but then the store on egress of -(ticket+1) could be re-ordered to be before, and we don't want that
    awnsb_node_t * wnode = atomic_load(&self->waitersArray[(int)((ticket+1) % self->maxArrayWaiters)]);
    if (wnode != NULL) {
        // We saw the node in waitersArray
        atomic_store(&wnode->lockIsMine, true);
    } else {
        atomic_store(&self->egress, ticket+1);

In the "Spins on Both" variant we may have to spin on both egress and lockIsMine, but notice that this only happens if the egress is equal to ticket-1 (or to ticket). For all other cases we spin on lockIsMine only.
The main advantage of this technique is that we can place the store of the egress advancing to the current ticket right in the lock() function (with a relaxed store) and this way, on the unlock(), we only do one store, either on egress or on lockIsMine.
Surprisingly, this seems to be the best of the three variants, at least on our C11 benchmarks (more on that in a future post).

Sample Scenario

To understand how this mechanism works, imagine a sample scenario where Thread 1 (T1) gets a ticket of 10:
  • egress is 10: T1 has the lock
  • egress is  9: T1 will add its node to the waitersArray and will spin both on egress and lockIsMine and exit when:
    • egress is       10: T1 has the lock
    • lockIsMine is true: T1 has the lock and sets egress to 10
  • egress is  8: T1 will add its node to the waitersArray and wait until lockIsMine is true, once lockIsMine is true T1 has the lock and sets egress to 10

Relaxed Atomics

There are a few places in this algorithm where we use relaxed atomic loads or stores in the C11/C++1x implementation:
  • In lock(), the store on wnode->lockIsMine to false can be relaxed because it will become visible on the release-store of the wnode in the array and it will only be accessible from that instant in time.
  • In lock(), the store to egress of ticket can be made relaxed because it will become visible at the end of unlock() when we do a store with release on either lockIsMine or egress.
  • In unlock(), the first load on egress can be relaxed because it was read last by the same thread in lock() so it is guaranteed to be up to date.
  • In unlock(), the store on the self entry of the array to NULL can be relaxed because it will become visible either with the store with release on the wnode->lockIsMine to true or on the egress to ticket+1.
Here is the source code in Java and C11:

No comments:

Post a Comment