Here's what the lock() and unlock() methods look like in this variant:
void ticket_awnee_mutex_lock(ticket_awnee_mutex_t * self)
{
const long long ticket =
atomic_fetch_add(&self->ingress, 1);
if (atomic_load(&self->egress) == ticket) return;
while (atomic_load_explicit(&self->egress, memory_order_relaxed) >=
ticket-1) {
if (atomic_load(&self->egress) == ticket) return;
sched_yield();
}
// If there is no
slot to wait, spin until there is
while (ticket-atomic_load(&self->egress) >= (self->maxArrayWaiters-1)) sched_yield();
// There is a
spot for us on the array, so place our node there
awnee_node_t * wnode = &tlNode;
// Reset
lockIsMine from previous usages
atomic_store_explicit(&wnode->lockIsMine, false, memory_order_relaxed);
atomic_store(&self->waitersArray[(int)(ticket % self->maxArrayWaiters)], wnode);
if (atomic_load(&self->egress) < ticket-1) {
// Spin on
lockIsMine
while (!atomic_load(&wnode->lockIsMine)) sched_yield();
}
// Spin on egress
while (atomic_load(&self->egress) != ticket) {
sched_yield();
}
}
void ticket_awnee_mutex_unlock(ticket_awnee_mutex_t * self)
{
long long ticket =
atomic_load_explicit(&self->egress, memory_order_relaxed);
// Clear up our
entry in the array before releasing the lock.
atomic_store_explicit(&self->waitersArray[(int)(ticket % self->maxArrayWaiters)], NULL, memory_order_relaxed);
// We could do
this load as relaxed per se but then the store on egress of -(ticket+1)
could be re-ordered to be before, and we don't want that
awnee_node_t * wnode = atomic_load(&self->waitersArray[(int)((ticket+1) % self->maxArrayWaiters)]);
if (wnode != NULL) {
// We saw the
node in waitersArray
atomic_store_explicit(&wnode->lockIsMine, true, memory_order_relaxed);
}
atomic_store(&self->egress, ticket+1);
}
In the "Ends on Egress" variant, in the lock() method, after placing the node on the waitersArray we look again at egress and if there is at least one other thread waiting in front of the current one then we spin on lockIsMine because we know for sure that that next thread will see the node we've just placed in the array.
We could return from the lock() method as soon as lockIsMine becomes true, but then we could have (bad) races on egress and overwrite it. To solve that problem, we could do an atomic increment on egress at the end of unlock() instead of doing a store with release, but atomic increments are more expensive than stores with release, which would mean that on a single-thread case this variant would lose out against a simple Ticket Lock. The design goal of the AWN algorithm is to be at least as good as a Ticket Lock, so we needed to find a way to increment egress with just a store.
The solution is simple, it involves spinning on egress one last time before returning from lock(). This way, we are sure that the last instruction of the previous thread calling unlock() (setting egress to its ticket+1) has completed by the time we enter the critical section, which means that the current thread can safely call unlock() without the risk of having multiple threads writing to egress simultaneously.
Sample Scenario
To understand how the egress mechanism works, imagine a sample scenario where Thread 1 (T1) gets a ticket of 10:- egress is 10: T1 has the lock
- egress is 9: T1 will spin on egress waiting for it to become 10
- egress is 8: T1 will add its node to the waitersArray, then T1 will spin on lockIsMine until it is set to true and after will spin on egress waiting for it to become 10
Relaxed Atomics
There are a few places in this algorithm where we use relaxed atomic loads or stores in the C11/C++1x implementation:- In lock(), the store on wnode->lockIsMine to false can be relaxed because it will become visible on the release-store of the wnode in the array and it will only be accessible from that instant in time.
- In unlock(), the first load on egress can be relaxed because it was read last by the same thread in lock() so it is guaranteed to be up to date.
- In lock(), the store on the self entry of the array to NULL can be relaxed because it will become visible either with the store with release on the wnode->lockIsMine to true or on the egress to ticket+1.
- In unlock(), the store in lockIsMine to true can be relaxed because before it there is a load with with acquire of the array which prevents reordering above, and the store will become visible when we do the store with release of egress to ticket+1
Something to notice on the unlock() in the "Ends on Egress" variant, the store on lockIsMine will become visible to other threads right before the store to egress, which means that the next waiting thread that is currently spinning on lockIsMine will break out of the loop and check once (and only once) the egress to see that it is its turn to acquire the lock. This means that if the previous thread has seen the current thread's waiting node, the last loop in the method lock() will be executed for a single iteration, thus incurring at most the price of a cache-miss on egress.
Here is the source code in Java and C11:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/Java/com/concurrencyfreaks/locks/experimental/TicketAWNEndsEgressLock.java
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/C11/locks/ticketawn/ticket_awnee_mutex.h
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/C11/locks/ticketawn/ticket_awnee_mutex.c
No comments:
Post a Comment