Friday, November 22, 2013

Double Instance Locking

Did you know you could make any read-only operation of any single-threaded data structure be lock-free in a multi-threaded program? And not just any data structure, but any object that can be replicated?

... yes, of course you knew that, you can just use a Copy-On-Write (COW) pattern to make a new copy of the object or data structure, thus making any read operation wait-free. This is in fact the main "trick" of the CopyOnWriteArrayList in java.util.concurrent

There are two main issues with the COW pattern though:
1st, it only works on a system with a Garbage Collector, otherwise you have to keep track of how many instances there are, and which threads are accessing which instances, and doing that is most certainly not wait-free (for example, google for "Hazard-Pointers", or "Drop The Anchor"). This means it is mostly unpractical in C/C++
2nd, it can be very slow for Readers (threads doing a read-only operation) because every time a modification occurs, a new copy will have to be made of the entire data structure, which will cause a lot of cache-misses on the Readers. See for example this blog post http://concurrencyfreaks.blogspot.fr/2013/10/immutable-data-structures-are-not-as.html

An alternative to COW is the Double Instance Locking, which is a name I made up, because as far we could tell, Andreia and I invented this technique so we get to name it .... It could be that this technique already exists, so if you heard of it before, please leave a comment below with a reference to it, so that we can acknowledge the true authors, otherwise, we keep using whatever crazy name we came up with  ;)

What's in a Double Instance Locking pattern?

Double Instance Locking requires only two replicas (instances) of the data structure you want to use, and it is perfectly safe to use even on systems without GC. This means it is trivial to use (and implement) in C and C++.
The components of this concurrency control pattern are:
  1. A mutual exclusion lock that serializes access by the mutable operations (Writers);
  2. Two exact instances of the object or data structure that is being protected;
  3. Two Reader-Writer locks (one to protect each instance of the data structure) that supports tryReadLock() and should have Writer-preference;

Simple huh?


How does it work?

Each Writer must work in exclusivity, and for that, it must first acquire the writer's mutex. Then, it must acquire the write-lock of the rw-lock on the first instance and do it's mutable operation on the first instance. It will then release the rw-lock of the first instance and acquire the write-lock of the rw-lock of the second instance. The Writer will then release the lock on the second rw-lock and then release the writer's mutex.
Notice that the Writer must release the first before acquiring the second, otherwise there could be an instant in time where both locks are held, which would "block" the  Readers causing them not to be able to acquire any of the two rw-locks.

The Reader starts by trying to acquire the read-only lock of the rw-lock of the first instance, and if it fails, tries the second rw-lock, and if it also fails, tries the first one again and then the second, and so on, until one is successfully acquired. Notice that the Reader only "reads" one of the instances, the one to which it successfully acquires the read-lock.

Here are the operations in more detail:

Write operation:
  1. Any thread wishing to do a mutable operation must first acquire the writersMutex;
  2. Acquire the write lock on the rw-lock of the first instance;
  3. Execute the mutable operation on the first instance;
  4. Release the write lock on the rw-lock of the first instance;
  5. Acquire the write lock on the rw-lock of the second instance;
  6. Execute exactly the same mutable operation;
  7. Release the write lock on the rw-lock of the second instance;
  8. Release the writersMutex;

Read operation:
  1. Do a tryReadLock on the rw-lock of the first instance, and if it fails then try the same on the second instance, and if it fails then try again the first instance and so on until one of the tryReadLock suceeds;
  2. Execute the read-only operation on the instance corresponding to the lock that was acquired;
  3. Release the lock


What is Double Instance Locking?

- It is a pessimistic concurrency control pattern;
- It is blocking for mutable operations (Writers) and lock-free for read-only operations (Readers);
- It does not need a GC, which means it can easily be implemented and used in C and C++ (and Java, Scala, Python, C#, whatever);
- Needs two identical instances of whatever is going to be protected;


In what scenarios should this pattern be used?

As usual, there are no "silver bullets", the Double Instance Locking is just another tool to help you solve problems in Concurrency.
Below is a table to help us compare a few different techniques:
The optimistic rw-lock technique is the one that is used by Java's StampedLock tryOptimisticRead()/validate().


To sum up this table, if you're using C/C++ then you're pretty much stuck with a "regular" Reader-Writer Lock or with the Double Instance Locking.
If you're using Java/Scala then you have a little bit more choice, but keep in mind that the Copy-On-Write may be significantly slower than the other techniques, and that the "RW-Lock with Optimistic Reads" technique requires some careful usage as can be seen in the talk given by Hans Boehm here http://concurrencyfreaks.blogspot.fr/2013/10/hans-boehm-on-reader-writer-locks.html

The main advantage of the Double Instance Locking is that it is similar to a Reader-Writer lock but allows Readers to run at the same time as a Writer (each on a different instance).
The trade-off is in memory consumption (twice the size of the data structure, but not the user data), and twice the work for write operations. Neither of these should be an issue because memory is cheap, and most data structures have a write-few-read-many access pattern, so the number of write operations should be low anyways.


How come I've never heard of this pattern?

Well, this is something new, and we have an even better technique, that is not just lock-free, but wait-free population oblivious, and it's actually fast. We named that technique "Left-Right" and the paper should be out soon, so stay tuned  ;)


Where can I get more info and source code?

There is a powerpoint presentation at the link below:
https://sourceforge.net/projects/ccfreaks/files/papers/DoubleInstance/DoubleInstanceLocking.pptx

Java code:
http://sourceforge.net/projects/ccfreaks/files/papers/DoubleInstance/DoubleInstanceLockGuard.java
http://sourceforge.net/projects/ccfreaks/files/papers/DoubleInstance/DITreeSet.java

C (pthreads) code:
http://sourceforge.net/projects/ccfreaks/files/papers/DoubleInstance/di_rwlock.c
http://sourceforge.net/projects/ccfreaks/files/papers/DoubleInstance/di_rwlock.h
http://sourceforge.net/projects/ccfreaks/files/papers/DoubleInstance/di_rwlock_example.c

6 comments:

  1. Stumbled upon this when reviewing Java 8 StampedLock comparisons.
    CoW is workable for non-GC languages, assuming you are willing to pay for reference counting (with ensuing locking overhead) and carry the burden of disciplined object life time management.
    Regardless, encapsulating doubleinstance (aka double buffering) would yield significant gains.

    ReplyDelete
    Replies
    1. Hi Mark,

      I don't see how it is possible to use CoW with reference counting (can you give a link with an example?), but you are correct when you say that it _is_ possible to do it even in non-GC languages, because I know of a few techniques that work, for example Hazard Pointers.
      The problem with Hazard Pointers is that they're quite slow, but maybe there are other techniques I'm missing.

      I've updated the comparison table in the post to reflect this. Thanks for the comment!

      Delete
  2. Hi, This is a nice interesting idea. In the comparison table, you write that readers are lock free in an RW lock with optimistic readers. I don't think this is the case since an optimistic reader would need to retry until a write critical section has finished in order to make sure that the data is consistent.

    ReplyDelete
    Replies
    1. Dope... you're absolutely right!
      I've just updated the table with your correction. Thanks Kowon!

      Delete
  3. I think you invented transactional memory)

    ReplyDelete
    Replies
    1. lol
      Actually, this is a kind of (really crappy) transactional memory, but then again, so is a mutex... this is only slightly better ;)

      Delete