Monday, April 15, 2019

OneFile - The world's first wait-free Software Transactional Memory

Ok, so we've been silent for more than a year but for a good reason: busy doing cool stuff.
I'm pleased to announce one of those cool things:
OneFile - A Wait-Free Persistent Transactional Memory
Paper here:
Source code here:

OneFile is the world's first STM (Software Transactional Memory) with wait-free progress.
Andreia, Pascal, Nachshon and I got together to make OneFile. What this means is that you can write a sequential data structure (single-threaded code) and, with OneFile, it gets transformed into a wait-free concurrent data structure. Moreover, it comes with integrated memory reclamation, memory allocation and memory de-allocation, all with wait-free progress. And on top of that, it works on Persistent Memory (PM), sometimes called Non-Volatile Main Memory (NVMM).

Let's go one step at a time.
First things first: why make a wait-free STM?

As a developer of multi-threaded applications, whenever the need arises to have a concurrent data structure, you typically have two different solutions:
take a sequential data structure (single-threaded code) that does whatever you need to do and protect it with a lock, or take an existing fine-grained lock or lock-free data structure that someone else implemented.

If you go with the first approach, you get maximum flexibility because it's easy to modify the data structure to your own needs. The reason for this "ease-of-use" is that code protected by a lock can be reasoned about in a "sequential way", something which is very intuitive to us humans. You won't have to consider interleavings of operations or instructions, or the memory model of the language you're using, or anything like that; all you need to do is add mutex.lock() at the beginning of the data structure's methods and mutex.unlock() at the end, or the equivalent semantics in your language/lock implementation.
This is possible because a researcher, many years ago, invented a "mutual exclusion lock algorithm" to deal with all of that, and some kind-soul engineer has decided to implement it in the language/library you use so that you, the developer, can now write safe and correct concurrent code without actually having to understand anything about concurrency. It's hard enough to deal with the business logic of your application and all the details that using a programming language entails.

The second approach is much more restrictive. Fine-grained locking code is very sensitive to changes and any modification to the underlying data structure's behavior is likely to cause incorrect results. This is even worse for lock-free code, where changing the order of a single line of code almost always results in an incorrect data structure. Considerations about interleavings and memory models become crucial for these, implying that such data structures are the product of hard work by experts in the field.

In essence, it's the difference between taking a HashMap (a sequential data structure), modifying whatever you need, and protecting it with a lock to make it concurrent, versus taking a ConcurrentHashMap (a fine-grained concurrent data structure) and starting to modify code in it.
Modifying the code in HashMap is something that an entry-level engineer is likely to achieve correctly, but modifying the code in ConcurrentHashMap is something only expert engineers are likely to succeed in doing correctly. And when we start talking about lock-free data structures, then we enter an expert-only domain, an elite group of people who spend a lot of time thinking about concurrency... and dealing with concurrency bugs  ;)

With OneFile we have eroded this "class" separation. Now, even a novice developer is capable of implementing a lock-free data structure correctly. Not just lock-free, but wait-free even!
This is because OneFile takes a sequential data structure and provides concurrent wait-free access to it. Instead of adding mutex.lock()/unlock() at the beginning/end of each method in the data structure, the transformation requires a few more changes, but they are still easy enough for any developer to implement. Namely:
- Wrap each mutative method in a lambda and pass it to updateTx([=] { your_code_here });
- Wrap each read-only method in a lambda and pass it to readTx([=] { your_code_here });
- Replace calls to 'new T' with tmNew<T>();
- Replace calls to 'delete' with tmDelete();
- Wrap the members of every struct/class T in the tmtype<T> template, so as to provide concurrent access;

Compare a singly-linked-list set data structure (wait-free) here
with a hand-made Maged-Harris singly-linked list set (lock-free) here
Notice the difference?
Which one would you feel comfortable modifying or adapting to your application?
And the hand-made set is not even wait-free. If you want a hand-made wait-free set, then good luck implementing this one (it's the only one I know of)... ohhh, and did I mention that, out of all the hand-made wait-free data structures in existence, only one has been published with wait-free memory reclamation?
Data structures in OneFile have integrated wait-free memory reclamation  :)

OneFile has its limitations though.
Mutative operations are serialized, i.e. only one is executed at a time, although disjoint read-only operations can execute concurrently most of the time.
It needs a machine with a double-word compare-and-swap (DCAS), which all current x86 CPUs support but which doesn't exist on ARM or PowerPC.
Because of the use of DCAS, all data types have to be broken down into 64-bit words. This means that things like arrays of bits are hard to do (they take up too much memory). Most data structures are all about operations on pointers (to nodes or other data types), so it's mostly ok, but it may make OneFile difficult to use in some data structures.
Despite these limitations, when you take into consideration that the alternative is to make your own hand-made wait-free data structure, the choice becomes obvious.

Next week we'll discuss the importance of having data structures with wait-free progress. In the meantime you can check out the first 20 minutes of this video to understand the importance of wait-freedom for latency.
