Saturday, November 28, 2015

The importance of Sequential Consistency

Today we're going to talk about Consistency Models, things like Linearizability, Sequential Consistency, and Serializability.
https://en.wikipedia.org/wiki/Linearizability
https://en.wikipedia.org/wiki/Sequential_consistency
https://en.wikipedia.org/wiki/Serializability

I'll use C++ for the code examples, but this applies to any other language with a memory model (C11, Java, Scala, Clojure, etc.).

In the context of concurrent data structures, there are two extra dimensions that influence the characteristics of (each method of) a data structure. One is the "Progress Condition" and the other is the "Consistency Model".
Those of you familiar with databases know that there is a huge number of different consistency models, but when it comes to concurrent data structures, we usually care whether it is Linearizable (or Sequentially Consistent) or something more relaxed (for which a name may not even exist).

Linearizability is important in the academic world because it makes it easier to prove that a certain algorithm is correct, so academics (and even some practitioners) like it very much. In addition, it is very intuitive for our human brains to reason about.
Good examples are mutual exclusion locks and Reader-Writer locks, which provide linearizability for all their methods: lock(), unlock(), trylock(), etc.
Unfortunately, it is a very "strong" condition, so much so that not only can it be tricky to come up with algorithms that provide this kind of consistency, but doing so can have a strong (negative) impact on performance, both in terms of throughput and latency.

Just one step below Linearizability, we have Sequential Consistency. This one is still easy to reason about, but it is not composable, and it is harder to prove the correctness of an algorithm (or data structure) that is sequentially consistent than of one that is linearizable.
An algorithm that is Linearizable is also Sequentially Consistent.

Below Sequential Consistency, it becomes progressively harder to reason about an algorithm, and the number of different consistency models grows quickly, each one with its own quirks. Examples are Acquire-Release (like memory_order_acq_rel in the C++ memory model), Serializability, or even completely relaxed (like memory_order_relaxed in the C++ memory model).
The most relaxed of all is, as the name indicates, "relaxed", to the point where calling it a "consistency model" is kind of a misnomer because it really isn't consistent at all.
Personally, I find reasoning about anything weaker than Acquire-Release extremely difficult, and I believe most human brains suffer from the same obstacle ;)
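
To make the difference concrete, here is a minimal sketch (a toy example of mine, not taken from any library) of the classic message-passing pattern in C++11: with release/acquire, a reader that sees the flag is guaranteed to also see the data; with relaxed, it is not.

    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<bool> ready {false};
    int data = 0;   // plain (non-atomic) payload

    void producer() {
        data = 42;                                    // A: write the payload
        ready.store(true, std::memory_order_release); // B: publish it
    }

    void consumer() {
        while (!ready.load(std::memory_order_acquire)) {} // C: observe the publish
        // Guaranteed: C synchronizes-with B, so the write A is visible here.
        // With memory_order_relaxed on both sides this assert could fail,
        // because nothing would order the write to 'data' with the write to 'ready'.
        assert(data == 42);
    }

    int main() {
        std::thread t1 {producer};
        std::thread t2 {consumer};
        t1.join(); t2.join();
        return 0;
    }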

I think that "being able to reason about it" is an excellent argument to go for linearizability/seq-cst/acq-rel and not something more relaxed. In fact, the whole point of things like Actor Models, CSP, and immutable data structures is to sacrifice some performance in exchange for code that is easier to reason about, so this should be enough of a reason to convince most people.

But it's not just about how hard or easy it is to reason about it, nope, it's much more than that!

To understand the importance of Sequential Consistency, we should start with something like "The Laws of Order": http://www.srl.inf.ethz.ch/papers/popl168gf-attiya.pdf
The implications of this paper may not be completely obvious at first, so I'll let you read it and digest it over a few days, but I will go even further than what the authors are saying: I think that operations with strong non-commutativity implicitly need (at least) Sequential Consistency.
Andreia and I don't completely agree on this point, so feel free to disagree as well.

This "strong non-commutativity" thingy is explained in the paper, but I can explain it in a paragraph as well (though I won't do as good a job):
Suppose you have a set (or a map, or a multiset, or a multimap) and, in a single-threaded application, you do the following operations:
    set.add(x);
    std::cout << std::boolalpha << set.contains(x);

This should print out "true" because the element "x" is now part of the set. But doing it the other way around:
    std::cout << std::boolalpha << set.contains(x);
    set.add(x);

will print out "false" assuming the set is initially empty, because "x" is not part of the set yet when we call contains().
These operations of the set are said to be "strongly non-commutative", and this applies to all the operations of the set: add(), remove(), and contains().
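
If you prefer to see this in a complete program, here is the same example using std::set as a stand-in (note that std::set spells these operations insert() and count(); a contains() method only arrived in C++20):

    #include <iostream>
    #include <set>

    int main() {
        const int x = 1;
        std::set<int> set1;
        set1.insert(x);                                              // set.add(x)
        std::cout << std::boolalpha << (set1.count(x) == 1) << "\n"; // prints "true"

        std::set<int> set2;                                          // initially empty
        std::cout << (set2.count(x) == 1) << "\n";                   // prints "false"
        set2.insert(x);                                              // the order matters
        return 0;
    }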

That's all fine and dandy in the single-threaded world, but what does it mean to have a concurrent set (i.e. multi-threaded world)?
How does this "strong non-commutativity" manifest itself in a concurrent set?

From a Program Order (PO) point of view, we expect everything to behave in the order (sequence) in which we wrote it. For example, if we write:
  set.add(x);
  set.add(y);
  set.add(z);

and then we place some set.contains() calls somewhere in the middle, we expect them to return true/false (depending on where we put them), but more than that, we implicitly expect something else:
- If set.contains(y) is true, then set.contains(x) MUST be true;
- If set.contains(z) is true, then set.contains(y) MUST be true;

If this was ever not the case, then you would (rightfully) complain that the compiler is broken!
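
As a quick sanity check, in a single-threaded program you can assert these invariants directly (again with std::set standing in for the generic set):

    #include <cassert>
    #include <set>

    int main() {
        const int x = 1, y = 2, z = 3;
        std::set<int> set;
        set.insert(x);
        set.insert(y);
        // Program order: if y is in the set, then x must be in it as well.
        assert(set.count(y) == 0 || set.count(x) == 1);  // never fires
        set.insert(z);
        // Likewise: if z is in the set, then y must be in it as well.
        assert(set.count(z) == 0 || set.count(y) == 1);  // never fires
        return 0;
    }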

So, suppose now you have two threads, T1 and T2, that will execute the following code, where xyz_are_in is an atomic<bool>, initially false:
T1:
    set.add(x);
    set.add(y);
    set.add(z);
    xyz_are_in.store(true);  // memory_order_seq_cst

T2:
    while (!xyz_are_in.load()) Pause();     // memory_order_seq_cst
    std::cout << std::boolalpha;
    std::cout << xyz_are_in.load() << "\n"; // memory_order_seq_cst
    std::cout << set.contains(z) << "\n";
    std::cout << set.contains(y) << "\n";
    std::cout << set.contains(x) << "\n";


I hope everyone agrees that the output of T2 will ALWAYS be true, true, true, true.
I'll let you look at the code for a while until you're convinced of that.
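
If you want to run it yourself, here is a self-contained sketch; the std::set guarded by a mutex is just a stand-in for any linearizable concurrent set, and the busy-wait loop replaces Pause():

    #include <atomic>
    #include <iostream>
    #include <mutex>
    #include <set>
    #include <thread>

    std::set<int> set;        // stand-in for a linearizable concurrent set
    std::mutex set_mutex;     // the lock is what makes add()/contains() linearizable
    std::atomic<bool> xyz_are_in {false};

    void add(int v)      { std::lock_guard<std::mutex> g(set_mutex); set.insert(v); }
    bool contains(int v) { std::lock_guard<std::mutex> g(set_mutex); return set.count(v) == 1; }

    int main() {
        const int x = 1, y = 2, z = 3;
        std::thread t1([&] {                         // T1
            add(x);
            add(y);
            add(z);
            xyz_are_in.store(true);                  // memory_order_seq_cst (the default)
        });
        std::thread t2([&] {                         // T2
            while (!xyz_are_in.load()) {}            // spin until T1 publishes
            std::cout << std::boolalpha;
            std::cout << xyz_are_in.load() << "\n";  // always "true"
            std::cout << contains(z) << "\n";        // always "true"
            std::cout << contains(y) << "\n";        // always "true"
            std::cout << contains(x) << "\n";        // always "true"
        });
        t1.join();
        t2.join();
        return 0;
    }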

Done?
Ok, great, now let's make it more difficult.
Suppose T2 now doesn't wait for xyz_are_in to become true:
T2:
    std::cout << std::boolalpha;
    std::cout << xyz_are_in.load() << "\n"; // memory_order_seq_cst
    std::cout << set.contains(z) << "\n";
    std::cout << set.contains(y) << "\n";
    std::cout << set.contains(x) << "\n";


What are the possible outcomes?
Well, I didn't say what the consistency model of this set is, so let's not assume anything, i.e. it's relaxed. The possible outcomes are:
- false, followed by ANY combination of true and false for the three contains(), and I really mean ANY combination;
- true, true, true, true.
Keep in mind that the C++ Memory Model guarantees that anything that happened before the xyz_are_in.store(true) in T1 will be visible in T2 after xyz_are_in.load() returns true, but it gives no further guarantees; specifically, no guarantee is given on the order in which the set.add()s (or the set.contains()s) become visible.
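
To make this concrete, here is a deliberately broken sketch that models a "relaxed set" over three elements as three relaxed atomic flags; with memory_order_relaxed on every operation, T2 is allowed to see z as present while y and x are still absent:

    #include <atomic>
    #include <iostream>
    #include <thread>

    // A toy "relaxed set" over {x, y, z}: each flag says whether the element is in.
    std::atomic<bool> in_x {false}, in_y {false}, in_z {false};

    void t1() {                                        // the three "add()"s
        in_x.store(true, std::memory_order_relaxed);
        in_y.store(true, std::memory_order_relaxed);
        in_z.store(true, std::memory_order_relaxed);   // may become visible FIRST
    }

    void t2() {                                        // the three "contains()"s
        std::cout << std::boolalpha;
        std::cout << in_z.load(std::memory_order_relaxed) << "\n";
        std::cout << in_y.load(std::memory_order_relaxed) << "\n";
        std::cout << in_x.load(std::memory_order_relaxed) << "\n";
        // A legal outcome is "true false false": z is "in the set" while y and x
        // are not yet, which breaks the program-order invariants described above.
    }

    int main() {
        std::thread a {t1};
        std::thread b {t2};
        a.join(); b.join();
        return 0;
    }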

So now think about the following: the set.add()s in Thread 1 can be re-ordered in any way (as seen from T2), which means that the invariants described above will no longer hold!
There is no "strong non-commutativity" when using a relaxed model  :-O
If there is no "strong non-commutativity", then does it even make sense to talk about a "set"? What is a "set" for which the invariants described above do not hold? What is a "set" that doesn't have "strong non-commutativity"? Is it even correct to still call it a "set"?
It's a set where you put an element in it, and then you go to check if the element is part of the set, and it's not (yet).... huhhhh
To me, a data structure with those properties can maybe be called a "pool", but definitely not a "set".

In fact, it's harder to show, but even with acquire-release consistency the set will not have "strong non-commutativity".
The only way to have "strong non-commutativity" is to have sequential consistency between the operations of the set, i.e. a "sequentially consistent set"!

Without sequential consistency, it does not make sense to talk about sets, or multisets, or maps, or multimaps.
I know that there is a strong tendency to make data structures with a relaxed consistency because you can make them (waaaayyyy) faster than ones that are sequentially consistent (or linearizable), but the thing is, IMHO, it doesn't even make sense to call it a "set" if it doesn't have sequential consistency.
Sure, you may not actually need a "set", and something with more relaxed properties may be enough, and then it is fine to have a "relaxed set", or whatever you want to call it.

It seems to me that the notion of Sequential Consistency is embedded in the very essence of what a "set" is, to the point that a set without seq-cst is not a set.
I know this is a somewhat controversial statement, and I wish I had the mathematical skills to prove it, but I don't, so feel free to try a couple of examples and reach your own conclusions. I'm confident you'll reach the same one, and if not, that's what the Comments section is for ;)
