Monday, October 14, 2019

What is a concurrent algorithm without order?

I've heard Leslie Lamport mention on multiple occasions: "An algorithm is not a program!"
and I fully agree with him and would like to add that "A concurrent or durable algorithm without ordering is not an algorithm!"

What Leslie says makes sense because code is just a particular implementation or expression of an algorithm in a specific language. Code is bounded by the constructs that form the language in which that code is written.
An algorithm has no such constraints and is the mathematical expression of an idea.

An algorithm is precise, yet, universal.
A program works only for a particular language.

When an algorithm is incorrect, a program that uses such an algorithm will be incorrect as well.
When an algorithm is correct, a program that uses such an algorithm may be incorrect because it implemented the algorithm incorrectly, but the correctness of the underlying algorithm in unaffected.

An algorithm has mathematical beauty.
A well made program can have craftsmanship and be appreciated as beautiful, but rarely (if ever) on the mathematical sense.

So what do I mean by "An algorithm without ordering is not an algorithm"?
In the context of concurrent and/or durable algorithms order is vital for correctness. In concurrency papers, most people assume sequential consistency of the (atomic) steps that the algorithm performs and indeed, the default memory model in C11 and C++11 is memory_order_seq_cst exactly for that reason. In practice, if you were to implement a concurrent algorithm in such a way, it would likely be slow because each store (and load on non-TSO CPUs) would require a fence to guarantee ordering.
Researchers writing papers assume this strong ordering because they focus on "the algorithm" as being a series of steps and leave the dependencies of those steps (ordering) as an implementation detail.
If performance (throughput/scalability/latency) was not an issue, then we would still be using single-core CPUs and single-threaded applications and would not have to deal with all the complexity that concurrent algorithms entail! The amount of fences are (not always, but) usually what dictates the throughput of a concurrent algorithm. This means that the placement and number of fences (ordering constraints) are vital to the algorithm, not just for correctness but also for performance reasons.

The same logic applies to durable algorithms, where the ordering constraints are typically the main performance indicator. A durable algorithm without ordering does not guarantee durability and therefore, it becomes useless. However, not all steps impose a strong ordering among each other. It becomes important when we describe the algorithm to explicitly mention this dependency of steps.

Most of us don't realize the importance of "order" because for the majority of algorithms this order is implicit. For example, consider a sequential algorithm for inserting a new node in the middle of a linked-list:
Step 1) Create a new node;
Step 2) Make the "next" pointer in the node point to the successor node in the list;
Step 3) Make The "next" pointer in the predecessor node of the list point to the new node;

This is trivially simple, but if you try use this algorithm in a system where ordering is not guaranteed, these steps may become re-ordered (from the point of view of another process/thread) and the algorithm becomes incorrect. And this is one of the main issues of concurrent algorithms and even distributed systems.

Going back to Leslie, he was the first to show that to have mutual exclusion on a concurrent system we need to have at least a store-load fence. In other words, there is a minimum amount of ordering that is needed to make an algorithm that is mutually exclusive.
A similar result exists for durable algorithms: atomic durability requires at least two constraints, one for ordering and one for synchronization (round-trip), see this post for more info.
The similarity is such that Paxos, an algorithm discovered by Leslie Lamport and used in distributed systems, can be used not just to provide failure-resilience in concurrent/distributed systems but also to provide failure-resilience for durability. And the reason behind it is somewhat related to ordering.

Unfortunately, there is no easy way or commonly accepted way of expressing ordering constraints in an algorithm and because of that, most people think that orderings constraints are "an implementation detail". Likely this happens because our human brains are so used to thinking in sequential order that we expect the steps (code) that we write in a certain order to be executed in that exact order, because that's how the physical world around us typically behaves.
The problem is, concurrent algorithms don't follow these rules and because of that, ordering must be part of a concurrent algorithm.

IMO, a concurrent algorithm without a specification of the order is an incomplete algorithm. Same thing for durable algorithms. This also implies that an optimal algorithm is one that has the least amount of ordering, or in other words, a good concurrent/durable algorithm is an algorithm with a small number of fences.

No comments:

Post a Comment