Sunday, March 2, 2014

Harris's Linked List

One of the most widely used concurrent data structures in research papers is the Harris Linked List.
The Harris Linked List is a linked-list-based concurrent ordered set (or map) with lock-free proprieties for insertions, deletions, and lookups.
It was published back in 2001 on the DISC conference and the author is Tim Harris, currently at Oracle.

Unless you're into reading research papers on concurrency, it is likely that you don't know what this data structure is, and the main reason is that, although it is the fastest and "simplest" concurrent list-based-set known, it is, in my opinion, mostly impractical.

Each node in the linked list has a reference to a key, and the node's relative position in the list reflect the relative order of the keys they contain. There are two special nodes which are always present (i.e. will never be deleted), one at the start of the list, head, and another at the end of the list, tail.

I see Harris's linked list as a kind of chinese jigsaw puzzle, the kind that you can play around for a while and even accidentally solve, but then when you reset it, it will take you another few hours to solve it again, only instead of having a single trick, Harris's linked list has many tricks.

One of the tricks of the Harris linked list consists of setting one bit of the pointer to the next node, to mark it as logically removed.

and here is where the problem lies.

In managed languages (like Java, Scala, C#, etc) it is not possible to directly access a bit of a pointer, and most likely this will never possible.
What is possible is to use AtomicMarkableReference or a similar technique, where a new object is created every time a CAS is performed, and this new object contains a field with the reference to the new node and a field with the bit flag, both being final. This constant creation of new objects means that implementing this technique in Java is (very) slow.
Another possibility in Java, is to use the Run Time Type Information (RTTI) and create a variant of the Harris algorithm that uses RTTI to determine the type of object each node is, something like MarkedNode or UnMarkedNode, and at runtime, replace the objects to signal a bit marked/unmarked. This technique creates a whole new level of pain because in reality, there will be a multitude of intertwined logical linked lists, and it becomes very hard to reason about it, but it is faster than the approach with java.util.concurrent's AtomicMarkableReference.
We have made implementations for both techniques and will talk about each of them on future posts.

In non-managed languages (like C, C++, etc) it is possible to directly access a bit of the pointer, but there is an issue that the managed languages don't have, namely, memory management.
On Java, we don't have to worry about memory cleanup because the JVM Garbage Collector takes care of cleaning up the unused nodes, but on C/C++ there isn't such a feature.
I'm not talking about the ABA problem here, no, that is taken care of by the Harris linked-list algorithm itself, mostly due to the fact that once a node is "marked" to be deleted it can never be "unmarked", and if the same key gets re-inserted in the list, then a new node must be created.
Now, there are some native languages like D that allow for the usage of a GC on specific pointers, but they don't support bit manipulation on those pointers, so it's the same issue even there.

The only way to have a safe implementation of the Harris linked list on a non-managed language, is to use it with a lock-free memory management technique, like "Hazard Pointers", or "Pass the Buck", or "Drop the Anchor":
the problem with this is that so far, I haven't seen any implementation of Harris with Hazard Pointers that uses the C11/C++11 memory model, and even if someone writes a good one, it won't be fast because doing a simple traversal of the list will require two globally visible store instructions with at least on release barrier per node.... which is faster than using reference counting, but still slow. Perhaps the "Drop the Anchor" technique would provide decent enough performance, but I haven't found any code for it yet, good or bad, so for the moment it's all about Hazard Pointers.
Even so, if someone knows of a decent implementation of Harris with Hazard Pointers or another technique, I would like to hear about it, so please add the link in the comments section!

In summary, if you want to implement Harris's linked list on a managed language, you'll have to use weird tricks like AtomicMarkableReference or RTTI, that will significantly affect the performance of the algorithm, or even considerably change the way the algorithm works. And if you want to implement Harris in a non-managed language, then you need a memory management technique that will also impact performance and it is non-trivial to implement, not to say extremely difficult.
Some research papers I've seen, overtake this obstacle by going with the C/C++ implementation and simply skipping the memory management altogether. They just have a lot of memory and stop their microbenchmark before it gets filled up. This is kind of cheating, but I understand them, because in the end they are trying to compare the Harris with some other algorithms for Sets and Maps and not which memory management technique to use, but sometimes I doubt how realistic or useful those benchmarks really are.
Anyways, that's why Harris's Linked List is not as popular with software engineers as it with researchers, and why there isn't an implementation of it on java.util.concurrent

No comments:

Post a Comment