It seems that the current queue implementation in Akka uses this algorithm. It took me some time to understand how it works, and there are a few interesting things:
First, this queue is wait-free for push() and pop() operations.
The push() function does one XCHG and one store with release barrier, and no retries are needed.
The pop() function does stores, and that's it.
It's true that this queue is single-consumer only, but you should try to come up with a MPSC queue that is wait-free (or even lock-free) to see how hard it is. Yep, it's hard, and this guy did it, so I'm definitely impressed, and so should you!
Just to give it some perspective, the typical Michael & Scott queue which is implemented in Java's ConcurrentLinkedQueue, needs at least two CAS to insert a new node... both of which may fail and will have to be retried.
The disadvantage is, that this queue is not linearizable. In fact, it's not even sequentially consistent, or weakly consistent, or even has an happens-before relation between calls of push() and pop(). It's consistency model is serializable.
As surprising as it may seem, this is enough to get the actor model properly working and passing messages from actor to actor. I'm guessing this happens because each actor needs to see the received messages in FIFO order, but it is ok to not see those messages immediately.
It's the joy of distributed systems :)
After some comments (thanks Nitsan!) I realized that some details of this post were a bit confusing, namely, there are two variants of this MPSC queue and they have a slightly different pop().
In the implementation by Vyukov the pop()'s Progress Condition is Lock-Free and its Consistency Model is Serializable.
In the implementation in Akka, the pop()'s Progress Condition is Blocking, but its Consistency Model is Linearizable.
The Akka implementation is Blocking because it may spin eternally if the thread inserting the node blocks/crashes/sleeps before setting the next of the previous node, but Linearizability is a much stronger consistency model than Serializability, so it's a trade-off (both implementation are correct).