Tuesday, February 10, 2015
Left-Right and the (broken) peer-review system
In early September last year, we finished an updated version of the Left-Right paper. We submitted it (again) to another of the major conferences in the field and (again) it came back with lots of weak-rejects. Here is the updated paper on GitHub in case you're interested:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/papers/left-right-2014.pdf
The reviewers were unanimous about the novelty of this technique, and most were encouraging in their statements, suggesting that we continue to improve the paper until it is "ready for a conference". The problem is, it will never be "ready for a conference", because they ask for opposite things, or for just too much work that we don't have the time or patience to do. This means that the Left-Right paper will most likely never be shown at a major computing conference, and that, my friends, is a big shame, for multiple reasons.
The first reason is time: most people in academia (i.e. the people peer-reviewing papers for conferences) are either a PhD student, or a professor at a university with one or several PhD students working under them.
If they're a PhD student, then they have a big slice of their time devoted to working on papers, unless their PhD supervisor is a jackass and suckered them into babysitting his kids or some other task completely unrelated to the topic of the student's PhD thesis. They still have to attend and sometimes give classes, which can suck up a good deal of time, but PhD students aren't expected to have any personal time anyway, so that's life.
If they're a professor at a university, they will probably not have too much time (between classes, conferences, committees, and whatnot), but they will have one or several PhD students to do the "grunt work" on the papers. All the professor has to do is have a few somewhat original ideas and pick at least one PhD student who is not completely dumb and is capable of doing the work, or of explaining it to the other PhD students. He then reviews the draft and corrects all the crap the PhD students write, points them in the right direction, adds the right references to previous work, and the final result is a decent enough paper, which doesn't even have to be very innovative, as long as it has "the right stuff". If they're really lucky, the professor has a post-doc who does this last part of the work, reviewing the paper and doing the multiple iterations, and all the professor has to do is look at the almost finished version and make some minor changes.
A much smaller group consists of people working in companies but still tied to academic life or to research (like in research labs). I guess those have to show some results, and they have a lot more tasks than just writing papers, but getting papers out is one way of "showing work done" and keeping their job, so they do it. I think these folks usually focus on stuff that is more on the practical side, i.e. more "engineering research" and less "pie in the sky research", but that's just my impression, and I'm sure it varies a lot from company to company and research group to research group.
And then there are people like me and Andreia. We don't have a background in this particular field (concurrency and synchronization algorithms): Andreia has a master's in Applied Mathematics with a post-grad in Information Systems, and I have a PhD in Physics (but I spent most of my PhD doing non-parametric low-sample statistical analysis and developing new statistical techniques and algorithms), so we're only "amateurs" in this field. But we do have many years of experience in Java/C/C++ and we both like studying algorithms, so we ended up here.
We have a day job, and it's not really related to these topics, which means that whatever time we devote to concurrent algorithms must come out of our spare/leisure time. Because we enjoy it so much, we do spend a big chunk of it doing research, but with a 4-month-old baby, our spare/leisure time is getting shorter and shorter (as is to be expected).
This means that, no, dear reviewers, we won't do all that insane amount of work you suggest. The Left-Right paper explains the algorithm, has the source code, has microbenchmarks for a specific use-case, and even has a proof of correctness... there's just not much more we can do, even if we wanted to!
The second reason is that they don't get it: the first time we submitted the paper to a conference, a few reviewers thought the Left-Right was some kind of new concurrent tree data structure. I must admit that we're definitely not the best people in the world at writing papers, but obviously those reviewers just skimmed the paper and "didn't get" what it was about.
This time, none of them thought it was a new kind of tree, but almost all suggested that it should be compared against RCU. Saying that is about the same as you coming up with a new algorithm for a Reader-Writer Lock, and then someone telling you that you should compare your Reader-Writer Lock with Hazard Pointers... duhhh. This only shows that they either don't know what Left-Right is, or they don't know what RCU is, or both. I don't blame them for not knowing this, because both are non-trivial concepts, but it does upset me that they call themselves "experts" in the field, and then don't know what these things are about.
The Left-Right pattern is a new kind of "synchronization primitive" or "concurrency technique" or whatever you want to call it, and the closest thing to it is indeed a Reader-Writer Lock, as we mention in the paper. Comparing it with a Reader-Writer Lock is a reasonable suggestion because we can use the Left-Right pattern in almost all the same scenarios where we use a Reader-Writer Lock, and indeed we do this comparison in our microbenchmarks.
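To make that comparison concrete, here is a minimal sketch in Java of the classic Left-Right variant. This is just a sketch, not the exact code from the paper: the names LeftRight, applyRead() and applyWrite() are made up for this post, and the single-counter read indicator is the simplest (and least scalable) choice; a real deployment would use a more scalable read indicator.

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;
import java.util.function.Function;

public final class LeftRight<T> {
    private final Object[] instances;  // the two instances: [0] = left, [1] = right
    private final AtomicInteger leftRight = new AtomicInteger(0);    // which instance readers go to
    private final AtomicInteger versionIndex = new AtomicInteger(0); // which read-indicator readers use
    private final AtomicLong[] readIndicator =
            { new AtomicLong(0), new AtomicLong(0) };  // one arrive()/depart() counter per version
    private final ReentrantLock writersMutex = new ReentrantLock();

    public LeftRight(T leftInstance, T rightInstance) {
        instances = new Object[] { leftInstance, rightInstance };
    }

    // Wait-free read-only access: arrive() on the current version, read, depart().
    // Readers never block, no matter what the writers are doing.
    @SuppressWarnings("unchecked")
    public <R> R applyRead(Function<T, R> readOp) {
        final int vi = versionIndex.get();
        readIndicator[vi].incrementAndGet();            // arrive()
        try {
            return readOp.apply((T) instances[leftRight.get()]);
        } finally {
            readIndicator[vi].decrementAndGet();        // depart()
        }
    }

    // Writers are serialized by a mutex and apply the same mutation to both instances in turn.
    @SuppressWarnings("unchecked")
    public void applyWrite(Consumer<T> writeOp) {
        writersMutex.lock();
        try {
            final int lr = leftRight.get();
            writeOp.accept((T) instances[1 - lr]); // mutate the instance readers are NOT using
            leftRight.set(1 - lr);                 // new readers now go to the mutated instance
            toggleVersionAndWait();                // wait for readers still on the old instance
            writeOp.accept((T) instances[lr]);     // now it is safe to mutate the other instance
        } finally {
            writersMutex.unlock();
        }
    }

    // Toggle the version and wait until no reader is left using the old instance.
    private void toggleVersionAndWait() {
        final int prevVi = versionIndex.get();
        final int nextVi = 1 - prevVi;
        while (readIndicator[nextVi].get() != 0) Thread.yield();
        versionIndex.set(nextVi);                  // readers arriving from now on use nextVi
        while (readIndicator[prevVi].get() != 0) Thread.yield();
    }
}

Usage then mirrors a Reader-Writer Lock: lr.applyRead(map -> map.get(key)) takes the place of a read-lock/unlock pair, and lr.applyWrite(map -> map.put(key, value)) takes the place of a write-lock/unlock pair.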
Comparing the Left-Right with RCU is not very reasonable, because Left-Right is meant to be used in a very different way. And yes, it can be used in a similar way, and in fact I believe it can even be used to implement a generic RCU API, but that's beside the point... it's like saying that a lock-free queue/list can be used to implement a Reader-Writer Lock (like in the MCS algorithm), and as such, we should compare the two with each other... how would you even do that? It doesn't make sense.
The goal of the Left-Right paper is to explain the Left-Right pattern and the concurrency control algorithm behind it. It's impossible to "predict" all the applications that people will put it to. The Left-Right pattern can be applied to any data structure, just like you can apply a Reader-Writer Lock to any data structure. If you discover a new algorithm for a Reader-Writer Lock, does that mean that to write a paper about it you have to apply it to ALL the data structures the reviewers know about, and compare it against all the lock-free data structures that exist?
Well then, if that's true, don't expect any new Reader-Writer Lock to ever be discovered, or at least published.
How's that for stifling innovation?
And that takes me to the third reason: the peer-review system is stifling true innovation. Let me illustrate in another way by telling a personal story.
Last May, I challenged Andreia to come up with a technique for a Reader-Writer Lock that, using a particular kind of algorithm, would be able to scale with readers just as well as the C-RW-WP, while being starvation-free all around, i.e. starvation-free readers-to-writers, writers-to-readers, readers-to-readers, and writers-to-writers. For those of you unaware, this is kind of a "Holy Grail" of Reader-Writer Locks, because it provides good reader scalability but has no preference for either readers or writers, which gives good latency guarantees (as good as a Reader-Writer Lock can provide). As it so happens, she "took the bait" and did come up with such an algorithm, which we then expanded into multiple variants, implemented in multiple languages, ran in microbenchmarks, made plots for, and even made a nice PowerPoint presentation about, with a few animations, that we showed to a couple of people.
We haven't written a paper about it yet, mostly because our time has gone into fixing up the other papers we're trying to submit in order to please conference reviewers.
One more example, just to make my point: last January (more than a year ago) we had an idea for a lock-free linked list that is kind of like Harris's linked list, but easy to implement in any language and easy to explain. We implemented it in 4 different languages, benchmarked it, even used it as a building block for other, more complex data structures... but we didn't write a paper.
Why? Because it would take a considerable amount of time, and it would most likely not be accepted, because reviewers would want it compared with lock-free hashset x and treeset y, because hey, it's (also) a "set", so why not compare it against a "hashset"?
I could go on with more examples, but I'll spare you.
There's no point in searching GitHub or the Concurrency Freaks site for the Reader-Writer Lock or the lock-free linked list mentioned above, because we haven't made them public. The reason for that is related to the time it takes to write the papers, and to reviewers that stifle innovation, but also to the next reason...
The fourth and final reason is someone else stealing your work: Andreia and I work hard on these topics, spending most of our personal time, having long discussions over lunch, over dinner, and throughout the night, until we figure out the best way to do a certain algorithm, or to explain why a certain technique gives the results it gives. We write thousands of lines of code until the algorithm is fine-tuned and free of bugs, we write benchmarks and stress tests, we add and remove debugging code, we write new ideas just to test some behavior or hypothesis.
We don't get paid to do this, we don't patent our discoveries (yes, they are "discoveries", not "inventions"), we make the source code available in multiple languages, and we provide plots with benchmark results obtained on multiple architectures (as many machines as we have access to).
You're free to use the algorithms we make available to the public for whatever purpose you want, including getting rich from them (if you manage to do so, then you certainly deserve it, lol), and the only thing we ask in return is to be given credit for them. That's not much to ask, is it?
Sometimes different groups discover the same ideas at almost the same time; that's normal, and whoever publishes first takes the credit for it, and that's the way it should be (one example that happened to us was the C-RW-WP by the Oracle research labs, which we had discovered independently as the Scalable Reader-Writer Locks). This is not a problem, and it happens more often than we might think, probably because some ideas depend on previous work, and only once that work is made public does it "unlock" the ideas that build on it.
What is problematic is when you come up with a new idea, make it publicly available, and then someone comes along and claims it as their own (like WriterReaderPhaser).
There is a well-known psychological effect that happens when you read or hear about an idea or concept and then kind of forget about it, until later you "come up with it" again, or with some small variation of it, and you think it is new. Our brain can trick us that way, because we have a tendency to make good ideas our own, and personally, I'm probably guilty of this sin on more than one occasion, like for example this one, which is just an implementation of the C-RW-NP; but once someone points out that an idea is already known, we take the time to go over it carefully to understand if that is really so. The truth is, it's really hard to know everything that has been published, and even harder for stuff that is publicly known but not published in mainstream venues, even in a field like this one, which isn't really all that extensive. Because of these factors, it becomes easy to copy someone else's idea (on purpose or inadvertently), or simply to re-invent the wheel.
Of course, we could do like Dmitry Vyukov: he publishes his ideas and algorithms on his website and doesn't waste time trying to please reviewers that don't understand what he's talking about. I'm afraid I don't have the "cojones" he has, because if we put all our stuff on our website and not in papers, then it is easy for someone else to come along and say it was their idea, and there is not much we can do about it.
This effect is so strong that some researchers publish papers that are deliberately made complex, so that it is hard to understand all the details or even to copy the idea. And let's not even go into making source code available, because apart from a few great exceptions (with The Art of Multiprocessor Programming being the best example of source code sharing), most papers don't come with source code.
What good is it to publish an algorithm if you don't provide the source code? I thought that one of the basic tenets of the Scientific Method is Reproducibility, and how can you reproduce a certain result in this field without (at least) the source code?
For the moment, the only way we have found to protect our long and hard work is to write a paper about it and get it published, but the truth is, this is taking so much of our time that all of our great ideas are getting left behind, and it frustrates me a lot. I guess this is not just a critique of the whole peer-review system, but also of our own ability to write scientific papers; either way, the end result is that innovation is being stifled.
How many engineers who come up with new innovations don't even bother to publish them anywhere, or even to spend time thinking about these things, because they know they won't get any benefit from it; most likely, they won't even get the credit for it?
It's all about the Incentives, and in our case, it's mostly about the Passion, because the incentives are nearly zero.
So that's it. If you don't see many updates to this blog in the coming times, it doesn't mean we gave up, it just means we're spending our time exploring new ideas or writing some papers about them ;)
I'll start off by saying that I've never reviewed your paper, but I can see why reviewers view it as similar to RCU.
Writers in your system currently have to wait for the readers to get out of the new write side. You could imagine optimizing this case by having a queue of data items that you modify, and the writer goes around this queue in a round-robin fashion. The thread states could include which item of the queue they are reading. A given writer could then clear many elements of this queue for future writers. If you make the queue large enough, you could minimize writers waiting.
Now you take the next step and make this queue unbounded, and use thread local flags to figure out whether the data structures are free. Now you are at a user-space RCU implementation.
The Left-Right technique is composed of two innovations:
1. An "algorithm" where Readers arrive()/depart() and Writers wait for the Readers;
2. A "mechanism with two instances" associated with the previous algorithm
Yes, it is true that the "algorithm" can be replaced with Userspace RCU (to be more specific, either the "General Purpose" or "Bullet Proof" variants), but the "mechanism with two instances" is something new, and without the combination of the two things (the algorithm and the two instances) it is not possible to provide fast linearizable wait-free read-only access to any object or data structure, which is what Left-Right does.
If I'm wrong, would you be so kind as to indicate in which section of the Userspace RCU paper they describe a technique with two instances in order to provide wait-free access (without needing a Linux kernel that manages the threads for you)?
https://www.efficios.com/pub/rcu/urcu-main.pdf
Userspace RCU by itself is a memory reclamation system which is meant to be used in data structures that are already themselves lock-free (or wait-free). URCU by itself does not provide wait-free progress to a data structure that is single-threaded... unless it is coupled with a Read-Copy-Update technique (called Copy-On-Write in the Java world) where, every time a modification is done, a complete new copy of the data structure is made, but this is waaaayyyy slower than Left-Right.
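To make that last contrast concrete, here is a minimal sketch in Java of the Copy-On-Write approach just described (the class name CopyOnWriteMap and its methods are made up for this reply): readers just dereference the current snapshot, but every single write pays for a complete copy of the data structure.

import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Function;

public final class CopyOnWriteMap<K, V> {
    private final AtomicReference<TreeMap<K, V>> ref =
            new AtomicReference<TreeMap<K, V>>(new TreeMap<K, V>());

    // Wait-free read: just dereference the current (immutable) snapshot.
    public <R> R applyRead(Function<TreeMap<K, V>, R> readOp) {
        return readOp.apply(ref.get());
    }

    // Every mutation clones the ENTIRE map, mutates the clone, then publishes it.
    public void applyWrite(Consumer<TreeMap<K, V>> writeOp) {
        TreeMap<K, V> current, copy;
        do {
            current = ref.get();
            copy = new TreeMap<K, V>(current);   // full O(n) copy on every single write
            writeOp.accept(copy);
        } while (!ref.compareAndSet(current, copy)); // retry if another writer got in first
    }
}

In Java the GC reclaims the old snapshots; in C that reclamation is precisely the job that URCU performs, while Left-Right avoids the per-write full copy altogether by applying the mutation twice.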
Thanks! It's always good to get positive feedback ;)
We're still working on lots of new stuff, so more cool techniques are on their way.