During the weekend, Andreia came up with an idea that allows the implementation of the Scalable Read-Write Lock in a system that doesn't support thread-local-storage, like for example, a linux box with an old gcc where the __thread qualifier is not available.
The idea consists of searching the array of assigned_threads for the current thread's id, each time there is a read-only lock or unlock. If the thread's id is not found, then add to one of the available entries.
You can get the code here:
The cool thing about this idea is that you don't really need to do any expensive synchronization during the search of the assigned_threads array because we are interested only in the entry that has the current thread's id, which was written by the current thread and is, therefore, up-to-date.
Only when we add the thread's id for the first time to assigned_threads must it be done with a mutex or with a CompareAndSet() operation, and that's ok because it is done only once per lock instance per thread.
Here is the plot of the performance of the C implementation on my linux box:
As you can see, the performance is not as good as the one from the method using the thread-local-storage, but it is still much better than a regular read-write lock. Cool stuff :)