Hi, nice to hear that :) 1) If you look in the p...

2018-02-23T11:32:28.219+01:00

Hi, nice to hear that :)

1) If you look in the paper, figure 1
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/papers/gracesharingurcu-2017.pdf

the lines labeled "GraceVersion" correspond to this class:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/URCUGraceVersion.hpp

and the lines labeled "Two-Phase" correspond to this class:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/URCUTwoPhase.hpp

The Two-Phase algorithm can use different ReadIndicators, so we made three ReadIndicator implementations, each with different properties:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/RIAtomicCounter.hpp
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/RIAtomicCounterArray.hpp
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/RIEntryPerThread.hpp
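To give a rough idea of what a ReadIndicator is, here is a minimal sketch of the simplest possible variant: a single shared counter of active readers. The class name and the method names `arrive()`, `depart()` and `isEmpty()` are assumptions made for illustration, not a copy of the code in the files linked above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical minimal read indicator: one shared counter of active readers.
// All names here are illustrative assumptions, not the repository's API.
class CounterReadIndicator {
    std::atomic<long> count{0};
public:
    // A reader announces that it is entering its read-side critical section.
    void arrive() noexcept { count.fetch_add(1); }
    // A reader announces that it has left its read-side critical section.
    void depart() noexcept { count.fetch_sub(1); }
    // An updater asks whether any reader is currently inside.
    bool isEmpty() const noexcept { return count.load() == 0; }
};

// Usage sketch: the indicator is empty again once every reader has departed.
inline void exampleUsage() {
    CounterReadIndicator ri;
    ri.arrive();
    assert(!ri.isEmpty());
    ri.depart();
    assert(ri.isEmpty());
}
```

With a single counter every reader writes to the same cache line, which is simple but contended; spreading the state over an array or over one entry per thread trades memory for less contention.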

2) The reason for the 128-byte padding is that on x86 the prefetcher can request a cache line together with the next one. Each cache line is 64 bytes, so to completely avoid false sharing, each reader's state should sit on non-consecutive cache lines.
It's a trade-off: we use more memory to get better scalability for readers, at the cost of making the writer/updater request a different cache line for each active reader. URCUs are made for read-mostly scenarios, so it should be OK for these kinds of workloads.
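As a rough illustration of the padding, here is a minimal sketch, assuming 64-byte cache lines and C++17. The names `ReaderSlot`, `kPadBytes`, `isPadAligned` and `makeSlots` are hypothetical and do not come from the repository.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Assumption: 64-byte cache lines, so each reader gets two lines (128 bytes).
constexpr std::size_t kPadBytes = 128;

// Hypothetical per-reader slot. alignas also rounds sizeof up to 128, so
// consecutive array elements never fall in the same 128-byte block.
struct alignas(kPadBytes) ReaderSlot {
    std::atomic<std::uint64_t> state{0};
};

static_assert(sizeof(ReaderSlot) == kPadBytes, "one slot per 128-byte block");
static_assert(alignof(ReaderSlot) == kPadBytes, "slot starts on a 128-byte boundary");

// Returns true if p sits on a 128-byte boundary.
inline bool isPadAligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kPadBytes == 0;
}

// Since C++17, new[] on an over-aligned type goes through the aligned
// allocation functions, so the heap array should start on a 128-byte boundary.
inline std::unique_ptr<ReaderSlot[]> makeSlots(std::size_t n) {
    return std::make_unique<ReaderSlot[]>(n);
}
```

Calling `isPadAligned(&slots[0])` on an array returned by `makeSlots` is a quick way to check whether a given compiler and standard library honor the requested alignment. If they do not, the algorithm's correctness should be unaffected, since the padding only exists to avoid false sharing; the expected cost is lower reader scalability.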

This paper served as a great explanation of RCU in...

2018-02-23T06:58:42.695+01:00

This paper served as a great explanation of RCU in general (reading working implementations was VERY helpful). I was curious about a few things:

1) In the GitHub repo the article points to, you have multiple implementations: GraceVersion, GraceVersionSyncScale, etc. Which one do the benchmarks correspond to?

2) The C++ implementation uses a padding construct within the array, which seems to be trying to align things on a 128-byte (?) boundary. It was not clear to me (as a concurrency novice) what that does. Secondly, from what I can tell, most compilers won't respect the requested over-alignment. At least on my Mac, I can only get up to 16-byte alignment respected properly. What does that do to the implementation shown?

Comments on Concurrency Freaks: URCU ReadersVersion in C++
