Comments on Concurrency Freaks: URCU ReadersVersion in C++
Pedro Ramalhete (http://www.blogger.com/profile/01340437958052998917)
Updated 2024-01-02T11:09:02.709+01:00

Comment posted 2018-02-23T11:32:28.219+01:00

Hi, nice to hear that :)
1) If you look in the paper, figure 1:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/papers/gracesharingurcu-2017.pdf

the lines labeled "GraceVersion" correspond to this class:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/URCUGraceVersion.hpp

and the lines labeled "Two-Phase" correspond to this class:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/URCUTwoPhase.hpp

The Two-Phase algorithm can use different ReadIndicators, so we made three ReadIndicator implementations, each with different properties:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/RIAtomicCounter.hpp
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/RIAtomicCounterArray.hpp
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/CPP/papers/gracesharingurcu/RIEntryPerThread.hpp

2) The reason for the 128-byte padding is that on x86 the prefetcher can request the cache line and the next one. Each cache line is 64 bytes, so to completely avoid false sharing, each reader's state should sit on non-consecutive cache lines.
It's a trade-off: we use more memory to get better scalability for readers, and in exchange the writer/updater has to request a different cache line for each active reader. URCUs are made for read-mostly scenarios, so it should be OK for these kinds of scenarios.

Pedro Ramalhete (https://www.blogger.com/profile/01340437958052998917)

Comment posted 2018-02-23T06:58:42.695+01:00

This paper served as a great explanation of RCU in general (reading working implementations was VERY helpful).
I was curious about a few things:

1) In the GitHub repo the article points to, you have multiple implementations. There is GraceVersion, GraceVersionSyncScale, etc. Which one do the benchmarks correspond to?

2) In the C++ impl there is a padding construct within the array which seems to align things on a 128-byte (?) boundary. It was not clear to me (as a concurrency novice) what that does. Secondly, from what I can tell, most compilers won't respect the overalignment requested. At least on my Mac, I can only get up to 16-byte alignment respected properly. What does that do to the implementation shown?

S Lotia (https://www.blogger.com/profile/06774286372919819458)
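
To make the padding discussion in point 2 concrete, here is a minimal sketch of per-reader state padded to 128 bytes. It is not taken from the repository: `ReaderSlot`, `kPadBytes`, `kMaxReaders`, `slot_stride` and `slot_is_aligned` are hypothetical names chosen for illustration, and the 128-byte figure simply assumes two 64-byte cache lines as described in the reply.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical constant: two 64-byte cache lines, as described in the reply.
constexpr std::size_t kPadBytes = 128;

// Hypothetical per-reader slot; not the type used in the repository.
struct alignas(kPadBytes) ReaderSlot {
    std::atomic<std::uint64_t> version{0};
};

// sizeof is always a multiple of alignof, so array elements are 128 bytes apart.
static_assert(sizeof(ReaderSlot) == kPadBytes, "one slot per 128-byte block");
static_assert(alignof(ReaderSlot) == kPadBytes, "slot type is over-aligned");

constexpr int kMaxReaders = 4;  // hypothetical limit, for illustration only
static ReaderSlot slots[kMaxReaders];

// Distance in bytes between the state of reader a and the state of reader b.
std::ptrdiff_t slot_stride(int a, int b) {
    auto pa = reinterpret_cast<std::uintptr_t>(&slots[a].version);
    auto pb = reinterpret_cast<std::uintptr_t>(&slots[b].version);
    return static_cast<std::ptrdiff_t>(pb - pa);
}

// True when slot i starts on a 128-byte boundary.
bool slot_is_aligned(int i) {
    return reinterpret_cast<std::uintptr_t>(&slots[i]) % kPadBytes == 0;
}
```

Because `sizeof` is always a multiple of `alignof`, neighbouring slots stay 128 bytes apart even if the base address of the array ends up with weaker alignment, so two readers' states never land on the same or on adjacent 64-byte lines. For heap allocation, C++17 and later route `new` of an over-aligned type through the aligned allocation function; with older standards the base address may not be 128-byte aligned, which `slot_is_aligned` can be used to check on a given platform.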