Sunday, April 20, 2014

Native Code Performance on Modern CPUs

...and speaking of BUILD 2014, Eric Brumer did a cool presentation on native code performance on modern CPUs.
https://channel9.msdn.com/Events/Build/2014/4-587


He covered some new things like fused multiply-add (FMA), AVX2, vectorization, store buffers, store-to-load forwarding, etc.
My favourite slide in the entire presentation is the one for item #2, where he shows that performance on modern CPUs is memory bound (which should be true for most workloads).

Most people in the concurrency world are already aware of this, but it seems to hold even for single-threaded code, mostly (but not only) due to vectorization.

I think he distilled a very good idea out of it: we should pay attention to where the loads and stores are located in our code. That is not an easy task for a developer, but it is a necessary one for those of us who want to write high-performance code.
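
To make the point about loads a bit more concrete, here is a small C++ sketch. It is not taken from the presentation; the function names and the matrix layout are my own assumptions for illustration. Both functions do exactly the same additions over a matrix stored in row-major order, and the only difference is the order in which the loads touch memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: the matrix is a flat vector in row-major order,
// so element (r, c) lives at index r * cols + c.

// Walks memory sequentially: consecutive loads fall on the same cache line,
// and the inner loop is a natural candidate for vectorization.
double sum_row_major(const std::vector<double>& m,
                     std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    double total = 0.0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            total += m[r * cols + c];
        }
    }
    return total;
}

// Same arithmetic, but each load is `cols` elements away from the previous
// one, so for a large matrix nearly every load touches a different cache line.
double sum_column_major(const std::vector<double>& m,
                        std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    double total = 0.0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            total += m[r * cols + c];
        }
    }
    return total;
}
```

On a matrix much larger than the caches, the second version is typically the slower one even though the instruction count is practically the same, which is the kind of thing you only notice when you look at where the loads are. How big the difference is depends on the CPU and the matrix size, so measure it on your own hardware.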
