The source code can be seen on github
The algorithm is not trivial at all with many interesting subtleties, like the key and the value being in adjacent entries in the array to reduce cache-misses, or the entry on the array for each key being "immutable" once assigned.
You can check out his presentation at Google
There is another version of the same presentation given at Stanford just a few days before
Unlike what it is mentioned in some places, this data structure is not wait-free, at least not for the put() or get() methods, but it is lock-free for both.
Notice that in his presentation he mentions that he uses CAS without fences to write the key and value, which I think would not be correct, but anyways, the code on github is using the version that does not allow reordering, so that version of the code is correct (as far as I could tell)... and yes, it is correct to do the loads of the key and value in the get_impl() method without any fence or volatile semantics!
Again, nothing we didn't already know, but there are some very smart details in this data structure, so very worth the time to go through it.