In attempt to improve performance of YCSB on zipfian distribution, it were found that significant time is spent in XidInMVCCSnapshot in scanning snapshot->xip array. While overall CPU time is not too noticable, it has measurable impact on scaleability.
First I tried to sort snapshot->xip in GetSnapshotData, and search in a sorted array. But since snapshot->xip is not touched if no transaction contention occurs, sorting xip always is not best option.
Then I sorted xip array on demand in XidInMVCCSnapshot only if search in snapshot->xip occurs (ie lazy sorting). It performs much better, but since it is O(NlogN), sort's execution time is noticable for large number of clients.
Third approach (present in attached patch) is making hash table lazily on first search in xip array.
Note: hash table is not built if number of "in-progress" xids is less than 60. Tests shows, there is no big benefit from doing so (at least on Intel Xeon).
(Note: I'm not quite sure, why my numbers for master are lower than Alik's one. Probably, our test methodology differs a bit. Perhaps, it is because I don't recreate tables between client number change, but between branch switch).
These results look very cool!
I think CSN is eventually inevitable, but it's a long distance feature. Thus, this optimization could make us a good serve before we would have CSN.
Do you expect there are cases when this patch causes slowdown? What about case when we scan each xip array only once (not sure how to simulate that using pgbench)?