Re: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers
From: Tom Lane
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date:
Msg-id: 1801920.1596217177@sss.pgh.pa.us
In response to: Re: [Patch] Optimize dropping of relation buffers using dlist (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: [Patch] Optimize dropping of relation buffers using dlist
List: pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> Unfortunately, I don't have time for detailed review of this. I am
> suspicious that there are substantial performance regressions that you
> just haven't found yet. I would not take the position that this is a
> completely hopeless approach, or anything like that, but neither would
> I conclude that the tests shown so far are anywhere near enough to be
> confident that there are no problems.

I took a quick look through the v8 patch, since it's marked RFC, and my
feeling is about the same as Robert's: it is just about impossible to
believe that doubling (or more) the amount of hashtable manipulation
involved in allocating a buffer won't hurt common workloads. The
offered pgbench results don't reassure me; we've so often found that
pgbench fails to expose performance problems, except maybe when it's
used just so.

But aside from that, I noted a number of things I didn't like a bit:

* The amount of new shared memory this needs seems several orders of
magnitude higher than what I'd call acceptable: according to my
measurements it's over 10KB per shared buffer! Most of that is going
into the CachedBufTableLock data structure, which seems fundamentally
misdesigned --- how could we be needing a lock per map partition *per
buffer*? For comparison, the space used by buf_table.c is about 56
bytes per shared buffer; I think this needs to stay at least within
hailing distance of there.

* It is fairly suspicious that the new data structure is manipulated
while holding per-partition locks for the existing buffer hashtable.
At best that seems bad for concurrency, and at worst it could result
in deadlocks, because I doubt we can assume that the new hash table
has partition boundaries identical to the old one.

* More generally, it seems like really poor design that this has been
written completely independently of the existing buffer hash table.
Can't we get any benefit by merging them somehow?

* I do not like much of anything in the code details. "CachedBuf" is
as unhelpful as could be as a data structure identifier --- what
exactly is not "cached" about shared buffers already? "CombinedLock"
is not too helpful either, nor could I find any documentation
explaining why you need to invent new locking technology in the first
place. At best, CombinedLockAcquireSpinLock seems like a brute-force
approach to an undocumented problem.

* The commentary overall is far too sparse to be of any value ---
basically, any reader will have to reverse-engineer your entire design.
That's not how we do things around here. There should be either a
README, or a long file header comment, explaining what's going on, how
the data structure is organized, and what the locking requirements are.
See src/backend/storage/buffer/README for the sort of documentation
that I think this needs.

Even if I were convinced that there's no performance gotchas, I
wouldn't commit this in anything like its current form.

Robert again:
> Also, systems with very large shared_buffers settings are becoming
> more common, and probably will continue to become more common, so I
> don't think we can dismiss that as an edge case any more. People don't
> want to run with an 8GB cache on a 1TB server.

I do agree that it'd be great to improve this area. Just not convinced
that this is how.

			regards, tom lane
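For context on the "doubling the hashtable manipulation" point, the
sketch below abridges the buffer-mapping traffic that allocating a
victim buffer already incurs today. The BufTable* calls are the real
entry points from src/backend/storage/buffer/buf_table.c; the
surrounding variables (newTag, oldTag, oldHash, victim_buf_id) stand in
for state that BufferAlloc() in bufmgr.c maintains, and locking and
retry logic are omitted. A second, independent hash table would have
to mirror each of these steps on every allocation.

    uint32  newHash = BufTableHashCode(&newTag);
    int     buf_id  = BufTableLookup(&newTag, newHash);    /* 1: already mapped? */

    if (buf_id < 0)
    {
        /*
         * 2: claim the new tag for our victim buffer; returns -1 when
         * the insert succeeds, or the existing buffer ID if another
         * backend beat us to it.
         */
        buf_id = BufTableInsert(&newTag, newHash, victim_buf_id);

        if (buf_id < 0)
        {
            /* 3: the victim is ours; remove its old mapping */
            BufTableDelete(&oldTag, oldHash);
        }
    }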
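On the space-accounting point, the ~56-bytes-per-buffer baseline comes
from the stock sizing code. The entry struct and BufTableShmemSize()
below reproduce what buf_table.c actually does; the reporting function
at the end is purely an illustrative addition, not core code.

    /* In buf_table.c, a hash entry is just the tag plus a buffer ID: */
    typedef struct
    {
        BufferTag   key;        /* tag of a disk page */
        int         id;         /* associated buffer ID */
    } BufferLookupEnt;

    /* ...and the table's shared-memory estimate is a single call,
     * letting hash_estimate_size() add dynahash's bucket, segment,
     * and entry overhead on top of sizeof(BufferLookupEnt): */
    Size
    BufTableShmemSize(int size)
    {
        return hash_estimate_size(size, sizeof(BufferLookupEnt));
    }

    /* Illustrative only (not in core): per-buffer overhead of the
     * existing table.  StrategyShmemSize() really does pass
     * NBuffers + NUM_BUFFER_PARTITIONS when sizing it. */
    static void
    report_buftable_overhead(void)
    {
        Size    sz = BufTableShmemSize(NBuffers + NUM_BUFFER_PARTITIONS);

        elog(LOG, "buf_table.c shmem: %zu bytes, ~%zu per buffer",
             sz, sz / NBuffers);
    }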
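On the deadlock worry: core already has to hold two buffer-mapping
partition locks at once inside BufferAlloc(), and it avoids deadlock by
imposing a fixed acquisition order on them, as in the bufmgr.c idiom
below. Unless the patch's table shares partition boundaries with the
existing one, its partition locks would have to be folded into one such
global ordering across both tables, which holding the new locks while
already inside an old-table partition lock does not provide.

    LWLock *oldPartitionLock = BufMappingPartitionLock(oldHash);
    LWLock *newPartitionLock = BufMappingPartitionLock(newHash);

    /*
     * Acquire both mapping partition locks in a fixed order, so that
     * two backends moving buffers between the same pair of partitions
     * can never wait on each other in a cycle.
     */
    if (oldPartitionLock < newPartitionLock)
    {
        LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
        LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
    }
    else if (oldPartitionLock > newPartitionLock)
    {
        LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
        LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
    }
    else
    {
        /* old and new tags hash to the same partition */
        LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
    }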