Re: Frequent Update Project: Design Overview of HOTUpdates - Mailing list pgsql-hackers
From: Simon Riggs
Subject: Re: Frequent Update Project: Design Overview of HOTUpdates
Msg-id: 1163166789.3634.728.camel@silverbirch.site
In response to: Re: Frequent Update Project: Design Overview of HOT Updates ("Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at>)
List: pgsql-hackers
On Fri, 2006-11-10 at 12:32 +0100, Zeugswetter Andreas ADI SD wrote:
> > > As more UPDATEs take place these tuple chains would grow, making
> > > locating the latest tuple take progressively longer.
>
> > More generally, do we need an overflow table at all, rather
> > than having these overflow tuples living in the same file as
> > the root tuples? As long as there's a bit available to mark
> > a tuple as being this special not-separately-indexed type,
> > you don't need a special location to know what it is.
>
> Yes, I have that same impression.

I think this might work; I'll think about it. If you can come up with a test case where we would need this optimisation, then I'm sure that can be prototyped.

In many cases though an overflow relation is desirable, so ISTM that we might want to be able to do both: have an updated tuple with the not-indexed bit set in the heap if it fits on the same page, else go to the overflow relation.

> 1. It doubles the IO (original page + hot page), if the new row would
> have fit into the original page.

Yes, but that's a big if, and once the page is full that optimisation goes away. Some thoughts: First, it presumes that the second block is not in cache (see later). Second, my observation is that this can happen for some part of the time, but under heavy updates this window of opportunity is not the common case, so this optimisation would not make much difference. However, in some cases it would be appropriate, so I'll investigate.

HOT optimises the case where a tuple in the overflow relation is UPDATEd, so it can place the subsequent tuple version on the same page. So if you perform 100 UPDATEs on a single tuple, the first will write to the overflow relation in a new block, then the further 99 will attempt to write to the same block, if they can. So in many cases we would do only 101 block accesses and no real I/O.

> 2. locking should be easier if only the original heap page is involved.
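The 100-UPDATE arithmetic above can be sketched as a toy model (my own accounting for illustration, not PostgreSQL code): assume the first UPDATE touches the root tuple's heap block plus one freshly allocated overflow block, and every later UPDATE lands in that same overflow block.

```python
# Toy accounting of block accesses for repeated UPDATEs of one tuple
# under the HOT-with-overflow-relation scheme described above.
# Assumption (mine, for illustration): the first UPDATE touches the
# root tuple's heap block plus one new overflow block; every later
# UPDATE reuses that same overflow block.

def block_accesses(n_updates: int) -> int:
    if n_updates == 0:
        return 0
    first = 2             # root heap block + newly allocated overflow block
    rest = n_updates - 1  # same overflow block, reused each time
    return first + rest

print(block_accesses(100))  # 101
```

Only 2 distinct blocks are touched across all 100 UPDATEs, so both stay in shared_buffers and the 101 accesses cause no real I/O.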
Yes, but multi-page update already happens now, so HOT is no different on that point.

> 3. It makes the overflow pages really hot because all concurrent updates
> compete for the few overflow pages.

Which ensures they are in shared_buffers, rather than causing I/O. The way the FSM works, it will cause concurrent updaters to spread their writes out across many blocks. So in the case of a single high-frequency updater, all of the updates go into the same block of the overflow relation, so the optimisation you referred to in (1) does take effect strongly in that case, yet without causing contention with other updaters.

The FSM doesn't change with HOT, but the effects of having inserted additional tuples into the main heap are much harder to undo afterwards. The overflow relation varies in size according to the number of updates, not the actual number of tuples, as does the main heap, so VACUUMing will focus on the hotspots and be more efficient, especially since no indexes need be scanned. [Sure, we can use heapblock-needs-vacuum bitmaps, but there will still be a mix of updated/not-updated tuples in there, so VACUUM would still be less efficient than with HOT.] So VACUUM can operate more frequently on the overflow relation and keep its size reasonable for more of the time, avoiding I/O.

Contention is and will remain a HOT topic ;-) I understand your concerns and we should continue to monitor this in the various performance tests that will be run.

> 4. although at first it might seem so I see no advantage for vacuum with
> overflow

No need to VACUUM the indexes, which is the most expensive part. The more indexes you have, the more VACUUM costs; not so with HOT.

> 5. the size reduction of heap is imho moot because you trade it for a
> growing overflow
> (size reduction only comes from reusing dead tuples and not
> adding index tuples --> SITC)

HOT doesn't claim to reduce the size of the heap.
In the presence of a long-running transaction, SizeOfHOT(heap + overflow) = SizeOfCurrent(heap). VACUUM is still required in both cases to manage total heap size. If we have solely UPDATEs and no DELETEs, then only the overflow relation need be VACUUMed.

> Could you try to explain the reasoning behind separate overflow storage?

I think the answers above cover the main points, which seem to make the case clearly enough from a design-rationale perspective, even without the performance test results to confirm them.

> What has been stated so far was not really conclusive to me in this
> regard.

Personally, I understand. I argued against them for about a month after I first heard of the idea, but they make sense to me now. HOT has evolved considerably from the various ideas of each member of the original idea team (Jonah, Bruce, Jan, myself) and will continue to do so as better ideas replace poor ones, based on performance tests. All of the ideas within it need to be strongly challenged to ensure we arrive at the best solution.

> e.g. a different header seems no easier in overflow than in heap

True. The idea there is that we can turn frequent update on/off fairly easily for normal tables, since there are no tuple format changes in the main heap. It also allows HOT to avoid wasting space when a table is heavily updated in certain places only.

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com