Re: New WAL record to detect the checkpoint redo location - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: New WAL record to detect the checkpoint redo location |
Date | |
Msg-id | CA+Tgmob90XtBJ+WdSHA+y8ZikHPSxQvZwYDefEBLC7Bkwx=3-g@mail.gmail.com |
In response to | Re: New WAL record to detect the checkpoint redo location (Andres Freund <andres@anarazel.de>) |
Responses | Re: New WAL record to detect the checkpoint redo location |
List | pgsql-hackers |
On Thu, Oct 5, 2023 at 2:34 PM Andres Freund <andres@anarazel.de> wrote:
> One thing that's notable, but not related to the patch, is that we waste a
> fair bit of cpu time below XLogInsertRecord() with divisions. I think they're
> all due to the use of UsableBytesInSegment in
> XLogBytePosToRecPtr/XLogBytePosToEndRecPtr. The multiplication of
> XLogSegNoOffsetToRecPtr() also shows.

Despite what I said in my earlier email, and with a feeling like unto that created by the proximity of the sword of Damocles or some ghostly albatross, I spent some time reflecting on this. Some observations:

1. The reason why we're doing this multiplication and division is to make sure that the code in ReserveXLogInsertLocation which executes while holding insertpos_lck remains as simple and brief as possible. We could eliminate the conversion between usable byte positions and LSNs if we replaced Insert->{Curr,Prev}BytePos with LSNs and had ReserveXLogInsertLocation work out by how much to advance the LSN, but it would have to be worked out while holding insertpos_lck (or some replacement lwlock, perhaps) and that cure seems worse than the disease. Given that, I think we're stuck with converting between usable byte positions and LSNs, and that intrinsically needs some multiplication and division.

2. It seems possible to remove one branch in each of XLogBytePosToRecPtr and XLogBytePosToEndRecPtr. Rather than testing whether bytesleft < XLOG_BLCKSZ - SizeOfXLogLongPHD, we could simply increment bytesleft by SizeOfXLogLongPHD - SizeOfXLogShortPHD. Then the rest of the calculations can be performed as if every page in the segment had a header of length SizeOfXLogShortPHD, with no need to special-case the first page. However, that doesn't get rid of any multiplication or division, just a branch.

3. Aside from that, there seems to be no simple way to reduce the complexity of an individual calculation, but ReserveXLogInsertLocation does perform 3 rather similar computations, and I believe that we know that it will always be the case that *PrevPtr < *StartPos < *EndPos. Maybe we could have a fast-path for the case where they are all in the same segment. We could take prevbytepos modulo UsableBytesInSegment; call the result prevsegoff. If UsableBytesInSegment - prevsegoff > endbytepos - prevbytepos, then all three pointers are in the same segment, and maybe we could take advantage of that to avoid performing the segment calculations more than once, although we would still need to repeat the page calculations. Or, instead of that or in addition to it, I think we could by a similar technique check whether all three pointers are on the same page; if so, then *StartPos and *EndPos can be computed from *PrevPtr by just adding the difference between the corresponding byte positions. I'm not really sure whether that would come out cheaper. It's just the only idea that I have.

It did also occur to me to wonder whether the apparent delays performing multiplication and division here were really the result of the arithmetic itself being slow or whether they were synchronization-related, SpinLockRelease(&Insert->insertpos_lck) being a memory barrier just before. But I assume you thought about that and concluded that wasn't the issue here.

--
Robert Haas
EDB: http://www.enterprisedb.com
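To make observations 2 and 3 concrete, here is a minimal standalone sketch, not the PostgreSQL source. The constants (8 kB pages, 24-byte short and 40-byte long page headers, 16 MB segments) are illustrative stand-ins and are assumptions, as are the helper names seg_offset_branching, seg_offset_branchless, same_segment and offsets_agree. It models only the segment-offset part of the start-pointer conversion and does not model XLogBytePosToEndRecPtr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in values (assumptions), not read from PostgreSQL headers. */
#define XLOG_BLCKSZ          8192u
#define SIZE_OF_SHORT_PHD    24u
#define SIZE_OF_LONG_PHD     40u
#define WAL_SEGMENT_SIZE     (16u * 1024u * 1024u)

#define USABLE_BYTES_IN_PAGE (XLOG_BLCKSZ - SIZE_OF_SHORT_PHD)
#define USABLE_BYTES_IN_SEGMENT \
    ((WAL_SEGMENT_SIZE / XLOG_BLCKSZ) * USABLE_BYTES_IN_PAGE - \
     (SIZE_OF_LONG_PHD - SIZE_OF_SHORT_PHD))

/* Offset within the segment, special-casing the first page (long header). */
static uint32_t
seg_offset_branching(uint64_t bytepos)
{
    uint64_t bytesleft = bytepos % USABLE_BYTES_IN_SEGMENT;

    if (bytesleft < XLOG_BLCKSZ - SIZE_OF_LONG_PHD)
        return (uint32_t) (bytesleft + SIZE_OF_LONG_PHD);

    /* Skip the first page, then count pages that have short headers. */
    bytesleft -= XLOG_BLCKSZ - SIZE_OF_LONG_PHD;
    return (uint32_t) (XLOG_BLCKSZ +
                       (bytesleft / USABLE_BYTES_IN_PAGE) * XLOG_BLCKSZ +
                       bytesleft % USABLE_BYTES_IN_PAGE + SIZE_OF_SHORT_PHD);
}

/*
 * Observation 2: pad bytesleft by the extra length of the long header, then
 * treat every page as if it had a short header. One branch disappears; the
 * division, modulo and multiplication remain.
 */
static uint32_t
seg_offset_branchless(uint64_t bytepos)
{
    uint64_t bytesleft = bytepos % USABLE_BYTES_IN_SEGMENT;

    bytesleft += SIZE_OF_LONG_PHD - SIZE_OF_SHORT_PHD;
    return (uint32_t) ((bytesleft / USABLE_BYTES_IN_PAGE) * XLOG_BLCKSZ +
                       bytesleft % USABLE_BYTES_IN_PAGE + SIZE_OF_SHORT_PHD);
}

/*
 * Observation 3: test from the email for whether prevbytepos and endbytepos
 * (with prevbytepos <= endbytepos) fall within the same segment.
 */
static bool
same_segment(uint64_t prevbytepos, uint64_t endbytepos)
{
    uint64_t prevsegoff = prevbytepos % USABLE_BYTES_IN_SEGMENT;

    return USABLE_BYTES_IN_SEGMENT - prevsegoff > endbytepos - prevbytepos;
}

/* Exhaustively compare the two offset computations for positions below limit. */
static bool
offsets_agree(uint64_t limit)
{
    for (uint64_t pos = 0; pos < limit; pos++)
        if (seg_offset_branching(pos) != seg_offset_branchless(pos))
            return false;
    return true;
}
```

Calling offsets_agree() over a couple of segments' worth of byte positions is a cheap way to check that the two formulations coincide under these assumed constants. Whether either change is measurably faster inside ReserveXLogInsertLocation would need profiling; this sketch says nothing about that.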