Re: Optimising compactify_tuples() - Mailing list pgsql-hackers
From:           David Rowley
Subject:        Re: Optimising compactify_tuples()
Date:
Msg-id:         CAApHDvrTqc0XvYDHuh6irteOy4e4t0JNjY6JGpsmXLnHksnyxg@mail.gmail.com
In response to: Re: Optimising compactify_tuples() (Thomas Munro <thomas.munro@gmail.com>)
Responses:      Re: Optimising compactify_tuples()
List:           pgsql-hackers
On Wed, 9 Sep 2020 at 05:38, Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Wed, Sep 9, 2020 at 3:47 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > On Tue, 8 Sep 2020 at 12:08, Thomas Munro <thomas.munro@gmail.com> wrote:
> > > One thought is that if we're going to copy everything out and back in
> > > again, we might want to consider doing it in a
> > > memory-prefetcher-friendly order. Would it be a good idea to
> > > rearrange the tuples to match line pointer order, so that the copying
> > > work and also later sequential scans are in a forward direction?
> >
> > That's an interesting idea, but wouldn't that require both the copy to
> > the separate buffer *and* a qsort? That's the worst of both
> > implementations. We'd need some other data structure too in order to
> > get the index of the sorted array by reverse lineitem order, which
> > might require an additional array and an additional sort.
>
> Well, I may not have had enough coffee yet, but I thought you'd just
> have to spin through the item IDs twice. Once to compute sum(lp_len)
> so you can compute the new pd_upper, and a second time to copy the
> tuples from their random locations on the temporary page to new
> sequential locations, so that afterwards item ID order matches offset
> order.

I think you were adequately caffeinated. You're right that this is
fairly simple to do, but it looks even simpler than looping twice over
the array. I think it's just a matter of looping over the itemidbase
backwards and putting the higher itemid tuples at the end of the page.
I've done it this way in the attached patch.

I also added a presorted path, which falls back on doing memmoves
without the temp buffer when the itemidbase array indicates that higher
lineitems all have higher offsets. I'm doing the presort check in the
calling function, since that loops over the lineitems already. When the
tuples are already in order, we can just memmove them in reverse order
without overwriting any yet-to-be-moved tuples.

Also, I added code to collapse the memcpys and memmoves for adjacent
tuples so that we perform the minimal number of calls to those
functions. Once we've previously compacted a page, the code seems able
to reduce the number of calls significantly.
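For illustration, the non-presorted copy-back loop has roughly this
shape. This is a simplified sketch rather than the exact code in the
attached patch; it is written as if it lived in bufpage.c next to the
existing compactify_tuples(), reusing the local itemIdSortData struct
(offsetindex, itemoff, alignedlen) from there:

    /*
     * Sketch: walk itemidbase backwards so the highest item ID's tuple
     * lands directly below pd_special, giving item ID order == offset
     * order afterwards.  Runs of tuples that are adjacent on the old
     * page are collapsed into a single memcpy.
     */
    static void
    compactify_sketch(itemIdSort itemidbase, int nitems, Page page)
    {
        PageHeader      phdr = (PageHeader) page;
        Offset          upper = phdr->pd_special;
        PGAlignedBlock  scratch;
        char           *src = scratch.data;
        int             i = nitems - 1;

        /* one copy of the whole page, so source offsets stay valid */
        memcpy(src, page, BLCKSZ);

        while (i >= 0)
        {
            Offset      copyoff = itemidbase[i].itemoff;
            int         copylen = itemidbase[i].alignedlen;
            int         j,
                        k;

            /*
             * Pull in preceding item IDs whose tuples directly precede
             * this chunk on the old page, so one memcpy moves the whole
             * run.
             */
            for (j = i - 1; j >= 0; j--)
            {
                if (itemidbase[j].itemoff + itemidbase[j].alignedlen !=
                    copyoff)
                    break;
                copyoff = itemidbase[j].itemoff;
                copylen += itemidbase[j].alignedlen;
            }

            upper -= copylen;
            memcpy((char *) page + upper, src + copyoff, copylen);

            /* repoint each line pointer in the chunk to its new offset */
            for (k = j + 1; k <= i; k++)
            {
                ItemId      lp = PageGetItemId(page,
                                               itemidbase[k].offsetindex + 1);

                lp->lp_off = upper + (itemidbase[k].itemoff - copyoff);
            }

            i = j;
        }

        phdr->pd_upper = upper;
    }

The win comes from turning many small memmoves from random offsets into
one bulk page copy plus a handful of larger memcpys.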
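For the presorted path mentioned above, the idea sketches out like this
(again illustrative, not the patch itself; same bufpage.c context):

    static void
    compactify_presorted_sketch(itemIdSort itemidbase, int nitems, Page page)
    {
        PageHeader  phdr = (PageHeader) page;
        Offset      upper = phdr->pd_special;
        int         i;

        /*
         * Offsets already ascend with item IDs, so compacting in
         * reverse item ID order only ever moves a tuple to a higher
         * offset; it can never overwrite a yet-to-be-moved tuple, and
         * no temp buffer is needed.  memmove copes with any overlap
         * within a single tuple's move.
         */
        for (i = nitems - 1; i >= 0; i--)
        {
            ItemId      lp = PageGetItemId(page,
                                           itemidbase[i].offsetindex + 1);

            upper -= itemidbase[i].alignedlen;
            memmove((char *) page + upper,
                    (char *) page + itemidbase[i].itemoff,
                    itemidbase[i].alignedlen);
            lp->lp_off = upper;
        }

        phdr->pd_upper = upper;
    }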
I added some logging and reviewed it after a run of the benchmark, and
saw that for about 192 tuples we're mostly just doing 3-4 memcpys in
the non-presorted path, and just 2 memmoves for the presorted code
path.

I also found that in my test the presorted path was only taken 12.39%
of the time. Trying with 120 million UPDATEs instead of 12 million in
the test reduced this to just 10.89%. It seems that it'll just be 1 or
2 tuples spoiling it, since new tuples will still be added earlier in
the page after we free up space to add more.

I also experimented to see what would happen if I collapsed the
memcpys for copying to the temp buffer as well. The performance got a
little worse from doing that, so I left that code #ifdef'd out.

With the attached v3, performance is better. The test now runs recovery
in 65.6 seconds, vs master's 148.5 seconds, so about 2.2x faster.

We should probably consider what else can be done to try to write pages
with tuples for earlier lineitems earlier in the page. VACUUM FULL and
friends will switch back to the opposite order when rewriting the heap.

Also fixed my missing libc debug symbols:

  24.90%  postgres  postgres           [.] PageRepairFragmentation
  15.26%  postgres  libc-2.31.so       [.] __memmove_avx_unaligned_erms
   9.61%  postgres  postgres           [.] hash_search_with_hash_value
   8.03%  postgres  postgres           [.] compactify_tuples
   6.25%  postgres  postgres           [.] XLogReadBufferExtended
   3.74%  postgres  postgres           [.] PinBuffer
   2.25%  postgres  postgres           [.] hash_bytes
   1.79%  postgres  postgres           [.] heap_xlog_update
   1.47%  postgres  postgres           [.] LWLockRelease
   1.44%  postgres  postgres           [.] XLogReadRecord
   1.33%  postgres  postgres           [.] PageGetHeapFreeSpace
   1.16%  postgres  postgres           [.] DecodeXLogRecord
   1.13%  postgres  postgres           [.] pg_comp_crc32c_sse42
   1.12%  postgres  postgres           [.] LWLockAttemptLock
   1.09%  postgres  postgres           [.] StartupXLOG
   0.90%  postgres  postgres           [.] ReadBuffer_common
   0.84%  postgres  postgres           [.] SlruSelectLRUPage
   0.74%  postgres  libc-2.31.so       [.] __memcmp_avx2_movbe
   0.68%  postgres  [kernel.kallsyms]  [k] copy_user_generic_string
   0.66%  postgres  postgres           [.] PageAddItemExtended
   0.66%  postgres  postgres           [.] PageIndexTupleOverwrite
   0.62%  postgres  postgres           [.] smgropen
   0.60%  postgres  postgres           [.] ReadPageInternal
   0.57%  postgres  postgres           [.] GetPrivateRefCountEntry
   0.52%  postgres  postgres           [.] heap_redo
   0.51%  postgres  postgres           [.] AdvanceNextFullTransactionIdPastXid

David