Re: Batch TIDs lookup in ambulkdelete - Mailing list pgsql-hackers
From: Junwang Zhao
Subject: Re: Batch TIDs lookup in ambulkdelete
Date:
Msg-id: CAEG8a3LzioQhng85tzKL9Btsx+xd5h+RLnzN24SgqtChDeeZAw@mail.gmail.com
In response to: Re: Batch TIDs lookup in ambulkdelete (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses: Re: Batch TIDs lookup in ambulkdelete
List: pgsql-hackers
Hi Masahiko,

On Sat, Jun 7, 2025 at 5:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, May 13, 2025 at 2:26 PM Matheus Alcantara
> <matheusssilv97@gmail.com> wrote:
> >
> > Hi,
> >
> > On 30/04/25 19:36, Masahiko Sawada wrote:
> > > Here is a summary of each attached patch:
> > >
> > > 0001: Introduce TidStoreIsMemberMulti(), which does the IsMember check
> > > for multiple TIDs in one function call. If the given TIDs are sorted
> > > (at least by block number), we can save radix tree lookups for the
> > > same page entry.
> > >
> > > 0002: Convert IndexBulkDeleteCallback() to a batched operation.
> > >
> > > 0003: Use batch TIDs lookup in btree index bulk-deletion.
> > >
> > > In patch 0003, we implement batch TID lookups both for each
> > > deduplicated index tuple and for all remaining regular index tuples,
> > > which seems to be the most straightforward approach. While further
> > > optimizations are possible, such as performing batch TID lookups for
> > > all index tuples on a single page, these could introduce additional
> > > overhead from sorting and re-sorting TIDs. Moreover, considering that
> > > radix tree lookups are relatively inexpensive, the benefits of sorting
> > > TIDs and using TidStoreIsMemberMulti() might be minimal. Nevertheless,
> > > these potential optimizations warrant further evaluation to determine
> > > their actual impact on performance.
> > >
> > > Also, the patch includes batch TIDs lookup support only for btree
> > > indexes, but we can potentially improve other index AMs too.
> > >
> >
> > The code looks good and also +1 for the idea. I just have some small
> > points:
> > - Maybe it would be good to mention somewhere that the
> >   IndexBulkDeleteCallback() callback returns the number of TIDs
> >   found in the TidStore?
> > - The vac_tid_reaped() docs may need to be updated?
>
> Thank you for looking at the patches. I agree with the above comments.
> > I also executed meson tests for each patch individually and the 0002
> > patch is not passing on my machine (macOS).
> >
> > Ok: 39
> > Expected Fail: 0
> > Fail: 271
> > Unexpected Pass: 0
> > Skipped: 22
> > Timeout: 0
> >
> > One behaviour I found while executing the 0002 tests is that it may be
> > leaking shared memory segments. I noticed that because after executing
> > the tests I tried to re-execute based on master and all tests failed
> > with the "Failed system call was shmget(key=97530599, size=56, 03600)"
> > error. I also checked the shared memory segments using "ipcs -m" and it
> > returns some segments that are not returned when I execute the tests on
> > master (after cleaning the leaked memory segments), and it also doesn't
> > occur when executing based on 0001 or 0003.
> >
> > ~/d/p/batch-tids-lookup-ambulkdelete ❯❯❯ ipcs -m
> > IPC status from <running system> as of Tue May 13 18:19:14 -03 2025
> > T        ID     KEY        MODE       OWNER    GROUP
> > Shared Memory:
> > m 18087936 0x05f873bf --rw------- matheus  staff
> > m 15925250 0x05f966fe --rw------- matheus  staff
> > m 24248325 0x05f9677e --rw------- matheus  staff
> > ....
> >
> > Note that the 0003 patch doesn't have this issue, so in the end I think
> > we will not have a problem with this, but it may be good to mention that
> > although the patches are separate, there is a dependency between them,
> > which may cause issues on the buildfarm?
>
> Thank you for the report. With the 0001 and 0002 patches, I got a
> SEGV. I've fixed this issue in the attached updated version patches.
> I've confirmed that the patches pass CI tests, but I'm not sure it
> fixes the shared memory segment leak problem you reported. The
> attached patches incorporate the comments[1] from John as well.
>
> BTW I found that the constant 'maxblkno' in test_tidstore.sql actually
> equals InvalidBlockNumber, not MaxBlockNumber. I think it doesn't
> make sense for TidStore to use InvalidBlockNumber as the key.
> The attached 0001 patch fixes it. I think we can fix it separately on
> HEAD as well as back branches.
>
> Regards,
>
> [1] https://www.postgresql.org/message-id/CANWCAZbiJcwSBCczLfbfiPe1mET+V9PjTZv5VvUBwarLvx1Hfg@mail.gmail.com
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

+            /*
+             * We will sort the deletable array if there are existing
+             * offsets as it should be sorted in ascending order (see
+             * _bt_delitems_vacuum()).
+             */
+            need_sort = (ndeletable > 0);
+
+            ndels = callback(workbuf_htids, workbuf_nitem, workbuf_deletable,
+                             callback_state);
+            if (ndels > 0)
+            {
+                for (int i = 0; i < workbuf_nitem; i++)
+                {
+                    if (workbuf_deletable[i])
+                        deletable[ndeletable++] = workbuf_offs[i];
+                }
+
+                if (need_sort)
+                    qsort(deletable, ndeletable, sizeof(OffsetNumber),
+                          cmpOffsetNumbers);
+
+                nhtidsdead += ndels;
+            }

I think need_sort should be calculated after the callback? Maybe just:

    if (ndeletable > 1)
        qsort(deletable, ndeletable, sizeof(OffsetNumber), cmpOffsetNumbers);

There is no need to sort when ndeletable is 1, hence the comparison
against 1 here.

--
Regards
Junwang Zhao