Re: Batch TIDs lookup in ambulkdelete - Mailing list pgsql-hackers

From Junwang Zhao
Subject Re: Batch TIDs lookup in ambulkdelete
Date
Msg-id CAEG8a3LzioQhng85tzKL9Btsx+xd5h+RLnzN24SgqtChDeeZAw@mail.gmail.com
Whole thread Raw
In response to Re: Batch TIDs lookup in ambulkdelete  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Batch TIDs lookup in ambulkdelete
List pgsql-hackers
Hi Masahiko,

On Sat, Jun 7, 2025 at 5:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, May 13, 2025 at 2:26 PM Matheus Alcantara
> <matheusssilv97@gmail.com> wrote:
> >
> > Hi,
> >
> > On 30/04/25 19:36, Masahiko Sawada wrote:
> > > Here are the summary of each attached patch:
> > >
> > > 0001: Introduce TIdStoreIsMemberMulti() where we can do IsMember check
> > > for multiple TIDs in one function call. If the given TIDs are sorted
> > > (at least in block number), we can save radix tree lookup for the same
> > > page entry.
> > >
> > > 0002: Convert IndexBUlkDeleteCallback() to batched operation.
> > >
> > > 0003: Use batch TIDs lookup in btree index bulk-deletion.
> > >
> > > In patch 0003, we implement batch TID lookups for both each
> > > deduplicated index tuple and remaining all regular index tuples, which
> > > seems to be the most straightforward approach. While further
> > > optimizations are possible, such as performing batch TID lookups for
> > > all index tuples on a single page, these could introduce additional
> > > overhead from sorting and re-sorting TIDs. Moreover, considering that
> > > radix tree lookups are relatively inexpensive, the benefits of sorting
> > > TIDs and using TidStoreIsMemberMulti() might be minimal. Nevertheless,
> > > these potential optimizations warrant further evaluation to determine
> > > their actual impact on performance.
> > >
> > > Also, the patch includes the batch TIDs lookup support only for btree
> > > indexes but we potentially can improve other index AMs too.
> > >
> >
> > The code looks good and also +1 for the idea. I just have some small
> > points:
> >     - Maybe it would be good to mention somewhere that
> >       IndexBulkDeleteCallback() callback returns the number of tids
> >       members found on TidStore?
> >     - The vac_tid_reaped() docs may need to be updated?
>
> Thank you for looking at the patches. I agree with the above comments.
>
> >
> > I also executed meson tests for each patch individually and the 0002
> > patch is not passing on my machine (MacOs).
> >
> > Ok:                 39
> > Expected Fail:      0
> > Fail:               271
> > Unexpected Pass:    0
> > Skipped:            22
> > Timeout:            0
> >
> > One behaviour that I found by executing the 0002 tests is that it may be
> > leaking some shared memory segments. I notice that because after
> > executing the tests I tried to re-execute based on master and all tests
> > were failing with the "Failed system call was shmget(key=97530599,
> > size=56, 03600)" error. I also checked the shared memory segments using
> > "ipcs -m" and it returns some segments which is don't returned when I
> > execute the tests on master (after cleaning the leaked memory segments)
> > and it also doesn't occur when executing based on 0001 or 0003.
> >
> > ~/d/p/batch-tids-lookup-ambulkdelete ❯❯❯ ipcs -m
> > IPC status from <running system> as of Tue May 13 18:19:14 -03 2025
> > T     ID     KEY        MODE       OWNER    GROUP
> > Shared Memory:
> > m 18087936 0x05f873bf --rw-------  matheus    staff
> > m 15925250 0x05f966fe --rw-------  matheus    staff
> > m 24248325 0x05f9677e --rw-------  matheus    staff
> > ....
> >
> > Note that the the 0003 patch don't have this issue so at the end we will
> > not have problem with this I think, but it maybe be good to mention that
> > although the patches are separate, there is a dependency between them,
> > which may cause issues on buildfarm?
>
> Thank you for the report. With the 0001 and 0002 patches, I got a
> SEGV. I've fixed this issue in the attached updated version patches.
> I've confirmed that the patches pass CI tests but I'm not sure it
> fixes the shared memory segment leak problem you reported. The
> attached patches incorporated the comments[1] from John as well.
>
> BTW I found that the constant 'maxblkno' in test_tidstore.sql actually
> equals to InvalidBlockNumber, but not MaxBlockNumber. I think it
> doesn't make sense that TidStore uses InvalidBlockNumber as the key.
> The attached 0001 patch fixes it. I think we can fix it separately on
> HEAD as well as back branches.
>
> Regards,
>
> [1] https://www.postgresql.org/message-id/CANWCAZbiJcwSBCczLfbfiPe1mET+V9PjTZv5VvUBwarLvx1Hfg@mail.gmail.com
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

+ /*
+ * We will sort the deletable array if there are existing
+ * offsets as it should be sorted in ascending order (see
+ * _bt_delitems_vacuum()).
+ */
+ need_sort = (ndeletable > 0);
+
+ ndels = callback(workbuf_htids, workbuf_nitem, workbuf_deletable,
+ callback_state);
+ if (ndels > 0)
+ {
+ for (int i = 0; i < workbuf_nitem; i++)
+ {
+ if (workbuf_deletable[i])
+ deletable[ndeletable++] = workbuf_offs[i];
+ }
+
+ if (need_sort)
+ qsort(deletable, ndeletable, sizeof(OffsetNumber), cmpOffsetNumbers);
+
+ nhtidsdead += ndels;
+ }

I think the need_sort should be calculated after the callback? Maybe just:

if (ndeletable > 1)
    qsort(deletable, ndeletable, sizeof(OffsetNumber), cmpOffsetNumbers);

I think there is no need to sort when ndeletable is 1, so here compare to 1.


--
Regards
Junwang Zhao



pgsql-hackers by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: Make tuple deformation faster
Next
From: Junwang Zhao
Date:
Subject: Re: Batch TIDs lookup in ambulkdelete