On Tue, May 13, 2025 at 2:26 PM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
>
> Hi,
>
> On 30/04/25 19:36, Masahiko Sawada wrote:
> > Here are the summary of each attached patch:
> >
> > 0001: Introduce TidStoreIsMemberMulti(), which does the IsMember
> > check for multiple TIDs in one function call. If the given TIDs are
> > sorted (at least by block number), we can avoid repeating the radix
> > tree lookup for the same page entry.
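> >
> > For illustration, the semantics are roughly equivalent to the naive
> > sketch below (not the patch's implementation; the real function
> > shares the per-block radix tree lookup across consecutive TIDs on
> > the same block instead of repeating it for every TID):
> >
> > static int
> > TidStoreIsMemberMulti(TidStore *ts, ItemPointer tids, int ntids,
> >                       bool *ismember)
> > {
> >     int         nmembers = 0;
> >
> >     for (int i = 0; i < ntids; i++)
> >     {
> >         /* naive version: one radix tree lookup per TID */
> >         ismember[i] = TidStoreIsMember(ts, &tids[i]);
> >         if (ismember[i])
> >             nmembers++;
> >     }
> >     return nmembers;
> > }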
> >
> > 0002: Convert IndexBulkDeleteCallback() to a batched operation.
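> >
> > For reference, today's callback in genam.h handles one TID per
> > call; a batched version could look roughly like this (the exact
> > signature in the patch may differ):
> >
> > /* current: one TID per call */
> > typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr,
> >                                          void *state);
> >
> > /* batched sketch: set deletable[i] for each dead TID and return
> >  * the number of TIDs found dead */
> > typedef int (*IndexBulkDeleteCallback) (ItemPointer itemptrs,
> >                                         int nitems, bool *deletable,
> >                                         void *state);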
> >
> > 0003: Use batched TID lookups in btree index bulk-deletion.
> >
> > In patch 0003, we perform a batched TID lookup for each
> > deduplicated index tuple and one for all remaining regular index
> > tuples, which seems to be the most straightforward approach. While
> > further optimizations are possible, such as performing a batched
> > TID lookup for all index tuples on a single page, these could
> > introduce additional overhead from sorting and re-sorting TIDs.
> > Moreover, considering that radix tree lookups are relatively
> > inexpensive, the benefits of sorting TIDs and using
> > TidStoreIsMemberMulti() might be minimal. Nevertheless, these
> > potential optimizations warrant further evaluation to determine
> > their actual impact on performance.
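> >
> > Roughly, inside btvacuumpage() the shape is something like the
> > sketch below (illustrative only; ndead, htids, and nhtids are
> > made-up local variables):
> >
> > if (BTreeTupleIsPosting(itup))
> > {
> >     bool    ismember[MaxTIDsPerBTreePage];
> >
> >     /* posting-list TIDs are already sorted; one batched lookup
> >      * per deduplicated tuple */
> >     ndead = callback(BTreeTupleGetPosting(itup),
> >                      BTreeTupleGetNPosting(itup),
> >                      ismember, callback_state);
> > }
> > else
> > {
> >     /* regular tuple: collect its heap TID, then look all the
> >      * collected TIDs up in one batched call after scanning the
> >      * page */
> >     htids[nhtids++] = itup->t_tid;
> > }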
> >
> > Also, the patches add batched TID lookup support only for btree
> > indexes, but we could potentially improve other index AMs too.
> >
>
> The code looks good and also +1 for the idea. I just have some small
> points:
> - Maybe it would be good to mention somewhere that the
> IndexBulkDeleteCallback() callback returns the number of TIDs found
> in the TidStore?
> - The vac_tid_reaped() docs may need to be updated?
Thank you for looking at the patches. I agree with the above comments.
>
> I also executed meson tests for each patch individually, and the 0002
> patch is not passing on my machine (macOS).
>
> Ok: 39
> Expected Fail: 0
> Fail: 271
> Unexpected Pass: 0
> Skipped: 22
> Timeout: 0
>
> One behaviour that I found by executing the 0002 tests is that it may
> be leaking some shared memory segments. I noticed this because, after
> executing the tests, I tried to re-execute them based on master and
> all tests failed with the "Failed system call was shmget(key=97530599,
> size=56, 03600)" error. I also checked the shared memory segments
> using "ipcs -m" and it returned some segments that are not returned
> when I execute the tests on master (after cleaning up the leaked
> memory segments); this also doesn't occur when executing based on
> 0001 or 0003.
>
> ~/d/p/batch-tids-lookup-ambulkdelete ❯❯❯ ipcs -m
> IPC status from <running system> as of Tue May 13 18:19:14 -03 2025
> T     ID       KEY         MODE         OWNER    GROUP
> Shared Memory:
> m 18087936  0x05f873bf  --rw-------  matheus  staff
> m 15925250  0x05f966fe  --rw-------  matheus  staff
> m 24248325  0x05f9677e  --rw-------  matheus  staff
> ....
>
> Note that the 0003 patch doesn't have this issue, so in the end we
> won't have a problem with this, I think. But it may be good to
> mention that although the patches are separate, there is a dependency
> between them, which may cause issues on the buildfarm?
Thank you for the report. With the 0001 and 0002 patches applied, I got
a SEGV. I've fixed this issue in the attached updated patches. I've
confirmed that the patches pass the CI tests, but I'm not sure whether
they fix the shared memory segment leak problem you reported. The
attached patches also incorporate John's comments[1].

BTW, I found that the constant 'maxblkno' in test_tidstore.sql actually
equals InvalidBlockNumber, not MaxBlockNumber. I don't think it makes
sense for TidStore to use InvalidBlockNumber as a key. The attached
0001 patch fixes it. I think we can fix this separately on HEAD as well
as in the back branches.
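
For reference, src/include/storage/block.h defines:

#define InvalidBlockNumber      ((BlockNumber) 0xFFFFFFFF)
#define MaxBlockNumber          ((BlockNumber) 0xFFFFFFFE)

so a maxblkno of 4294967295 is InvalidBlockNumber, while the largest
valid block number is 4294967294.
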
Regards,
[1] https://www.postgresql.org/message-id/CANWCAZbiJcwSBCczLfbfiPe1mET+V9PjTZv5VvUBwarLvx1Hfg@mail.gmail.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com