Re: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages
Date
Msg-id CAFiTN-ucw0Zu28oq7_r4Gcvkt3geqV6d6eDae+ksnKFTdATYaw@mail.gmail.com
Whole thread Raw
In response to pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
On Fri, Jun 27, 2025 at 9:29 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 27 Jun 2025 at 07:05, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Thu, Jun 26, 2025 at 05:25:42PM +0530, vignesh C wrote:
> > > On Thu, 26 Jun 2025 at 06:22, Michael Paquier <michael@paquier.xyz> wrote:
> > >> So you are suggesting the addition of an extra ReadPageInternal() that
> > >> forces a read of only the read, perform the checks on the header, then
> > >> read the rest.  After reading SizeOfXLogShortPHD worth of data,
> > >> shouldn't the checks on xlp_rem_len be done a bit earlier than what
> > >> you are proposing in this patch?
> > >
> > > Modified
> >
> > It seems to me that this assert can be moved after the second page
> > read:
> > Assert(SizeOfXLogShortPHD <= readOff);
>
> I felt this Assert can be here to ensure we’ve read SizeOfXLogShortPHD
> before checking the page header contents. But for the second page
> read, the following existing Assert should be enough:
> Assert(pageHeaderSize <= readOff);

Thanks make sense.

> >
> > Coming back to the point of Kuroda-san about performance, did you do
> > some checks related to that and did you measure any difference?  I
> > suspect none of that because in most cases we are just going to fetch
> > the next page and we would trigger the fast-exit path of
> > ReadPageInternal() on the second call when fetching the rest.  I still
> > need to get an idea of all that by myself, probably with various
> > lengths of logical message records.
>
> The test execution times are measured in microseconds. In the results
> table below, the first row indicates the message size, and each value
> represents the median of 5 test runs.
> The attached script was used to run these tests. In the attached
> script the MSG_SIZE variable in the script should be changed to 1000(1
> page), 10000 (2 pages approx), 25000 (3 pages approx) , 50000 (6 pages
> approx), 100000 (12 page approx), 1000000 (122 pages approx), 10000000
>  (1220 pages approx) and 100000000 (12207 pages approx) and be run.
> Test execution time can be taken from run_*.dat files that will be
> generated.
>
> Size   | 1000    | 10000  | 25000    | 50000    | 100000  | 1000000
> --------|-----------|-----------|------------|-------------|------------|--------------
> Head | 9297.1 | 9895.4 | 10844.2 | 12946.5 | 16945.1 | 86187.1
> Patch | 9222.7 | 9889   | 10897.1 | 12904.2 | 16858.4 | 87115.5
>
> Size    | 10000000  | 100000000
> ---------|----------------|-----------------
> HEAD | 804965.6   | 331639.7
> Patch  | 804942.6   | 321198.6
>
> The performance results show that the patch does not introduce any
> noticeable overhead across varying message sizes, I felt there was no
> impact because of the additional page header read.

Yeah, that doesn't seem like a regression.

--
Regards,
Dilip Kumar
Google



pgsql-hackers by date:

Previous
From: Daisuke Higuchi
Date:
Subject: Re: Typo in pg_stat_activity definition
Next
From: torikoshia
Date:
Subject: Re: speedup COPY TO for partitioned table.