RE: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages
Date
Msg-id OSCPR01MB1496617B6F24EA85DE5D9D9D1F57AA@OSCPR01MB14966.jpnprd01.prod.outlook.com
In response to RE: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
> > Another reliable approach would be to make the
> > code wait before reading the record in the internal loop of
> > ReadPageInternal() with an injection point when we know that we have a
> > contrecord, but I'm not really excited about this prospect in
> > xlogreader.c which can be also used in the frontend.
> 
> Per my understanding an injection point must be added while flushing a WAL
> record,
> to emulate the incomplete WAL record issue. To confirm, how can it be used in
> ReadPageInternal()?

I spent some time looking at how an injection point could be used to reproduce
the same situation, which generates the OVERWRITE_CONTRECORD in between pages,
but it seems difficult.

XLogFlush()->XLogWrite() is responsible for flushing WAL records, but it does
not write or flush page by page. It writes as many pages as possible and
flushes the result at once. A corner case is when the segment changes, but
that is not the situation in which we observed the failure.
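
To illustrate the batching described above, here is a toy model, not the
actual XLogWrite() code. All names (ToyWriteStats, toy_write_pages) are
hypothetical, and it assumes a batch is cut only at the last requested page or
at a segment boundary, ignoring everything else the real function handles.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not PostgreSQL symbols. */
typedef struct ToyWriteStats
{
    size_t write_calls;   /* batched writes issued */
    size_t flush_calls;   /* fsync-like flushes issued */
} ToyWriteStats;

/*
 * Simulate writing pages [first_page, first_page + npages), where each
 * segment holds pages_per_segment pages.  Contiguous pages are batched into
 * one write; a batch is cut only at a segment boundary or at the last page.
 * A finished segment is flushed immediately; anything else is flushed once
 * at the end.
 */
static ToyWriteStats
toy_write_pages(size_t first_page, size_t npages, size_t pages_per_segment)
{
    ToyWriteStats stats = {0, 0};
    int unflushed = 0;

    assert(pages_per_segment > 0);

    for (size_t i = 0; i < npages; i++)
    {
        size_t page = first_page + i;
        int last_page = (i == npages - 1);
        int finishing_seg = ((page + 1) % pages_per_segment == 0);

        if (last_page || finishing_seg)
        {
            /* One write covers every page accumulated since the last cut. */
            stats.write_calls++;
            unflushed = 1;

            if (finishing_seg)
            {
                /* Segment switch: flush before moving to the next one. */
                stats.flush_calls++;
                unflushed = 0;
            }
        }
    }

    /* Single flush for whatever was written after the last segment cut. */
    if (unflushed)
        stats.flush_calls++;

    return stats;
}
```

With 2048 pages per segment (16MB segments of 8kB pages, assumed here),
toy_write_pages(0, 5, 2048) issues one write and one flush for all five pages,
so there is no point between individual pages where a wait could be injected.
Only a request crossing a segment boundary, such as
toy_write_pages(2046, 4, 2048), is split into two writes.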

So I have no idea how to create a deterministic reproducer; it is OK for me to
use the 046 test for this purpose.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

