Re: Crash by targetted recovery - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Crash by targetted recovery |
Date | |
Msg-id | f7c69cf0-fcac-91b4-6fa2-f2560b6c49b5@oss.nttdata.com Whole thread Raw |
In response to | Re: Crash by targetted recovery (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Crash by targetted recovery
|
List | pgsql-hackers |
On 2020/02/28 12:13, Kyotaro Horiguchi wrote: > At Thu, 27 Feb 2020 20:04:41 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in >> >> >> On 2020/02/27 17:05, Kyotaro Horiguchi wrote: >>> Thank you for the comment. >>> At Thu, 27 Feb 2020 16:23:44 +0900, Fujii Masao >>> <masao.fujii@oss.nttdata.com> wrote in >>>> On 2020/02/27 15:23, Kyotaro Horiguchi wrote: >>>>>> I failed to understand why random access while reading from >>>>>> stream is bad idea. Could you elaborate why? >>>>> It seems to me the word "streaming" suggests that WAL record should be >>>>> read sequentially. Random access, which means reading from arbitrary >>>>> location, breaks a stream. (But the patch doesn't try to stop wal >>>>> sender if randAccess.) >>>>> >>>>>> Isn't it sufficient to set currentSource to 0 when disabling >>>>>> StandbyMode? >>>>> I thought that and it should work, but I hesitated to manipulate on >>>>> currentSource in StartupXLOG. currentSource is basically a private >>>>> state of WaitForWALToBecomeAvailable. ReadRecord modifies it but I >>>>> think it's not good to modify it out of the the logic in >>>>> WaitForWALToBecomeAvailable. >>>> >>>> If so, what about adding the following at the top of >>>> WaitForWALToBecomeAvailable()? >>>> >>>> if (!StandbyMode && currentSource == XLOG_FROM_STREAM) >>>> currentSource = 0; >>> It works virtually the same way. I'm happy to do that if you don't >>> agree to using randAccess. But I'd rather do that in the 'if >>> (!InArchiveRecovery)' section. >> >> The approach using randAccess seems unsafe. Please imagine >> the case where currentSource is changed to XLOG_FROM_ARCHIVE >> because randAccess is true, while walreceiver is still running. >> For example, this case can occur when the record at REDO >> starting point is fetched with randAccess = true after walreceiver >> is invoked to fetch the last checkpoint record. The situation >> "currentSource != XLOG_FROM_STREAM while walreceiver is >> running" seems invalid. No? > > When I mentioned an possibility of changing ReadRecord so that it > modifies randAccess instead of currentSource, I thought that > WaitForWALToBecomeAvailable should shutdown wal receiver as > needed. > > At Thu, 27 Feb 2020 15:23:07 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > me> location, breaks a stream. (But the patch doesn't try to stop wal > me> sender if randAccess.) Sorry, I failed to notice that. > And random access during StandbyMode ususally (always?) lets RecPtr go > back. I'm not sure WaitForWALToBecomeAvailable works correctly if we > don't have a file in pg_wal and the REDO point is far back by more > than a segment from the initial checkpoint record. It works correctly. This is why WaitForWALToBecomeAvailable() uses fetching_ckpt argument. > If we go back to XLOG_FROM_ARCHIVE by random access, it correctly > re-connects to the primary for the past segment. But this can lead to unnecessary restart of walreceiver. Since fetching_ckpt ensures that the WAL file containing the REDO starting record exists in pg_wal, there is no need to reconnect to the primary when reading the REDO starting record. Is there other case where we need to go back to XLOG_FROM_ARCHIVE by random access? Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
pgsql-hackers by date: