Re: BUG #17928: Standby fails to decode WAL on termination of primary - Mailing list pgsql-bugs
From | Thomas Munro |
---|---|
Subject | Re: BUG #17928: Standby fails to decode WAL on termination of primary |
Date | |
Msg-id | CA+hUKGLzv=8g3MFwm+TGFu+7koypgmU7qKHni+nrEnuJ8P6eog@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #17928: Standby fails to decode WAL on termination of primary (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: BUG #17928: Standby fails to decode WAL on termination of primary
|
List | pgsql-bugs |
On Tue, Aug 15, 2023 at 2:05 PM Michael Paquier <michael@paquier.xyz> wrote: > /* > * Try to find space to decode this record, if we can do so without > * calling palloc. If we can't, we'll try again below after we've > * validated that total_len isn't garbage bytes from a recycled WAL page. > */ > decoded = XLogReadRecordAlloc(state, > total_len, > false /* allow_oversized */ ); > > The patch relies on total_len before the header validation is > completed, meaning that this value of total_len could be invalid > because it comes from the partially-read header.. Oh wait, that's > actually OK because an oversized allocation is not allowed yet and the > decode buffer would not be able to store more than what it can? > Perhaps this comment should be updated to tell something like, adding > that total_len can be garbage, but we don't care because we don't > allocate anything that the decode buffer cannot hold. Yeah. > #ifndef FRONTEND > > - /* > - * Note that in much unlucky circumstances, the random data read from a > - * recycled segment can cause this routine to be called with a size > - * causing a hard failure at allocation. For a standby, this would cause > - * the instance to stop suddenly with a hard failure, preventing it to > - * retry fetching WAL from one of its sources which could allow it to move > - * on with replay without a manual restart. If the data comes from a past > - * recycled segment and is still valid, then the allocation may succeed > - * but record checks are going to fail so this would be short-lived. If > - * the allocation fails because of a memory shortage, then this is not a > - * hard failure either per the guarantee given by MCXT_ALLOC_NO_OOM. > - */ > if (!AllocSizeIsValid(newSize)) > return false; > > Wouldn't it be OK to drop entirely this check? I'm fine to keep it as > a safety measure, but it should not be necessary now that it gets > called only once the header is validated? Yeah, now we're in "shouldn't happen" territory, and I'm not sure. > Regarding the tests, I have been using this formula to produce the > number of bytes until the next page boundary: > select setting::int - ((pg_current_wal_lsn() - '0/0') % setting::int) > from pg_settings where name = 'wal_block_size'; > > Using pg_logical_emit_message() non-transactional with an empty set of > strings generates records of 56 bytes, so I was thinking about the > following: > - Generate a bunch of small records with pg_logical_emit_message(), or > records based on the threshold with the previous formula. > - Loop until we reach a page limit, at 24 bytes (?). > - Generate one last record to cut through. > - Stop the node in immediate mode. > - Write some garbage bytes on the last page generated to emulate the > recycled contents and an allocation > - Start the node, which should be able to startup. > With wal_level = minimal, autovacuum = off and a large > checkpoint_timeout, any other records are minimized. That's fancy, > though. > > Do you think that this could work reliably? Yeah, I think that sounds quite promising, and funnily enough I was just now working on some Perl code that appends controlled junk to the WAL in a test like that so we can try to hit all the error paths...
pgsql-bugs by date: