Re: [BUG] non archived WAL removed during production crash recovery - Mailing list pgsql-bugs
From | Michael Paquier |
---|---|
Subject | Re: [BUG] non archived WAL removed during production crash recovery |
Date | |
Msg-id | 20200416081100.GD81957@paquier.xyz Whole thread Raw |
In response to | Re: [BUG] non archived WAL removed during production crash recovery (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>) |
Responses |
Re: [BUG] non archived WAL removed during production crash recovery
Re: [BUG] non archived WAL removed during production crash recovery |
List | pgsql-bugs |
On Tue, Apr 14, 2020 at 06:03:19PM +0200, Jehan-Guillaume de Rorthais wrote: Thanks for the new version. > On Mon, 13 Apr 2020 16:14:14 +0900 Michael Paquier <michael@paquier.xyz> wrote: >> It seems to me that the initial value of SharedRecoveryState should be >> CRASH_RECOVERING all the time no? StartupXLOG() is a code path taken >> even if starting cleanly, and the flag would be reset correctly at the >> end to NOT_IN_RECOVERY. > > Previous version of the patch was setting CRASH_RECOVERING. Fujii-san reported > (the 4 Apr 2020 02:49:50 +0900) that it could be useful to expose a better value > until relevant code is reached in StartupXLOG() so GetRecoveryState() returns > a safer value for futur use. > > As I answered upthread, I'm not sure who would need this information before the > WAL machinery is up though. Note that ARCHIVE_RECOVERING and CRASH_RECOVERING > are compatible with the previous behavior. I am not sure either if we need this information until the startup process is up, but even if we need it I'd rather keep the code consistent with the past practice, which was that SharedRecoveryInProgress got set to true, the equivalent of crash recovery as that's more generic than the archive recovery switch. > Maybe the solution would be to init with CRASH_RECOVERING and add some comment > in GetRecoveryState() to warn the state is "enforced" after the XLOG machinery > is started and is init'ed to RECOVERING in the meantime? > > I initialized it to CRASH_RECOVERING in the new v5 patch and added a comment > to GetRecoveryState(). Not sure that the comment is worth it. Your description of the state looks enough, and the code is clear that we have just an initial shared memory state in this case. >> + /* The file is always deletable if archive_mode is "off". */ >> + if ( ! XLogArchivingActive() ) >> + return true; >> [...] >> + if ( recoveryState == ARCHIVE_RECOVERING && !XLogArchivingAlways() ) >> return true; >> Incorrect indentation. > > Is it the spaces as reported by Horiguchi-san? I removed them in latest patch. Yes. >> It would be good to mention that while in crash recovery, files are >> not considered as deletable. > > Well, in fact, I am still wondering about this. I was hesitant to add a > shortcut like: > > /* no WAL segment cleanup during crash recovery */ > if (recoveryState == RECOVERT_STATE_CRASH) > return false; > > But, what if for example we crashed for lack of disk space during intensive > writes? During crash recovery, any WAL marked as .done could be removed and > allow the system to start again and maybe make even further WAL cleanup by > archiving some more WAL without competing with previous high write ratio. > > When we recover a primary, this behavior seems conform with any value of > archive_mode, even if it has been changed after crash and before starting it. > On a standby, we might create .ready files, but they will be removed during the > first restartpoint if needed. I guess that you mean .ready and not .done here? Removing .done files does not matter as the segments related to them are already gone. Even with that, why should we need to make the backend smarter about the removal of .ready files during crash recovery. It seems to me that we should keep them, and an operator could always come by himself and do some manual cleanup to free some space in the pg_wal partition. > I needed a failure so I can test pg_stat_archiver reports it as well. In my > mind, using "false" would either trigger a failure because false returns 1 or... > a failure because the command is not found. In either way, the result is the > same. > > Using poll_query_until+pg_stat_file, is a good idea, but not enough as > archiver reports a failure some moment after the .ready signal file appears. Using an empty string makes the test more portable, but while I looked at it I have found out that it leads to delays in the archiver except if you force the generation of more segments in the test, causing the logic to get more complicated with the manipulation of the .ready and .done files. And I was then finding myself to add an archive_timeout.. Anyway, this reduced the readability of the test so I am pretty much giving up on this idea. >> The end of the tests actually relies on the >> fact that archive_command is set to "false" when the cold backup is >> taken, before resetting it. I think that it would be cleaner to >> enforce the configuration you want to test before starting each >> standby. It becomes easier to understand the flow of the test for the >> reader. > > done as well. I have put my hands on the code, and attached is a cleaned up version for the backend part. Below are some notes. + * RECOVERT_STATE_ARCHIVE is set for archive recovery or for a standby. Typo here that actually comes from my previous email, and that you blindly copy-pasted, repeated five times in the tree actually. + RecoveryState recoveryState = GetRecoveryState(); + + /* The file is always deletable if archive_mode is "off". */ + if (!XLogArchivingActive()) + return true; There is no point to call GetRecoveryState() is archive_mode = off. + * There's two different reasons to recover WAL: when standby mode is requested + * or after a crash to catchup with consistency. Nit: s/There's/There are/. Anyway, I don't see much point in keeping this comment as the description of each state value should be enough, so I have removed it. I am currently in the middle of reviewing the test and there are a couple of things that can be improved but I lack of time today, so I'll continue tomorrow on it. There is no need to send two separate patches by the way as the code paths touched are different. -- Michael
Attachment
pgsql-bugs by date: