Re: Unarchived WALs deleted after crash - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Unarchived WALs deleted after crash |
Date | |
Msg-id | CAHGQGwGvPosGdUtcffbKEi0jqe0B50ZPdQiRxqL8kQ4dArNwKA@mail.gmail.com |
In response to | Re: Unarchived WALs deleted after crash (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses | Re: Unarchived WALs deleted after crash |
List | pgsql-hackers |
On Sat, Feb 16, 2013 at 2:07 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 15.02.2013 18:10, Fujii Masao wrote:
>>
>> On Fri, Feb 15, 2013 at 11:31 PM, Heikki Linnakangas
>> <hlinnakangas@vmware.com> wrote:
>>>>
>>>> -       /*
>>>> -        * Normally we don't delete old XLOG files during recovery to
>>>> -        * avoid accidentally deleting a file that looks stale due to a
>>>> -        * bug or hardware issue, but in fact contains important data.
>>>> -        * During streaming recovery, however, we will eventually fill the
>>>> -        * disk if we never clean up, so we have to. That's not an issue
>>>> -        * with file-based archive recovery because in that case we
>>>> -        * restore one XLOG file at a time, on-demand, and with a
>>>> -        * different filename that can't be confused with regular XLOG
>>>> -        * files.
>>>> -        */
>>>> -       if (WalRcvInProgress() || XLogArchiveCheckDone(xlde->d_name))
>>>> +       if (RecoveryInProgress() || XLogArchiveCheckDone(xlde->d_name))
>>>>           [ delete the file ]
>>>
>>> With that commit, we started to keep WAL segments restored from the
>>> archive in pg_xlog, so we needed to start deleting old segments during
>>> archive recovery, even when streaming replication was not active. But
>>> the above change was too broad; we started to delete old segments also
>>> during crash recovery.
>>>
>>> The above should check InArchiveRecovery, i.e. only delete old files
>>> when in archive recovery, not when in crash recovery. But there's one
>>> little complication: InArchiveRecovery is currently only valid in the
>>> startup process, so we'll need to also share it in shared memory, so
>>> that the checkpointer process can access it.
>>>
>>> I propose the attached patch to fix it.
>>
>> At least in 9.2, when the archived file is restored into pg_xlog, its
>> xxx.done archive status file is created. So we don't need to check
>> InArchiveRecovery when deleting old WAL files. Checking whether
>> xxx.done exists is enough.
>
> Hmm, what about streamed WAL files? I guess we could go back to the
> pre-9.2 coding, and check WalRcvInProgress(). But I didn't actually like
> that too much, it seems rather random that old streamed files are
> recycled when wal receiver is running at the time of restartpoint, and
> otherwise not. Because whether wal receiver is running at the time the
> restartpoint happens has little to do with which files were created by
> streaming replication. With the right pattern of streaming files from
> the master, but always being temporarily disconnected when the
> restartpoint runs, you could still accumulate WAL files infinitely.

Walreceiver always creates a .done file when it closes an already-flushed
WAL file and switches to the next one. So we also don't need to check
WalRcvInProgress().

>> Unfortunately in HEAD, xxx.done file is not created when restoring
>> archived file because of absence of the patch. We need to implement
>> that first.
>
> Ah yeah, that thing again..
> (http://www.postgresql.org/message-id/50DF5BA7.6070200@vmware.com) I'm
> going to forward-port that patch now, before it's forgotten again. It's
> not clear to me what the holdup was on this, but whatever the bigger
> patch we've been waiting for is, it can just as well be done on top of
> the forward-port.

I posted the patch to that thread.

Regards,

-- 
Fujii Masao
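For readers following the thread, the rule argued for above (an old segment may be removed only if its xxx.done archive status file exists, with no check of RecoveryInProgress(), InArchiveRecovery, or WalRcvInProgress()) can be sketched in standalone C. This is only an illustration under stated assumptions, not the PostgreSQL source: the function names `archive_status_done` and `old_segment_is_removable` and the `status_dir` parameter are hypothetical, and the real XLogArchiveCheckDone() does more than test for a single file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not a PostgreSQL function): returns true if
 * "<status_dir>/<segname>.done" exists on disk.
 */
static bool
archive_status_done(const char *status_dir, const char *segname)
{
    char        path[1024];
    struct stat st;
    int         n;

    assert(status_dir != NULL && segname != NULL);

    n = snprintf(path, sizeof(path), "%s/%s.done", status_dir, segname);
    if (n < 0 || (size_t) n >= sizeof(path))
        return false;           /* path too long: be conservative, keep the file */

    return stat(path, &st) == 0;
}

/*
 * Sketch of the proposed rule: the decision depends only on the .done
 * marker, not on whether recovery or the WAL receiver is running at the
 * moment the restartpoint happens.
 */
static bool
old_segment_is_removable(const char *status_dir, const char *segname)
{
    return archive_status_done(status_dir, segname);
}
```

Under this rule a segment without a .done marker is never removed, whether the server is in crash recovery, in archive recovery, or streaming. That only works if every path that brings a segment into pg_xlog (archive restore and walreceiver) creates the marker, which is why the forward-port mentioned above has to come first.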