Re: .ready and .done files considered harmful - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: .ready and .done files considered harmful |
Date | |
Msg-id | CA+Tgmoaq5eBFoj--Xn92Mz5H+YomkE5xyv-_ETB1L-P7BjRCDA@mail.gmail.com Whole thread Raw |
In response to | Re: .ready and .done files considered harmful (Dipesh Pandit <dipesh.pandit@gmail.com>) |
Responses |
Re: .ready and .done files considered harmful
|
List | pgsql-hackers |
On Tue, Jul 27, 2021 at 3:43 AM Dipesh Pandit <dipesh.pandit@gmail.com> wrote: > and updated a new patch. Please find the attached patch v4. Some review: /* + * If archiver is active, send notification that timeline has switched. + */ + if (XLogArchivingActive() && ArchiveRecoveryRequested && + IsUnderPostmaster) + PgArchNotifyTLISwitch(); There are a few other places in xlog.c that are conditional on XLogArchivingActive(), but none of them test ArchiveRecoveryRequested or IsUnderPostmaster. It appears to me that PgArchStartupAllowed() controls whether the archiver runs, and that's not contingent on ArchiveRecoveryRequested and indeed couldn't be, since it's running in the postmaster where that variable wouldn't be initialized. So why do we care about ArchiveRecoveryRequested here? This is not entirely a rhetorical question; maybe there's some reason we should care. If so, the comment ought to mention it. If not, the test should go away. IsUnderPostmaster does make a difference, but I think that test could be placed inside PgArchNotifyTLISwitch() rather than putting it here in StartupXLOG(). In fact, I think the test could be removed entirely, since if PgArchNotifyTLISwitch() is called in single-user mode, it will presumably just discover that arch_pgprocno == INVALID_PGPROCNO, so it will simply do nothing even without the special-case code. + pqsignal(SIGINT, pgarch_timeline_switch); I don't think it's great that we're using up SIGINT for this purpose. There aren't that many signals available at the O/S level that we can use for our purposes, and we generally try to multiplex them at the application layer, e.g. by setting a latch or a flag in shared memory, rather than using a separate signal. Can we do something of that sort here? Or maybe we don't even need a signal. ThisTimeLineID is already visible in shared memory, so why not just have the archiver just check and see whether it's changed, say via a new accessor function GetCurrentTimeLineID()? I guess there could be a concern about the expensive of that, because we'd probably be taking a spinlock or an lwlock for every cycle, but I don't think it's probably that bad, because I doubt we can archive much more than a double-digit number of files per second even with a very fast archive_command, and contention on a lock generally requires a five digit number of acquisitions per second. It would be worth testing to see if we can see a problem here, but I'm fairly hopeful that it's not an issue. If we do feel that it's important to avoid repeatedly taking a lock, let's see if we can find a way to do it without dedicating a signal to this purpose. + * + * "nextLogSegNo" identifies the next log file to be archived in a log + * sequence and the flag "dirScan" specifies a full directory scan to find + * the next log file. */ - while (pgarch_readyXlog(xlog)) + while (pgarch_readyXlog(xlog, &dirScan, &nextLogSegNo)) I do not like this very much. dirScan and nextLogSegNo aren't clearly owned either by pgarch_ArchiverCopyLoop() or by pgarch_readyXlog(), since both functions modify both variables, in each case conditionally, while also relying on the way that the other function manipulates them. Essentially these are global variables in disguise. There's a third, related variable too, which is handled differently: + static TimeLineID curFileTLI = 0; This is really the same kind of thing as the other two, but because pgarch_readyXlog() happens not to need this one, you just made it static inside pgarch_readyXlog() instead of passing it back and forth. The problem with all this is that you can't understand either function in isolation. Unless you read them both together and look at all of the ways these three variables are manipulated, you can't really understand the logic. And there's really no reason why that needs to be true. The job of cleaning timeline_switch and setting dirScan could be done entirely within pgarch_readyXlog(), and so could the job of incrementing nextLogSegNo, because we're not going to again call pgarch_readyXlog() unless archiving succeeded. Also note that the TLI which is stored in curFileTLI corresponds to the segment number stored in nextLogSegNo, yet one of them has "cur" for "current" in the name and the other has "next". It would be easier to read the code if the names were chosen more consistently. My tentative idea as to how to clean this up is: declare a new struct with a name like readyXlogState and members lastTLI and lastSegNo. Have pgarch_ArchiverCopyLoop() declare a variable of this type, zero it, pass it as a parameter to pgarch_readyXlog(), and otherwise leave it alone. Then let pgarch_readyXlog() do all of the manipulation of the values stored therein. + /* + * Fall-back to directory scan + * + * open xlog status directory and read through list of xlogs that have the + * .ready suffix, looking for earliest file. It is possible to optimise + * this code, though only a single file is expected on the vast majority + * of calls, so.... + */ You've moved this comment from its original location, but the trouble is that the comment is 100% false. In fact, the whole reason why you wrote this patch is *because* this comment is 100% false. In fact it is not difficult to create cases where each scan finds many files, and the purpose of the patch is precisely to optimize the code that the person who wrote this thought didn't need optimizing. Now it may take some work to figure out what we want to say here exactly, but preserving the comment as it's written here is certainly misleading. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: