Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers
From | amul sul |
---|---|
Subject | Re: [Patch] ALTER SYSTEM READ ONLY |
Date | |
Msg-id | CAAJ_b95UPY6K4W0_=hRN1qoqG8EDYYSQnoq1tyhJZ_HM+xETLA@mail.gmail.com |
In response to | Re: [Patch] ALTER SYSTEM READ ONLY (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: [Patch] ALTER SYSTEM READ ONLY
|
List | pgsql-hackers |
On Thu, Jun 18, 2020 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 17, 2020 at 8:12 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Wed, Jun 17, 2020 at 9:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > Do we prohibit the checkpointer to write dirty pages and write a
> > > checkpoint record as well? If so, will the checkpointer process
> > > writes the current dirty pages and writes a checkpoint record or we
> > > skip that as well?
> >
> > I think the definition of this feature should be that you can't write
> > WAL. So, it's OK to write dirty pages in general, for example to allow
> > for buffer replacement so we can continue to run read-only queries.
> >
>
> For buffer replacement, many-a-times we have to also perform
> XLogFlush, what do we do for that? We can't proceed without doing
> that and erroring out from there means stopping read-only query from
> the user perspective.
>

Read-only does not restrict XLogFlush().

> > But there's no reason for the checkpointer to do it: it shouldn't try
> > to checkpoint, and therefore it shouldn't write dirty pages either.
> >
>
> What is the harm in doing the checkpoint before we put the system into
> READ ONLY state? The advantage is that we can at least reduce the
> recovery time if we allow writing checkpoint record.
>

The checkpoint could take a long time, whereas the intention is to switch
to the read-only state quickly.

> > >
> > > What if vacuum is on an unlogged relation? Do we allow writes via
> > > vacuum to unlogged relation?
> >
> > Interesting question. I was thinking that we should probably teach the
> > autovacuum launcher to stop launching workers while the system is in a
> > READ ONLY state, but what about existing workers? Anything that
> > generates invalidation messages, acquires an XID, or writes WAL has to
> > be blocked in a read-only state; but I'm not sure to what extent the
> > first two of those things would be a problem for vacuuming an unlogged
> > table. I think you couldn't truncate it, at least, because that
> > acquires an XID.
> >
>
> If the truncate operation errors out, then won't the system will again
> trigger a new autovacuum worker for the same relation as we update
> stats at the end? Also, in general for regular tables, if there is an
> error while it tries to WAL, it could again trigger the autovacuum
> worker for the same relation. If this is true then unnecessarily it
> will generate a lot of dirty pages and don't think it will be good for
> the system to behave that way?
>

No new autovacuum worker will be forked in the read-only state, and
existing workers will get an error if they try to write WAL after barrier
absorption.

> > > > Another part of the patch that quite uneasy and need a discussion is that when the
> > > > shutdown in the read-only state we do skip shutdown checkpoint and at a restart, first
> > > > startup recovery will be performed and latter the read-only state will be restored to
> > > > prohibit further WAL write irrespective of recovery checkpoint succeed or not. The
> > > > concern is here if this startup recovery checkpoint wasn't ok, then it will never happen
> > > > even if it's later put back into read-write mode.
> > >
> > > I am not able to understand this problem. What do you mean by
> > > "recovery checkpoint succeed or not", do you add a try..catch and skip
> > > any error while performing recovery checkpoint?
> >
> > What I think should happen is that the end-of-recovery checkpoint
> > should be skipped, and then if the system is put back into read-write
> > mode later we should do it then.
> >
>
> But then if we have to perform recovery again, it will start from the
> previous checkpoint. I think we have to live with it.
>

Let me explain the case: if we skip the end-of-recovery checkpoint while
starting the system in read-only mode, later change the state to
read-write, and then do a few write operations and online checkpoints,
will that be fine? I have yet to explore those things.

Regards,
Amul
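
For readers following the thread, the rules discussed above can be restated as a small toy model. This is only an illustrative sketch in Python, not code from the patch: every class, method, and field name below is hypothetical, and the behavior reflects what is proposed in this discussion (new WAL is refused, flushing existing WAL and evicting dirty pages stay allowed, the checkpointer and the autovacuum launcher stand down), which may differ from what is finally committed.

```python
class WalProhibitedError(RuntimeError):
    """Raised when something tries to generate WAL in the read-only state."""


class ToySystem:
    """Toy model of the proposed read-only state (hypothetical names only)."""

    def __init__(self):
        self.read_only = False
        self.wal_inserted = 0  # position of the last WAL record generated
        self.wal_flushed = 0   # position up to which WAL is durable
        self.workers = 0       # autovacuum workers launched so far

    def insert_wal(self):
        # Generating new WAL is the one thing the read-only state forbids.
        if self.read_only:
            raise WalProhibitedError("cannot write WAL while read-only")
        self.wal_inserted += 1
        return self.wal_inserted

    def flush_wal(self, upto):
        # Flushing already-generated WAL creates no new records,
        # so it is not restricted by the read-only state.
        self.wal_flushed = max(self.wal_flushed, min(upto, self.wal_inserted))

    def evict_dirty_page(self, page_lsn):
        # Buffer replacement: WAL up to the page's LSN must be durable
        # before the page is written out. No new WAL is needed for that.
        self.flush_wal(page_lsn)
        return self.wal_flushed >= page_lsn

    def checkpoint(self):
        # As proposed here, the checkpointer does nothing while read-only.
        if self.read_only:
            return False
        self.insert_wal()  # stands in for the checkpoint record
        self.flush_wal(self.wal_inserted)
        return True

    def launch_autovacuum_worker(self):
        # The launcher stops forking new workers while read-only.
        if self.read_only:
            return False
        self.workers += 1
        return True
```

In this model a read-only query that forces a dirty buffer out still succeeds, because `evict_dirty_page` only flushes WAL that already exists, while any caller of `insert_wal` gets an error once `read_only` is set.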