Re: pg_upgrade and logical replication - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: pg_upgrade and logical replication |
Date | |
Msg-id | CAA4eK1+pWXqDSO+QpGtLFqcCP5+qCXkvLN-YviRqdXV+d7Fdow@mail.gmail.com |
In response to | Re: pg_upgrade and logical replication (Julien Rouhaud <rjuju123@gmail.com>) |
Responses | Re: pg_upgrade and logical replication |
List | pgsql-hackers |
On Wed, Mar 8, 2023 at 12:26 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Sat, 4 Mar 2023, 14:13 Amit Kapila, <amit.kapila16@gmail.com> wrote:
>>
>> > For the publisher nodes, that may be something nice to support (I'm assuming it
>> > could be useful for more complex replication setups) but I'm not interested in
>> > that at the moment as my goal is to reduce downtime for major upgrade of
>> > physical replica, thus *not* doing pg_upgrade of the primary node, whether
>> > physical or logical. I don't see why it couldn't be done later on, if/when
>> > someone has a use case for it.
>> >
>>
>> I thought there is value if we provide a way to upgrade both publisher
>> and subscriber.
>
> it's still unclear to me whether it's actually achievable on the publisher
> side, as running pg_upgrade leaves a "hole" in the WAL stream and resets the
> timeline, among other possible difficulties. Now I don't know much about
> logical replication internals so I'm clearly not the best person to answer
> those questions.
>

I think that is the part we need to analyze and see what are the
challenges there. One part of the challenge is that we need to
preserve slots that have some WAL locations like restart_lsn,
confirmed_flush and we need WAL from those locations for decoding. I
haven't analyzed this but isn't it possible that on clean shutdown we
confirm that all the WAL has been sent and confirmed by the logical
subscriber, in which case I think truncating WAL in pg_upgrade
shouldn't be a problem?

>> Now, you came up with a use case linking it to a
>> physical replica where allowing an upgrade of only subscriber nodes is
>> useful. It is possible that users find your steps easy to perform and
>> didn't find them error-prone but it may be better to get some
>> authentication of the same. I haven't yet analyzed all the steps in
>> detail but let's see what others think.
>
> It's been quite some time since and no one seemed to chime in or object.
> IMO doing a major version upgrade with limited downtime (so something faster
> than stopping postgres and running pg_upgrade) has always been difficult and
> never prevented anyone from doing it, so I don't think that it should be a
> blocker for what I'm suggesting here, especially since the current behavior
> of pg_upgrade on a subscriber node is IMHO broken.
>
> Is there something that can be done for pg16? I was thinking that having a
> fix for the normal and easy case could be acceptable: only allowing
> pg_upgrade to optionally, and not by default, preserve the subscription
> relations IFF all subscriptions only have tables in ready state. Different
> states should be transient, and it's easy to check as a user beforehand and
> also easy to check during pg_upgrade, so it seems like an acceptable
> limitation (which I personally see as a good sanity check, but YMMV). It
> could be lifted in later releases if wanted anyway.
>
> It's unclear to me whether this limited scope would also require to preserve
> the replication origins, but having looked at the code I don't think it
> would be much of a problem as the local LSN doesn't have to be preserved.
>

I think we need to preserve replication origins as they help us to
determine the WAL location from where to start the streaming after the
upgrade. If we don't preserve those then from which location will the
subscriber start streaming? We don't want to replicate the WAL which
has already been sent.

-- 
With Regards,
Amit Kapila.
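
As an illustration of the precondition discussed in this message (preserving subscription relations only if every table of every subscription is in ready state), here is a minimal sketch of the check a user could run beforehand. It is not taken from any patch in this thread. It assumes the rows come from the pg_subscription_rel catalog, where srsubstate is 'r' for ready; the function names, the row shape, and the ALL_STATES_SQL constant are hypothetical and exist only for this example.

```python
from typing import Iterable, List, Tuple

# srsubstate code for "ready" in the pg_subscription_rel catalog.
READY = "r"

# Hypothetical row shape: (subscription name, table name, srsubstate).
Row = Tuple[str, str, str]

# One way to fetch such rows on the subscriber. pg_subscription_rel is a
# per-database catalog, so this would be run in each database that has
# subscriptions. The query is shown for context only; the functions below
# work on rows that have already been fetched.
ALL_STATES_SQL = """
SELECT s.subname, sr.srrelid::regclass::text, sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid
"""


def tables_not_ready(rows: Iterable[Row]) -> List[Row]:
    """Return the rows whose table is in any state other than ready."""
    return [row for row in rows if row[2] != READY]


def can_preserve_subscription_relations(rows: Iterable[Row]) -> bool:
    """True only if no subscription table is in a transient state."""
    return not tables_not_ready(rows)
```

With rows such as ("sub1", "public.a", "r") and ("sub1", "public.b", "d"), tables_not_ready returns only the second row, and the check fails until that table finishes synchronizing.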