Thread: PostgreSQL switchover process
Hi All,
I'm trying to do the Switchover process in postgres.
1) I have my primary cluster running on port 5432 on server A
2) secondary cluster on the port 5432 on server B
I did setup streaming replication b/w primary and secondary using replication slots.
I did the following steps to perform switchover
1) ran a checkpoint on primary.
2) verified secondary status and it is up to date with primary.
3) stopped primary using pg_ctl -mf i.e force.
4) promoted primary using trigger file and verified the status of the new primary; it is not in recovery mode and running on new timeline ID 2.
5) I did create the recovery.conf file on old primary i.e new standby with
primaray_coninfo: new primary(B)
recovery_target_timeline='latest'
primary_slot_name: "slot I created in new_standby"
6) I started the new standby. It is up and running but it is not in recovery mode.
But the wal sender process on primary and receiver process on secondary are not running. i.e it looks like my replication is broken.
what steps am I missing here without rebuilding a new standby i.e old primary I want the replication happen b/w my new primary and secondary.
Could you suggest any good documentation?
thanks in advance.
Thanks,
Veeru.
On Mon, Aug 24, 2020 at 03:56:43PM -0400, rams nalabolu wrote: > I did the following steps to perform switchover > 1) ran a checkpoint on primary. > 2) verified secondary status and it is up to date with primary. > 3) stopped primary using pg_ctl -mf i.e force. This is not a force mode, but the fast mode, where all existing connections are forcibly stopped, and that the shutdown is clean, with a shutdown checkpoint generated before finishing the shutdown sequence. During this shutdown, the primary makes sure that all standbys have flushed WAL up to the point of the shutdown checkpoint. Note that this makes rather unnecessary the checkpoint you ran on the primary in step 1. > 4) promoted primary using trigger file and verified the status of the new > primary; it is not in recovery mode and running on new timeline ID 2. I think that you mean promotion of the standby here. > 5) I did create the recovery.conf file on old primary i.e new standby with > primaray_coninfo: new primary(B) > recovery_target_timeline='latest' > primary_slot_name: "slot I created in new_standby" Typo here. You mean primary_conninfo. > 6) I started the new standby. It is up and running but it is not in > recovery mode. This switchover flow is a good base, so it should be possible to reuse your previous primary as a standby. > But the wal sender process on primary and receiver process on secondary are > not running. i.e it looks like my replication is broken. > what steps am I missing here without rebuilding a new standby i.e old > primary I want the replication happen b/w my new primary and > secondary. It may be many things without more information. Could you check pg_stat_replication on the primary and pg_stat_wal_receiver on the standby? Most likely something is wrong with primary_conninfo, but the logs of the standby should have enough information to let you know what happened. Another thing you are not telling is the version of PostgreSQL you are using here. In 12 and newer versions, support for recovery.conf has been removed, requiring roughly the creation of standby.signal with all recovery parameters set in postgresql.conf if you want to set up a standby. -- Michael
Attachment
On 2020/08/25 11:22, Michael Paquier wrote: > On Mon, Aug 24, 2020 at 03:56:43PM -0400, rams nalabolu wrote: >> I did the following steps to perform switchover >> 1) ran a checkpoint on primary. >> 2) verified secondary status and it is up to date with primary. >> 3) stopped primary using pg_ctl -mf i.e force. > > This is not a force mode, but the fast mode, where all existing > connections are forcibly stopped, and that the shutdown is clean, with > a shutdown checkpoint generated before finishing the shutdown > sequence. During this shutdown, the primary makes sure that all > standbys have flushed WAL up to the point of the shutdown checkpoint. > Note that this makes rather unnecessary the checkpoint you ran on the > primary in step 1. > >> 4) promoted primary using trigger file and verified the status of the new >> primary; it is not in recovery mode and running on new timeline ID 2. > > I think that you mean promotion of the standby here. > >> 5) I did create the recovery.conf file on old primary i.e new standby with >> primaray_coninfo: new primary(B) >> recovery_target_timeline='latest' >> primary_slot_name: "slot I created in new_standby" Did you enable standby_mode? > > Typo here. You mean primary_conninfo. > >> 6) I started the new standby. It is up and running but it is not in >> recovery mode. > > This switchover flow is a good base, so it should be possible to reuse > your previous primary as a standby. > >> But the wal sender process on primary and receiver process on secondary are >> not running. i.e it looks like my replication is broken. >> what steps am I missing here without rebuilding a new standby i.e old >> primary I want the replication happen b/w my new primary and >> secondary. > > It may be many things without more information. Could you check > pg_stat_replication on the primary and pg_stat_wal_receiver on the > standby? Most likely something is wrong with primary_conninfo, but > the logs of the standby should have enough information to let you know > what happened. Another thing you are not telling is the version of > PostgreSQL you are using here. In 12 and newer versions, > support for recovery.conf has been removed, requiring roughly the > creation of standby.signal with all recovery parameters set in > postgresql.conf if you want to set up a standby. > -- > Michael > -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Yes, I did.
On Mon, Aug 24, 2020 at 10:43 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/08/25 11:22, Michael Paquier wrote:
> On Mon, Aug 24, 2020 at 03:56:43PM -0400, rams nalabolu wrote:
>> I did the following steps to perform switchover
>> 1) ran a checkpoint on primary.
>> 2) verified secondary status and it is up to date with primary.
>> 3) stopped primary using pg_ctl -mf i.e force.
>
> This is not a force mode, but the fast mode, where all existing
> connections are forcibly stopped, and that the shutdown is clean, with
> a shutdown checkpoint generated before finishing the shutdown
> sequence. During this shutdown, the primary makes sure that all
> standbys have flushed WAL up to the point of the shutdown checkpoint.
> Note that this makes rather unnecessary the checkpoint you ran on the
> primary in step 1.
>
>> 4) promoted primary using trigger file and verified the status of the new
>> primary; it is not in recovery mode and running on new timeline ID 2.
>
> I think that you mean promotion of the standby here.
>
>> 5) I did create the recovery.conf file on old primary i.e new standby with
>> primaray_coninfo: new primary(B)
>> recovery_target_timeline='latest'
>> primary_slot_name: "slot I created in new_standby"
Did you enable standby_mode?
>
> Typo here. You mean primary_conninfo.
>
>> 6) I started the new standby. It is up and running but it is not in
>> recovery mode.
>
> This switchover flow is a good base, so it should be possible to reuse
> your previous primary as a standby.
>
>> But the wal sender process on primary and receiver process on secondary are
>> not running. i.e it looks like my replication is broken.
>> what steps am I missing here without rebuilding a new standby i.e old
>> primary I want the replication happen b/w my new primary and
>> secondary.
>
> It may be many things without more information. Could you check
> pg_stat_replication on the primary and pg_stat_wal_receiver on the
> standby? Most likely something is wrong with primary_conninfo, but
> the logs of the standby should have enough information to let you know
> what happened. Another thing you are not telling is the version of
> PostgreSQL you are using here. In 12 and newer versions,
> support for recovery.conf has been removed, requiring roughly the
> creation of standby.signal with all recovery parameters set in
> postgresql.conf if you want to set up a standby.
> --
> Michael
>
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION