Home > mailing lists

Thread: PostgreSQL switchover process

PostgreSQL switchover process

From

rams nalabolu

Date:

24 August 2020, 19:56:43

Hi All,

I'm trying to do the Switchover process in postgres.

1) I have my primary cluster running on port 5432 on server A

2) secondary cluster on the port 5432 on server B

I did setup streaming replication b/w primary and secondary using replication slots.

I did the following steps to perform switchover

1) ran a checkpoint on primary.

2) verified secondary status and it is up to date with primary.

3) stopped primary using pg_ctl -mf i.e force.

4) promoted primary using trigger file and verified the status of the new primary; it is not in recovery mode and running on new timeline ID 2.

5) I did create the recovery.conf file on old primary i.e new standby with

primaray_coninfo: new primary(B)

recovery_target_timeline='latest'

primary_slot_name: "slot I created in new_standby"

6) I started the new standby. It is up and running but it is not in recovery mode.

But the wal sender process on primary and receiver process on secondary are not running. i.e it looks like my replication is broken.

what steps am I missing here without rebuilding a new standby i.e old primary I want the replication happen b/w my new primary and secondary.

Could you suggest any good documentation?

thanks in advance.

Thanks,

Veeru.

Re: PostgreSQL switchover process

From

Michael Paquier

Date:

25 August 2020, 02:22:00

On Mon, Aug 24, 2020 at 03:56:43PM -0400, rams nalabolu wrote:
> I did the following steps to perform switchover
> 1) ran a checkpoint on primary.
> 2) verified secondary status and it is up to date with primary.
> 3) stopped primary using pg_ctl -mf i.e force.

This is not a force mode, but the fast mode, where all existing
connections are forcibly stopped, and that the shutdown is clean, with
a shutdown checkpoint generated before finishing the shutdown
sequence.  During this shutdown, the primary makes sure that all
standbys have flushed WAL up to the point of the shutdown checkpoint.
Note that this makes rather unnecessary the checkpoint you ran on the
primary in step 1.

> 4) promoted primary using trigger file and verified the status of the new
> primary; it is not in recovery mode and running on new timeline ID 2.

I think that you mean promotion of the standby here.

> 5) I did create the recovery.conf file on old primary i.e new standby with
>       primaray_coninfo: new primary(B)
>       recovery_target_timeline='latest'
>       primary_slot_name: "slot I created in new_standby"

Typo here.  You mean primary_conninfo.

> 6) I started the new standby. It is up and running but it is not in
> recovery mode.

This switchover flow is a good base, so it should be possible to reuse
your previous primary as a standby.

> But the wal sender process on primary and receiver process on secondary are
> not running. i.e it looks like my replication is broken.
> what steps am I missing here without rebuilding a new standby i.e old
> primary I want the replication happen b/w my new primary and
> secondary.

It may be many things without more information.  Could you check
pg_stat_replication on the primary and pg_stat_wal_receiver on the
standby?  Most likely something is wrong with primary_conninfo, but
the logs of the standby should have enough information to let you know
what happened.  Another thing you are not telling is the version of
PostgreSQL you are using here.  In 12 and newer versions,
support for recovery.conf has been removed, requiring roughly the
creation of standby.signal with all recovery parameters set in
postgresql.conf if you want to set up a standby.
--
Michael

Attachment

signature.asc

Re: PostgreSQL switchover process

From

Fujii Masao

Date:

25 August 2020, 02:43:26


On 2020/08/25 11:22, Michael Paquier wrote:
> On Mon, Aug 24, 2020 at 03:56:43PM -0400, rams nalabolu wrote:
>> I did the following steps to perform switchover
>> 1) ran a checkpoint on primary.
>> 2) verified secondary status and it is up to date with primary.
>> 3) stopped primary using pg_ctl -mf i.e force.
> 
> This is not a force mode, but the fast mode, where all existing
> connections are forcibly stopped, and that the shutdown is clean, with
> a shutdown checkpoint generated before finishing the shutdown
> sequence.  During this shutdown, the primary makes sure that all
> standbys have flushed WAL up to the point of the shutdown checkpoint.
> Note that this makes rather unnecessary the checkpoint you ran on the
> primary in step 1.
> 
>> 4) promoted primary using trigger file and verified the status of the new
>> primary; it is not in recovery mode and running on new timeline ID 2.
> 
> I think that you mean promotion of the standby here.
> 
>> 5) I did create the recovery.conf file on old primary i.e new standby with
>>        primaray_coninfo: new primary(B)
>>        recovery_target_timeline='latest'
>>        primary_slot_name: "slot I created in new_standby"

Did you enable standby_mode?


> 
> Typo here.  You mean primary_conninfo.
> 
>> 6) I started the new standby. It is up and running but it is not in
>> recovery mode.
> 
> This switchover flow is a good base, so it should be possible to reuse
> your previous primary as a standby.
> 
>> But the wal sender process on primary and receiver process on secondary are
>> not running. i.e it looks like my replication is broken.
>> what steps am I missing here without rebuilding a new standby i.e old
>> primary I want the replication happen b/w my new primary and
>> secondary.
> 
> It may be many things without more information.  Could you check
> pg_stat_replication on the primary and pg_stat_wal_receiver on the
> standby?  Most likely something is wrong with primary_conninfo, but
> the logs of the standby should have enough information to let you know
> what happened.  Another thing you are not telling is the version of
> PostgreSQL you are using here.  In 12 and newer versions,
> support for recovery.conf has been removed, requiring roughly the
> creation of standby.signal with all recovery parameters set in
> postgresql.conf if you want to set up a standby.
> --
> Michael
> 

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: PostgreSQL switchover process

From

rams nalabolu

Date:

25 August 2020, 04:01:21

Yes, I did.

On Mon, Aug 24, 2020 at 10:43 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/08/25 11:22, Michael Paquier wrote:
> On Mon, Aug 24, 2020 at 03:56:43PM -0400, rams nalabolu wrote:
>> I did the following steps to perform switchover
>> 1) ran a checkpoint on primary.
>> 2) verified secondary status and it is up to date with primary.
>> 3) stopped primary using pg_ctl -mf i.e force.
>
> This is not a force mode, but the fast mode, where all existing
> connections are forcibly stopped, and that the shutdown is clean, with
> a shutdown checkpoint generated before finishing the shutdown
> sequence. During this shutdown, the primary makes sure that all
> standbys have flushed WAL up to the point of the shutdown checkpoint.
> Note that this makes rather unnecessary the checkpoint you ran on the
> primary in step 1.
>
>> 4) promoted primary using trigger file and verified the status of the new
>> primary; it is not in recovery mode and running on new timeline ID 2.
>
> I think that you mean promotion of the standby here.
>
>> 5) I did create the recovery.conf file on old primary i.e new standby with
>> primaray_coninfo: new primary(B)
>> recovery_target_timeline='latest'
>> primary_slot_name: "slot I created in new_standby"

Did you enable standby_mode?

>
> Typo here. You mean primary_conninfo.
>
>> 6) I started the new standby. It is up and running but it is not in
>> recovery mode.
>
> This switchover flow is a good base, so it should be possible to reuse
> your previous primary as a standby.
>
>> But the wal sender process on primary and receiver process on secondary are
>> not running. i.e it looks like my replication is broken.
>> what steps am I missing here without rebuilding a new standby i.e old
>> primary I want the replication happen b/w my new primary and
>> secondary.
>
> It may be many things without more information. Could you check
> pg_stat_replication on the primary and pg_stat_wal_receiver on the
> standby? Most likely something is wrong with primary_conninfo, but
> the logs of the standby should have enough information to let you know
> what happened. Another thing you are not telling is the version of
> PostgreSQL you are using here. In 12 and newer versions,
> support for recovery.conf has been removed, requiring roughly the
> creation of standby.signal with all recovery parameters set in
> postgresql.conf if you want to set up a standby.
> --
> Michael
>

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION