Thread: Streaming replication and an archivelog
Hello All,
James Sewell,
PostgreSQL Team Lead / Solutions Architect
______________________________________

I am running 9.4 on CentOS.
I have three servers, one master and two slaves. The slaves have the following recovery.conf:
standby_mode = 'on'
primary_conninfo = 'user=postgres host=mastervip port=5432'
restore_command = 'scp -o BatchMode=yes postgres@backuphost:/archived_wals/%f %p'
recovery_target_timeline = 'latest'
Is there any way to combine the following:
- a master switch (i.e. if node1 dies and node2 is promoted, then node3 follows node2)
- using a WAL archive, so that if node2 goes down for two days it can get WALs from the archive if they are no longer on the master
At the moment the master switch works fine, but if I were to have a WAL archive with multiple timelines in it, then I would end up on the newest timeline.
I suppose what I want is the following:
If I am a streaming replica, only follow streamed timeline switches, not archive timeline switches.
Obviously, if I am not a streaming replica, I need to follow archive timeline switches so I don't break PIT recovery.
Possible?

The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.
Hey,
This is an example of what I see when starting up a replica and attaching it to a master.
In this case the master is in timeline 3, but the archive log goes up to timeline 6.
2015-05-08 16:23:07 AEST @ ( 0 00000)LOG: database system was interrupted; last known up at 2015-05-08 16:21:14 AEST
2015-05-08 16:23:07 AEST @ ( 0 00000)LOG: restored log file "00000004.history" from archive
2015-05-08 16:23:08 AEST @ ( 0 00000)LOG: restored log file "00000005.history" from archive
2015-05-08 16:23:08 AEST @ ( 0 00000)LOG: restored log file "00000006.history" from archive
scp: /backup/production/archived_wals/00000007.history: No such file or directory
2015-05-08 16:23:08 AEST @ ( 0 00000)LOG: entering standby mode
2015-05-08 16:23:09 AEST @ ( 0 00000)LOG: restored log file "00000006.history" from archive
scp: /backup/production/archived_wals/000000060000000B000000C2: No such file or directory
scp: /backup/production/archived_wals/000000050000000B000000C2: No such file or directory
scp: /backup/production/archived_wals/000000040000000B000000C2: No such file or directory
2015-05-08 16:23:11 AEST @ ( 0 00000)LOG: restored log file "000000030000000B000000C2" from archive
2015-05-08 16:23:11 AEST @ ( 0 XX000)FATAL: requested timeline 6 is not a child of this server's history
2015-05-08 16:23:11 AEST @ ( 0 XX000)DETAIL: Latest checkpoint is at B/C2000060 on timeline 3, but in the history of the requested timeline, the server forked off from that timeline at B/BE000060.
2015-05-08 16:23:11 AEST @ ( 0 00000)LOG: startup process (PID 21893) exited with exit code 1
2015-05-08 16:23:11 AEST @ ( 0 00000)LOG: aborting startup due to startup process failure
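The FATAL message above comes down to comparing two WAL positions: the replica's latest checkpoint (B/C2000060 on timeline 3) lies beyond the point where timeline 6's history branches off timeline 3 (B/BE000060). As a rough illustration only, the sketch below decodes those values in plain shell. The helper names `lsn_to_int` and `wal_timeline` are made up for this example, and this is not how the server itself performs the check.

```shell
#!/bin/sh
# Convert an LSN written as "hi/lo" (both parts hexadecimal) into a
# single integer byte position: hi * 2^32 + lo.
lsn_to_int() (
    hi=${1%%/*}
    lo=${1##*/}
    echo $(( (0x$hi << 32) + 0x$lo ))
)

# Extract the timeline ID (first 8 hex digits) from a WAL segment file name.
wal_timeline() (
    tli=$(printf '%s' "$1" | cut -c1-8)
    echo $(( 0x$tli ))
)

# Values copied from the log output above.
checkpoint=$(lsn_to_int "B/C2000060")   # latest checkpoint, on timeline 3
fork=$(lsn_to_int "B/BE000060")         # where the requested timeline leaves timeline 3

echo "restored segment is on timeline $(wal_timeline 000000030000000B000000C2)"
if [ "$checkpoint" -gt "$fork" ]; then
    echo "checkpoint is past the fork point, so this replica cannot follow the requested timeline"
fi
```

Because the checkpoint is already past the fork point, the replica has replayed WAL on timeline 3 that is not part of timeline 6's history, which matches the "not a child of this server's history" error.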
Cheers,
James Sewell

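OK, I think I have a solution, using the following script as the restore_command:
#!/bin/sh
# $1 is the requested file name and $2 the destination path
# (presumably passed as %f and %p from recovery.conf).
# Refuse timeline history files so the standby never sees them.
echo "$1" | grep history > /dev/null
if [ $? -eq 0 ]; then
    exit 1
fi
scp postgres@postgres3:/archived_wals/"$1" "$2"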
This seems to work: it stops the slave servers from switching timelines via the archive, because they never get to see the history files.
Can anyone see any problems with this approach?
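One detail that might be worth tightening is the match itself: `grep history` rejects any file whose name merely contains the word, whereas timeline history files end in `.history` (as in the "00000004.history" entries in the earlier log). Below is a minimal sketch of a suffix-based variant. The function names and the `FETCH_CMD` hook are hypothetical and exist only so the logic can be tried without touching the network; the default is a dry run that just echoes its arguments.

```shell
#!/bin/sh
# Succeed (return 0) only for timeline history files such as 00000006.history.
is_history_file() {
    case "$1" in
        *.history) return 0 ;;
        *)         return 1 ;;
    esac
}

# $1 = requested file name, $2 = destination path.
# FETCH_CMD is a hypothetical stand-in for the real copy command; when it
# is unset, "echo" is used so nothing is actually copied.
restore_from_archive() {
    if is_history_file "$1"; then
        return 1    # report "not found" so the standby never sees the file
    fi
    ${FETCH_CMD:-echo} "$1" "$2"
}
```

In a real wrapper the last line would be replaced by the scp call from the script above, and the script would end with `restore_from_archive "$1" "$2"` so that its exit status is passed back to the server.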
Cheers,
James Sewell

On Fri, May 8, 2015 at 2:29 PM James Sewell <james.sewell@lisasoft.com> wrote:
> At the moment the master switch works fine, but if I was to have a WAL archive with multiple timelines in it then I would end up in the newest timeline.
A timeline switch happens whenever you do a standby promotion (or PITR), in which case you want the second standby to follow the newly promoted master, i.e. it should be following the new timeline.
> I suppose what I want is the following: If I am a streaming replica only follow streamed timeline switches, not archive timeline switches.
I am trying to guess when these two would be different (unless there are two different masters writing to the same archive, which I suppose is wrong anyway).
Can you write down an example for clarity, so that others can understand your scenario better?