Re: Switching timeline over streaming replication - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Switching timeline over streaming replication
Date	December 26, 2012 20:31:38
Msg-id	50DB5EA9.7010406@vmware.com
In response to	Re: Switching timeline over streaming replication (Fujii Masao <masao.fujii@gmail.com>)
List	pgsql-hackers

On 23.12.2012 16:37, Fujii Masao wrote:
> On Fri, Dec 21, 2012 at 1:48 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
>> On Sat, Dec 15, 2012 at 9:36 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
>>> I found another "requested timeline does not contain minimum recovery point"
>>> error scenario in HEAD:
>>>
>>> 1. Set up the master 'M', one standby 'S1', and one cascade standby 'S2'.
>>> 2. Shutdown the master 'M' and promote the standby 'S1', and wait for 'S2'
>>>      to reconnect to 'S1'.
>>> 3. Set up new cascade standby 'S3' connecting to 'S2'.
>>>      Then 'S3' fails to start the recovery because of the following error:
>>>
>>>      FATAL:  requested timeline 2 does not contain minimum recovery point 0/3000000 on timeline 1
>>>      LOG:  startup process (PID 33104) exited with exit code 1
>>>      LOG:  aborting startup due to startup process failure
>>>
>>> The result of pg_controldata of 'S3' is:
>>>
>>> Latest checkpoint location:           0/3000088
>>> Prior checkpoint location:            0/2000060
>>> Latest checkpoint's REDO location:    0/3000088
>>> Latest checkpoint's REDO WAL file:    000000020000000000000003
>>> Latest checkpoint's TimeLineID:       2
>>> <snip>
>>> Min recovery ending location:         0/3000000
>>> Min recovery ending loc's timeline:   1
>>> Backup start location:                0/0
>>> Backup end location:                  0/0
>>>
>>> The content of the timeline history file '00000002.history' is:
>>>
>>> 1       0/3000088       no recovery target specified
>>
>> I still could reproduce this problem. Attached is the shell script
>> which reproduces the problem.
>
> This problem happens when new standby starts up from the backup
> taken from another standby and its recovery starts from the shutdown
> checkpoint record which causes timeline switch. In this case,
> the timeline of minimum recovery point can be different from that of
> latest checkpoint (i.e., shutdown checkpoint). But the following check
> in StartupXLOG() assumes that they are always the same wrongly.
> So the problem happens.
>
>     /*
>      * The min recovery point should be part of the requested timeline's
>      * history, too.
>      */
>     if (!XLogRecPtrIsInvalid(ControlFile->minRecoveryPoint) &&
>         tliOfPointInHistory(ControlFile->minRecoveryPoint - 1, expectedTLEs) !=
>             ControlFile->minRecoveryPointTLI)
>         ereport(FATAL,
>                 (errmsg("requested timeline %u does not contain minimum recovery point %X/%X on timeline %u",
>                         recoveryTargetTLI,
>                         (uint32) (ControlFile->minRecoveryPoint >> 32),
>                         (uint32) ControlFile->minRecoveryPoint,
>                         ControlFile->minRecoveryPointTLI)));

No, it doesn't assume that min recovery point is on the same timeline as 
the checkpoint record. This is another variant of the "timeline history 
files are not included in the backup" problem discussed on the other 
thread with subject "pg_basebackup from cascading standby after timeline 
switch". If you remove the min recovery point check above, the test case 
still fails, with a different error message:

LOG:  unexpected timeline ID 1 in log segment 000000020000000000000003, offset 0

If you modify the test script to copy the 00000002.history file to 
data-standby3/pg_xlog after running pg_basebackup, the test case works. 
(We still need to fix it, of course.)
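
To illustrate why the missing history file is what trips the check, here is a
minimal, self-contained sketch. It is not the actual xlog.c code: the names
HistoryEntry, tli_of_point and min_recovery_point_ok are hypothetical, and it
assumes that when 00000002.history cannot be found, the expected timeline list
degenerates to a single entry for timeline 2 with no known start or end. The
LSNs are the ones from the pg_controldata output and the history file quoted
above.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* One timeline's span of WAL. 'end' is exclusive; Invalid means unbounded. */
typedef struct
{
    TimeLineID  tli;
    XLogRecPtr  begin;
    XLogRecPtr  end;
} HistoryEntry;

/*
 * Return the timeline that contains 'ptr' according to 'history', or 0 if
 * no entry covers it (a simplification; the real code reports an error).
 */
TimeLineID
tli_of_point(XLogRecPtr ptr, const HistoryEntry *history, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        int after_begin = history[i].begin == InvalidXLogRecPtr ||
                          history[i].begin <= ptr;
        int before_end = history[i].end == InvalidXLogRecPtr ||
                         ptr < history[i].end;

        if (after_begin && before_end)
            return history[i].tli;
    }
    return 0;
}

/* Same shape as the quoted check: 1 means recovery may proceed. */
int
min_recovery_point_ok(XLogRecPtr minRecoveryPoint,
                      TimeLineID minRecoveryPointTLI,
                      const HistoryEntry *history, size_t n)
{
    if (minRecoveryPoint == InvalidXLogRecPtr)
        return 1;
    return tli_of_point(minRecoveryPoint - 1, history, n) ==
           minRecoveryPointTLI;
}

/* History as described by 00000002.history: timeline 1 ends at 0/3000088. */
const HistoryEntry with_history[] = {
    {2, 0x3000088, InvalidXLogRecPtr},
    {1, InvalidXLogRecPtr, 0x3000088},
};

/* Assumed fallback when the history file is absent from the backup. */
const HistoryEntry without_history[] = {
    {2, InvalidXLogRecPtr, InvalidXLogRecPtr},
};
```

With with_history, the min recovery point 0/3000000 resolves to timeline 1 and
the check passes even though the checkpoint is on timeline 2. With
without_history, every location appears to belong to timeline 2, so the same
check fails.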

- Heikki
