Thread: Postgresql 9.3.4 Streaming Replication Standby invalid Page block
PostgreSQL version: 9.3.4
Operating system: RHEL 6.4 Linux
Action: streaming replication, master/slave
Description:

These are the last entries in the PostgreSQL log file before the standby crashed. The primary seems unaffected.

LOG: restored log file "0000000100001127000000cc" from archive
FATAL: invalid page in block 464698 of relation pg_tblspc/16435/PG_9.3_201306121/16444/125127698
CONTEXT: xlog redo vacuum: rel 16435/16444/125127698; blk 512019, lastBlockVacuumed 0
LOG: startup process (PID 27797) exited with exit code 1
LOG: terminating any other active server processes

We restarted the database and the restore of the log files has continued beyond this point, but is our standby server corrupted?

Thanks
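For anyone who wants to look at the page named in the FATAL line directly, the block number can be translated into a segment file and a byte offset. The following is only a rough sketch: it assumes the server was built with the default 8 kB block size and 1 GB segment files (neither is stated in the report), and `locate_block` is a hypothetical helper name, not a PostgreSQL tool. The returned path is relative to the data directory.

```python
import re

# Assumptions (not confirmed in the thread): default build settings.
BLOCK_SIZE = 8192            # default BLCKSZ, in bytes
BLOCKS_PER_SEGMENT = 131072  # default RELSEG_SIZE: 1 GB / 8 kB

# Matches the "invalid page" message as it appears in the log above.
FATAL_RE = re.compile(r"invalid page in block (\d+) of relation (\S+)")


def locate_block(log_line):
    """Return (segment file path, byte offset) for the block in a FATAL line."""
    match = FATAL_RE.search(log_line)
    if match is None:
        raise ValueError("not an 'invalid page' log line")
    block = int(match.group(1))
    rel_path = match.group(2)
    # Relations larger than one segment are split into files named
    # <relfilenode>, <relfilenode>.1, <relfilenode>.2, ...
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    seg_path = rel_path if segment == 0 else f"{rel_path}.{segment}"
    return seg_path, block_in_segment * BLOCK_SIZE


line = ("FATAL: invalid page in block 464698 of relation "
        "pg_tblspc/16435/PG_9.3_201306121/16444/125127698")
print(locate_block(line))
```

Under those assumptions, block 464698 falls in the fourth segment file (suffix `.3`) of the relation.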
On 07/02/2014 02:03 AM, Burgess, Freddie wrote:
> PostgreSQL version: 9.3.4
> Operating system: rhel 6.4 linux
> Action: stream replication Master/Slave
> Description:
>
> Last entries in the PostgreSQL log file before the standby crashed, the primary seems unaffected
>
> LOG: restored log file "0000000100001127000000cc" from archive
> FATAL: invalid page in block 464698 of relation pg_tblspc/16435/PG_9.3_201306121/16444/125127698
> CONTEXT: xlog redo vacuum: rel 16435/16444/125127698; blk 512019, lastBlockVacuumed 0
> LOG: startup process (PID 27797) exited with exit code 1
> LOG: terminating any other active server processes
>
> We did re-started the database and the process of restoring the log file has continued beyond this point, but is are standby server corrupted?

Sounds exactly like this bug:

http://www.postgresql.org/message-id/flat/CAL_0b1s4QCkFy_55kk_8XWcJPs7wsgVWf8vn4=jXe6V4R7Hxmg@mail.gmail.com

but that was fixed in 9.3.3 already. Are you sure you're running 9.3.4 in the standby too?

- Heikki
On 2014-07-02 14:02:27 +0300, Heikki Linnakangas wrote:
> On 07/02/2014 02:03 AM, Burgess, Freddie wrote:
> > PostgreSQL version: 9.3.4
> > Operating system: rhel 6.4 linux
> > Action: stream replication Master/Slave
> > Description:
> >
> > Last entries in the PostgreSQL log file before the standby crashed, the primary seems unaffected
> >
> > LOG: restored log file "0000000100001127000000cc" from archive
> > FATAL: invalid page in block 464698 of relation pg_tblspc/16435/PG_9.3_201306121/16444/125127698
> > CONTEXT: xlog redo vacuum: rel 16435/16444/125127698; blk 512019, lastBlockVacuumed 0
> > LOG: startup process (PID 27797) exited with exit code 1
> > LOG: terminating any other active server processes
> >
> > We did re-started the database and the process of restoring the log file has continued beyond this point, but is are standby server corrupted?

Do you run with data checksums enabled?

> Sounds exactly like this bug:
>
> http://www.postgresql.org/message-id/flat/CAL_0b1s4QCkFy_55kk_8XWcJPs7wsgVWf8vn4=jXe6V4R7Hxmg@mail.gmail.com
>
> but that was fixed in 9.3.3 already. Are you sure you're running 9.3.4 in
> the standby too?

Hm - that bug was about uninitialized pages, not invalid ones. I don't
immediately see why it'd be legal to have an invalid page (as in
!PageIsVerified()) somewhere? At least not after reaching consistency.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
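To make the distinction between an uninitialized page and an invalid one concrete, here is a simplified Python approximation of the header sanity test that PageIsVerified() performs. It is a sketch and not the server code: it ignores data checksums entirely, and it assumes an 8 kB block size, 8-byte maximum alignment and little-endian layout (typical for x86_64). `classify_page` and `HEADER` are illustrative names only.

```python
import struct

# Assumptions (not confirmed in the thread): default build on x86_64.
BLOCK_SIZE = 8192            # default BLCKSZ
MAXALIGN = 8                 # maximum alignment on typical 64-bit builds
PD_VALID_FLAG_BITS = 0x0007  # flag bits defined for the page header

# First 24 bytes of a page: pd_lsn (8 bytes), pd_checksum, pd_flags,
# pd_lower, pd_upper, pd_special, pd_pagesize_version, pd_prune_xid.
HEADER = struct.Struct("<QHHHHHHI")


def classify_page(page):
    """Return 'valid', 'uninitialized' or 'invalid' for one raw page."""
    if len(page) != BLOCK_SIZE:
        raise ValueError("expected exactly one block")
    _, _, flags, lower, upper, special, _, _ = HEADER.unpack_from(page)
    if upper != 0:  # a page with pd_upper == 0 counts as "new"
        header_sane = (
            (flags & ~PD_VALID_FLAG_BITS) == 0
            and lower <= upper <= special <= BLOCK_SIZE
            and special % MAXALIGN == 0
        )
        if header_sane:
            return "valid"
    # An all-zero page is accepted as an uninitialized (new) page.
    if not any(page):
        return "uninitialized"
    return "invalid"


def make_page(lower, upper, special):
    """Build a test page with the given header offsets and a zeroed body."""
    header = HEADER.pack(0, 0, 0, lower, upper, special, 0, 0)
    return header + bytes(BLOCK_SIZE - HEADER.size)


print(classify_page(bytes(BLOCK_SIZE)))
print(classify_page(make_page(24, 8192, 8192)))
print(classify_page(make_page(4000, 100, 8192)))
```

The three calls cover an all-zero page, a page with a plausible empty header, and a page whose pd_lower exceeds pd_upper.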
show data_checksums;
 data_checksums
----------------
 off

tabsdb=# select version();
                                                   version
--------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit

On both Master/Standby
--------------------------------------------------------------------------------------------------------------
The standby replayed all of the outstanding WAL logs overnight and we have caught up with the primary database now, and streaming replication is running fine now.

The relation "pg_tblspc/16435/PG_9.3_201306121/16444/125127698" points to a Partition tablespace with data from the year 2007. I verified that the row counts match up between the master/slave on the tables that reside on that tablespace.

Is there anything else I can do to verify the consistency on the standby?

thanks

________________________________________
From: Andres Freund [andres@2ndquadrant.com]
Sent: Wednesday, July 02, 2014 7:09 AM
To: Heikki Linnakangas
Cc: Burgess, Freddie; "PostgreSQL Bugs [pgsql-bugs@postgresql.org]"
Subject: Re: [BUGS] Postgresql 9.3.4 Streaming Replication Standby invalid Page block

Do you run with data checksums enabled?

> but that was fixed in 9.3.3 already. Are you sure you're running 9.3.4 in
> the standby too?
Today, we have the same error in the logs, but now the standby server will not restart at all. This error refers to a static partition holding historical data from 2006, so the problem has to be related to autovacuum.

FATAL: invalid page in block 420538 of relation pg_tblspc/16434/PG_9.3_201306121/16444/125127662
CONTEXT: xlog redo vacuum: rel 16434/16444/125127662; blk 582590, lastBlockVacuumed 0
LOG: startup process (PID 14307) exited with exit code 1
LOG: terminating any other active server processes

Are there any solutions?

thanks
________________________________________
From: pgsql-bugs-owner@postgresql.org [pgsql-bugs-owner@postgresql.org] on behalf of Burgess, Freddie [FBurgess@Radiantblue.com]
Sent: Wednesday, July 02, 2014 4:04 PM
To: Andres Freund; Heikki Linnakangas
Cc: "PostgreSQL Bugs [pgsql-bugs@postgresql.org]"
Subject: Re: [BUGS] Postgresql 9.3.4 Streaming Replication Standby invalid Page block

The standby replayed all of the outstanding WAL logs overnight and we have caught up with the primary database now, and streaming replication is running fine now.