BUG #10432: failed to re-find parent key in index - Mailing list pgsql-bugs
From: m.sakrejda@gmail.com
Subject: BUG #10432: failed to re-find parent key in index
Msg-id: 20140523175217.2692.50568@wrigleys.postgresql.org
Responses: Re: BUG #10432: failed to re-find parent key in index
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      10432
Logged by:          Maciek Sakrejda
Email address:      m.sakrejda@gmail.com
PostgreSQL version: 9.3.4
Operating system:   Ubuntu 12.04 LTS 64-bit

Description:

Hi,

I just got the error `failed to re-find parent key in index "1665279" for split pages 175193/193740`. From the list archives [1], it looks like there's no easy recovery, but this happened on a replica right after the recovery trigger file was touched and recovery completed, so that's not a huge concern (unless it also happens to the next replica). Is there something to be learned from the failed cluster? The only odd thing I see in the logs is that it looks like we tried to take a base backup before recovery actually completed, though the attempt appears to have failed cleanly.

Here are the logs from around the time of the failure, including WAL-E logs:

May 23 16:21:28 postgres[10]: [1244-1] LOG: trigger file found: /etc/postgresql/wal-e.d/pull-env/STANDBY_OFF
May 23 16:21:28 postgres[7545]: [5-1] FATAL: terminating walreceiver process due to administrator command
May 23 16:21:28 wal_e.operator.backup INFO MSG: begin wal restore#012 STRUCTURED: time=2014-05-23T16:21:28.169498-00 pid=7757 action=wal-fetch key=s3://.../wal_005/000000020000033900000066.lzo prefix=.../ seg=000000020000033900000066 state=begin
May 23 16:21:28 wal_e.blobstore.s3.s3_util WARNING MSG: could no longer locate object while performing wal restore#012 DETAIL: The absolute URI that could not be located is s3://.../wal_005/000000020000033900000066.lzo.#012 HINT: This can be normal when Postgres is trying to detect what timelines are available during restoration.#012 STRUCTURED: time=2014-05-23T16:21:28.364020-00 pid=7757
May 23 16:21:28 wal_e.operator.backup INFO MSG: complete wal restore#012 STRUCTURED: time=2014-05-23T16:21:28.365570-00 pid=7757 action=wal-fetch key=s3://.../wal_005/000000020000033900000066.lzo prefix=.../ seg=000000020000033900000066 state=complete
May 23 16:21:28 postgres[10]: [1245-1] LOG: redo done at 339/650000C8
May 23 16:21:28 postgres[10]: [1246-1] LOG: last completed transaction was at log time 2014-05-23 14:43:31.251167+00
May 23 16:21:28 wal_e.operator.backup INFO MSG: begin wal restore#012 STRUCTURED: time=2014-05-23T16:21:28.535075-00 pid=7763 action=wal-fetch key=s3://.../wal_005/000000020000033900000065.lzo prefix=.../ seg=000000020000033900000065 state=begin
May 23 16:21:29 wal_e.blobstore.s3.s3_util INFO MSG: completed download and decompression#012 DETAIL: Downloaded and decompressed "s3://.../wal_005/000000020000033900000065.lzo" to "pg_xlog/RECOVERYXLOG"#012 STRUCTURED: time=2014-05-23T16:21:29.094826-00 pid=7763
May 23 16:21:29 wal_e.operator.backup INFO MSG: complete wal restore#012 STRUCTURED: time=2014-05-23T16:21:29.096962-00 pid=7763 action=wal-fetch key=s3://.../wal_005/000000020000033900000065.lzo prefix=.../ seg=000000020000033900000065 state=complete
May 23 16:21:29 postgres[10]: [1247-1] LOG: restored log file "000000020000033900000065" from archive
May 23 16:21:29 wal_e.operator.backup INFO MSG: begin wal restore#012 STRUCTURED: time=2014-05-23T16:21:29.269909-00 pid=7769 action=wal-fetch key=s3://.../wal_005/00000003.history.lzo prefix=.../ seg=00000003.history state=begin
May 23 16:21:29 wal_e.blobstore.s3.s3_util WARNING MSG: could no longer locate object while performing wal restore#012 DETAIL: The absolute URI that could not be located is s3://.../wal_005/00000003.history.lzo.#012 HINT: This can be normal when Postgres is trying to detect what timelines are available during restoration.#012 STRUCTURED: time=2014-05-23T16:21:29.424936-00 pid=7769
May 23 16:21:29 wal_e.operator.backup INFO MSG: complete wal restore#012 STRUCTURED: time=2014-05-23T16:21:29.426295-00 pid=7769 action=wal-fetch key=s3://.../wal_005/00000003.history.lzo prefix=.../ seg=00000003.history state=complete
May 23 16:21:29 postgres[10]: [1248-1] LOG: selected new timeline ID: 3
May 23 16:21:29 wal_e.operator.backup INFO MSG: begin wal restore#012 STRUCTURED: time=2014-05-23T16:21:29.593188-00 pid=7775 action=wal-fetch key=s3://.../wal_005/00000002.history.lzo prefix=.../ seg=00000002.history state=begin
May 23 16:21:29 postgres[7779]: [4-1] [unknown] LOG: connection received: host=[local]
May 23 16:21:29 postgres[7779]: [5-1] postgres LOG: connection authorized: user=postgres database=postgres
May 23 16:21:29 postgres[7781]: [4-1] [unknown] LOG: connection received: host=[local]
May 23 16:21:29 postgres[7781]: [5-1] postgres LOG: connection authorized: user=postgres database=postgres
May 23 16:21:29 postgres[7781]: [6-1] postgres ERROR: recovery is in progress
May 23 16:21:29 postgres[7781]: [6-2] postgres HINT: WAL control functions cannot be executed during recovery.
May 23 16:21:29 postgres[7781]: [6-3] postgres STATEMENT: COPY (SELECT file_name, lpad(file_offset::text, 8, '0') AS file_offset FROM pg_xlogfile_name_offset( pg_start_backup('freeze_start_2014-05-23T16:21:29.681458+00:00'))) TO STDOUT WITH CSV HEADER;
May 23 16:21:29 wal_e.operator.backup WARNING MSG: blocking on sending WAL segments#012 DETAIL: The backup was not completed successfully, but we have to wait anyway. See README: TODO about pg_cancel_backup#012 STRUCTURED: time=2014-05-23T16:21:29.764625-00 pid=18790
May 23 16:21:29 postgres[7782]: [4-1] [unknown] LOG: connection received: host=[local]
May 23 16:21:29 postgres[7782]: [5-1] postgres LOG: connection authorized: user=postgres database=postgres
May 23 16:21:29 postgres[7782]: [6-1] postgres ERROR: recovery is in progress
May 23 16:21:29 postgres[7782]: [6-2] postgres HINT: WAL control functions cannot be executed during recovery.
May 23 16:21:29 postgres[7782]: [6-3] postgres STATEMENT: COPY (SELECT file_name, lpad(file_offset::text, 8, '0') AS file_offset FROM pg_xlogfile_name_offset( pg_stop_backup())) TO STDOUT WITH CSV HEADER;
May 23 16:21:29 wal_e.main ERROR MSG: Could not stop hot backup#012 STRUCTURED: time=2014-05-23T16:21:29.844455-00 pid=18790
May 23 16:21:29 wal_e.blobstore.s3.s3_util INFO MSG: completed download and decompression#012 DETAIL: Downloaded and decompressed "s3://.../wal_005/00000002.history.lzo" to "pg_xlog/RECOVERYHISTORY"#012 STRUCTURED: time=2014-05-23T16:21:29.866686-00 pid=7775
May 23 16:21:29 wal_e.operator.backup INFO MSG: complete wal restore#012 STRUCTURED: time=2014-05-23T16:21:29.868954-00 pid=7775 action=wal-fetch key=s3://.../wal_005/00000002.history.lzo prefix=.../ seg=00000002.history state=complete
May 23 16:21:29 postgres[10]: [1249-1] LOG: restored log file "00000002.history" from archive
May 23 16:21:30 postgres[10]: [1250-1] LOG: archive recovery complete
May 23 16:21:30 postgres[10]: [1251-1] FATAL: failed to re-find parent key in index "1665279" for split pages 175193/193740
May 23 16:21:31 postgres[8]: [4-1] LOG: startup process (PID 10) exited with exit code 1
May 23 16:21:31 postgres[8]: [5-1] LOG: terminating any other active server processes

[1]: http://www.postgresql.org/message-id/7753.1291075332@sss.pgh.pa.us
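As an aside on reading the logs above: the WAL segment names WAL-E is fetching (e.g. `000000020000033900000065`) follow directly from the timeline and LSN the server reports. With the default 16 MB segment size, the name is three 8-hex-digit fields: the timeline ID, the high 32 bits of the LSN, and the segment number within that 4 GB "log file". A minimal sketch (the helper name is mine, not a Postgres API; Postgres computes this internally):

```python
# Sketch: derive a WAL segment file name from a timeline ID and an
# LSN string like "339/650000C8", assuming the default 16 MB segments.
WAL_SEG_SIZE = 16 * 1024 * 1024

def wal_filename(timeline, lsn):
    log_hi, offset = lsn.split("/")
    seg = int(offset, 16) // WAL_SEG_SIZE  # segment within the 4 GB "log file"
    return "%08X%08X%08X" % (timeline, int(log_hi, 16), seg)

# "redo done at 339/650000C8" on timeline 2 falls inside the segment
# the logs show being restored from the archive:
print(wal_filename(2, "339/650000C8"))  # 000000020000033900000065
```

This also explains the earlier "could no longer locate object" warning: the segment after the redo point (`...66`) had not been archived yet, which is normal at the end of recovery.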