BUG #8579: CoreDump of background writer process - Mailing list pgsql-bugs
From | rgr@cslab.de |
---|---|
Subject | BUG #8579: CoreDump of background writer process |
Date | |
Msg-id | E1Vdyst-000374-Fd@wrigleys.postgresql.org Whole thread Raw |
Responses |
Re: BUG #8579: CoreDump of background writer process
Re: BUG #8579: CoreDump of background writer process |
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 8579 Logged by: Rene Gruen Email address: rgr@cslab.de PostgreSQL version: 9.0.13 Operating system: QNX 6.5 Description: Greetings everybody. We are using postgresql for round about 10 years in our software. Starting with our update from version 8.3 to 9.0 we are receveing coredumps of the server process. This seems to happen while inserting different data into different tables via COPY. I do not add some example data / table-descriptions, because the error does not ocure each time (also using the same data and the same table) I have seen the core using ecpg (examples one and two) and using psql (example 3). Each time there is a message like "could not seek to block XXX in file YYY" just before the server terminates. We are using QNX 6.5 SP 1 and a port based on the pkgsrc. I hope the attached data will help to locate the problem. Example one (using a "normal" version): Logfile: ---633794700-2013-11-01 00:09:58 CET:CONTEXT: automatic analyze of table "gltdb.glttab001.glt_dp_para_trend_1hour" ---1507368-2013-11-01 00:10:23 CET:LOG: received SIGHUP, reloading configuration files ---1519657-2013-11-01 00:10:23 CET:ERROR: could not seek to block 149256 in file "pg_tblspc/17639/PG_9.0_201008051/16387/28109.1": Interrupted function call ---1519657-2013-11-01 00:10:23 CET:CONTEXT: writing block 149256 of relation pg_tblspc/17639/PG_9.0_201008051/16387/28109 ---1507368-2013-11-01 00:11:58 CET:LOG: background writer process (PID 1519657) was terminated by signal 11: segmentation violation ---1507368-2013-11-01 00:11:58 CET:LOG: terminating any other active server processes contents of the dump-file: #0 0x00000000 in ?? () #1 0x08205ef4 in BackgroundWriterMain () #2 0x080e2759 in AuxiliaryProcessMain () #3 0x0820bb87 in StartChildProcess () #4 0x0820deca in reaper () #5 <signal handler called> #6 0xb033e412 in SignalWaitinfo () from /lib/libc.so.3 #7 0xb0371ef2 in _poll () from /lib/libc.so.3 #8 0xb0372253 in select () from /lib/libc.so.3 #9 0x0820f46b in PostmasterMain () #10 0x081b6bfb in main () Example two (using a debug-version): Logfile: ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:31:46 CET:LOG: duration: 65108.037 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000002' ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:33:21 CET:LOG: duration: 94909.477 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000003' ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:35:24 CET:LOG: duration: 122674.228 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000004' ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:37:44 CET:LOG: duration: 140464.505 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000005' ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:40:06 CET:LOG: duration: 141908.285 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000006' ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:42:34 CET:LOG: duration: 147574.417 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000007' ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:44:46 CET:LOG: duration: 131793.832 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000008' ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:48:08 CET:LOG: duration: 202173.063 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000009' ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:51:25 CET:LOG: duration: 196979.858 ms execute copy_stmt: COPY glt_dp_para_trend_1min FROM '/tmp/odbc5689466-65732-1-000010' ---1478697-2013-11-05 16:53:23 CET:ERROR: could not seek to block 267907 in file "pg_tblspc/17639/PG_9.0_201008051/16387/28106.2": Interrupted function call ---1478697-2013-11-05 16:53:23 CET:CONTEXT: writing block 267907 of relation pg_tblspc/17639/PG_9.0_201008051/16387/28106 ---1466408-2013-11-05 16:54:52 CET:LOG: background writer process (PID 1478697) was terminated by signal 11: segmentation violation ---1466408-2013-11-05 16:54:52 CET:LOG: terminating any other active server processes ---1184063623-2013-11-05 16:54:52 CET:WARNING: terminating connection because of crash of another server process ---1184063623-2013-11-05 16:54:52 CET:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. ---1184063623-2013-11-05 16:54:52 CET:HINT: In a moment you should be able to reconnect to the database and repeat your command. GLTlogger[2777125]-gltdb-[local]-2908210-2013-11-05 16:54:52 CET:WARNING: terminating connection because of crash of another server process GLTlogger[2777125]-gltdb-[local]-2908210-2013-11-05 16:54:52 CET:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. GLTlogger[2777125]-gltdb-[local]-2908210-2013-11-05 16:54:52 CET:HINT: In a moment you should be able to reconnect to the database and repeat your command. ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:54:52 CET:WARNING: terminating connection because of crash of another server process ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:54:52 CET:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:54:52 CET:HINT: In a moment you should be able to reconnect to the database and repeat your command. ODBChist[5689466]-gltdb-[local]-9003121-2013-11-05 16:54:52 CET:CONTEXT: COPY glt_dp_para_trend_1min, line 117201: "65732 2012-12-19 13:45:00+01 23016900.005000 1 I" ---1466408-2013-11-05 16:54:53 CET:LOG: all server processes terminated; reinitializing ---1752227881-2013-11-05 16:54:53 CET:LOG: database system was interrupted; last known up at 2013-11-05 16:52:06 CET ---1752227881-2013-11-05 16:54:53 CET:LOG: database system was not properly shut down; automatic recovery in progress ---1752227881-2013-11-05 16:54:53 CET:LOG: redo starts at 38/D300E054 [unknown]-gltdb-[local]-1752231980-2013-11-05 16:55:00 CET:FATAL: the database system is in recovery mode ---1752227881-2013-11-05 16:55:10 CET:LOG: invalid magic number 0000 in log file 56, segment 221, offset 11935744 ---1752227881-2013-11-05 16:55:10 CET:LOG: redo done at 38/DDB60ED4 ---1752227881-2013-11-05 16:55:10 CET:LOG: last completed transaction was at log time 2013-11-05 16:53:27.692009+01 ---1752391757-2013-11-05 16:55:13 CET:LOG: autovacuum launcher started ---1466408-2013-11-05 16:55:13 CET:LOG: database system is ready to accept connections coredump: Program terminated with signal 11, Segmentation fault. [New pid 1478697 tid 1] #0 0x00000000 in ?? () (gdb) bt #0 0x00000000 in ?? () #1 0x08205ef4 in BackgroundWriterMain () at bgwriter.c:329 #2 0x080e2759 in AuxiliaryProcessMain (argc=2, argv=0x8047460) at bootstrap.c:417 #3 0x0820bb87 in StartChildProcess (type=<value optimized out>) at postmaster.c:4458 #4 0x0820deca in reaper (postgres_signal_arg=18) at postmaster.c:2418 #5 <signal handler called> #6 0xb033e412 in SignalWaitinfo () from /lib/libc.so.3 #7 0xb0371ef2 in _poll () from /lib/libc.so.3 #8 0xb0372253 in select () from /lib/libc.so.3 #9 0x0820f46b in PostmasterMain (argc=8, argv=0x84b5100) at postmaster.c:1414 #10 0x081b6bfb in main (argc=8, argv=0x84b5100) at main.c:188 (gdb) up #1 0x08205ef4 in BackgroundWriterMain () at bgwriter.c:329 329 bgwriter.c: No such file or directory. in bgwriter.c 329@bgwriter.c: MemoryContextResetAndDeleteChildren(bgwriter_context); (gdb) print bgwriter_context $1 = (MemoryContext) 0x84c0c00 (gdb) print *bgwriter_context $3 = {type = T_Invalid, methods = 0x84c0d00, parent = 0x10, firstchild = 0x0, nextchild = 0x0, name = 0x0} (gdb) Example Tree (debug version): psql-gltdb-[local]-29880389-2013-11-05 22:27:39 CET:LOG: duration: 244502.586 ms statement: COPY glttab018.pdbt_d4_contobject FROM STDIN; psql-gltdb-[local]-30502981-2013-11-05 22:32:14 CET:LOG: duration: 274090.058 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-31166533-2013-11-05 22:36:52 CET:LOG: duration: 276984.615 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-31858757-2013-11-05 22:41:40 CET:LOG: duration: 288205.898 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-32526405-2013-11-05 22:46:31 CET:LOG: duration: 290317.574 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-33275974-2013-11-05 22:51:23 CET:LOG: duration: 291812.346 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-33972293-2013-11-05 22:56:12 CET:LOG: duration: 288272.888 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-34664517-2013-11-05 23:01:19 CET:LOG: duration: 306371.118 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-35409990-2013-11-05 23:06:18 CET:LOG: duration: 298632.302 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-36130886-2013-11-05 23:11:18 CET:LOG: duration: 298665.298 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-36851782-2013-11-05 23:16:22 CET:LOG: duration: 304064.471 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-37576773-2013-11-05 23:21:38 CET:LOG: duration: 315617.703 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-38326341-2013-11-05 23:26:54 CET:LOG: duration: 315162.772 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-39100485-2013-11-05 23:32:13 CET:LOG: duration: 317958.345 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-39850053-2013-11-05 23:37:45 CET:LOG: duration: 331930.207 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-40628293-2013-11-05 23:43:03 CET:LOG: duration: 317776.373 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; psql-gltdb-[local]-41427014-2013-11-05 23:48:48 CET:LOG: duration: 343849.383 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; ---1634347-2013-11-05 23:50:58 CET:ERROR: could not seek to block 13243 in file "pg_tblspc/78600/PG_9.0_201008051/16387/79270": Interrupted function call ---1634347-2013-11-05 23:50:58 CET:CONTEXT: writing block 13243 of relation pg_tblspc/78600/PG_9.0_201008051/16387/79270 psql-gltdb-[local]-42233926-2013-11-05 23:57:52 CET:LOG: duration: 544057.747 ms statement: COPY glttab018.pdbt_d4_contparameter FROM STDIN; ---1613866-2013-11-05 23:59:21 CET:LOG: background writer process (PID 1634347) was terminated by signal 11: segmentation violation ---1613866-2013-11-05 23:59:21 CET:LOG: terminating any other active server processes psql-gltdb-[local]-43360325-2013-11-05 23:59:21 CET:WARNING: terminating connection because of crash of another server process psql-gltdb-[local]-43360325-2013-11-05 23:59:21 CET:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. psql-gltdb-[local]-43360325-2013-11-05 23:59:21 CET:HINT: In a moment you should be able to reconnect to the database and repeat your command. psql-gltdb-[local]-43360325-2013-11-05 23:59:21 CET:CONTEXT: COPY pdbt_d4_contparameter, line 82696: "919360 14666 2 0 1 1 0 0 0 Kühlzeit 0 0 0 I 10001 5224 0 -1" ---1613866-2013-11-05 23:59:23 CET:LOG: all server processes terminated; reinitializing [unknown]-gltdb-[local]-43606062-2013-11-05 23:59:24 CET:FATAL: the database system is in recovery mode ---43606059-2013-11-05 23:59:24 CET:LOG: database system was interrupted; last known up at 2013-11-05 23:50:10 CET ---1613866-2013-11-05 23:59:24 CET:LOG: received SIGHUP, reloading configuration files ---43606059-2013-11-05 23:59:25 CET:LOG: could not remove cache file "base/849886/pg_internal.init": Interrupted function call ---43606059-2013-11-05 23:59:26 CET:LOG: database system was not properly shut down; automatic recovery in progress ---43606059-2013-11-05 23:59:26 CET:LOG: redo starts at 14/B300F554 ---1613866-2013-11-05 23:59:27 CET:LOG: received SIGHUP, reloading configuration files ---1613866-2013-11-05 23:59:27 CET:LOG: parameter "autovacuum" changed to "on" [unknown]-gltdb-[local]-43773970-2013-11-05 23:59:32 CET:FATAL: the database system is in recovery mode [unknown]-gltdb-[local]-43810834-2013-11-05 23:59:33 CET:FATAL: the database system is in recovery mode [unknown]-gltdb-[local]-44539922-2013-11-06 00:04:35 CET:FATAL: the database system is in recovery mode ---43606059-2013-11-06 00:07:24 CET:LOG: could not open file "pg_xlog/0000000100000014000000C1" (log file 20, segment 193): No such file or directory ---43606059-2013-11-06 00:07:24 CET:LOG: redo done at 14/C0FFFFB0 ---43606059-2013-11-06 00:07:24 CET:LOG: last completed transaction was at log time 2013-11-05 23:57:52.607898+01 ---44941357-2013-11-06 00:07:29 CET:LOG: autovacuum launcher started ---1613866-2013-11-06 00:07:29 CET:LOG: database system is ready to accept connections ---1613866-2013-11-06 00:13:02 CET:LOG: received SIGHUP, reloading configuration files vacuumdb-gltdb-[local]-45871174-2013-11-06 01:14:08 CET:WARNING: relation "pdbt_d4_contparameter" page 21075 is uninitialized --- fixing vacuumdb-gltdb-[local]-45871174-2013-11-06 01:14:08 CET:WARNING: relation "pdbt_d4_contparameter" page 21076 is uninitialized --- fixing vacuumdb-gltdb-[local]-45871174-2013-11-06 01:14:08 CET:WARNING: relation "pdbt_d4_contparameter" page 21077 is uninitialized --- fixing vacuumdb-gltdb-[local]-45871174-2013-11-06 01:18:02 CET:LOG: duration: 3896493.745 ms statement: VACUUM (ANALYZE); contents of the core file: Program terminated with signal 11, Segmentation fault. [New pid 1634347 tid 1] #0 0x00000000 in ?? () (gdb) bt #0 0x00000000 in ?? () #1 0x08205ef4 in BackgroundWriterMain () at bgwriter.c:329 #2 0x080e2759 in AuxiliaryProcessMain (argc=2, argv=0x8047460) at bootstrap.c:417 #3 0x0820bb87 in StartChildProcess (type=<value optimized out>) at postmaster.c:4458 #4 0x0820deca in reaper (postgres_signal_arg=18) at postmaster.c:2418 #5 <signal handler called> #6 0xb033e412 in SignalWaitinfo () from /lib/libc.so.3 #7 0xb0371ef2 in _poll () from /lib/libc.so.3 #8 0xb0372253 in select () from /lib/libc.so.3 #9 0x0820f46b in PostmasterMain (argc=8, argv=0x84b5100) at postmaster.c:1414 #10 0x081b6bfb in main (argc=8, argv=0x84b5100) at main.c:188 (gdb) up #1 0x08205ef4 in BackgroundWriterMain () at bgwriter.c:329 329 bgwriter.c: No such file or directory. in bgwriter.c (gdb) print *bgwriter_context $1 = {type = T_Invalid, methods = 0x84c0d00, parent = 0x10, firstchild = 0x0, nextchild = 0x0, name = 0x0}
pgsql-bugs by date: