Please help me debug regular segfaults on 8.3.10 - Mailing list pgsql-general
From | pgsql |
---|---|
Subject | Please help me debug regular segfaults on 8.3.10 |
Date | |
Msg-id | hrq276$1lu$1@news.hub.org |
List | pgsql-general |
Hi,

one of our pgsql instances recently started to segfault multiple times a
week. I tried a couple of things to pin it down to a certain query or job
but failed to find any pattern. All I can offer is some notes and a set of
similar looking back traces.

Thanks in advance.

Machine details
---------------
* CentOS release 5.4 (Final)
* Linux 2.6.18-164.15.1.el5 #1 SMP Wed Mar 17 11:30:06 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
* 4x Quad-Core AMD Opteron 8354
* 64GB RAM (ECC)

PostgreSQL packages
-------------------
* postgresql-8.3.10-2PGDG.el5
* postgresql-contrib-8.3.10-2PGDG.el5
* postgresql-devel-8.3.10-2PGDG.el5
* postgresql-libs-8.3.10-2PGDG.el5
* postgresql-plperl-8.3.10-2PGDG.el5
* postgresql-plpython-8.3.10-2PGDG.el5
* postgresql-pltcl-8.3.10-2PGDG.el5
* postgresql-server-8.3.10-2PGDG.el5

Environment
-----------
* Multiple databases with a total of 1TB in size
* So far the back traces show three different databases
* Some larger hash indexes exist (requiring reindex after each crash)
* The only loaded PL is pl/pgsql
* The system is doing around 3000 TPS constantly

Things that didn't make any change
----------------------------------
* Updated from 8.3.7 to 8.3.10
* Updated OS kernel

2010-05-04 | core.21207
-----------------------
Core was generated by `postgres: <user> <database_1> <client ip>('.
Program terminated with signal 11, Segmentation fault.
[New process 21207]
#0  0x000000000066acae in pfree ()
(gdb) bt
#0  0x000000000066acae in pfree ()
#1  0x0000000000648c6e in ?? ()
#2  0x0000000000648f34 in ?? ()
#3  0x00000000006493d4 in RelationCacheInvalidateEntry ()
#4  0x0000000000644fcd in ?? ()
#5  0x0000000000644882 in ?? ()
#6  0x00000000006448be in CommandEndInvalidationMessages ()
#7  0x0000000000472993 in CommandCounterIncrement ()
#8  0x00000000005342ea in ?? ()
#9  0x0000000000534543 in SPI_execute_plan ()
#10 0x00002ad2863f0148 in ?? () from /usr/lib64/pgsql/plpgsql.so
#11 0x00002ad2863f1a26 in ?? () from /usr/lib64/pgsql/plpgsql.so
#12 0x00002ad2863f3372 in ?? () from /usr/lib64/pgsql/plpgsql.so
#13 0x00002ad2863f3ce5 in plpgsql_exec_function () from /usr/lib64/pgsql/plpgsql.so
#14 0x00002ad2863ea7be in plpgsql_call_handler () from /usr/lib64/pgsql/plpgsql.so
#15 0x000000000052023e in ExecMakeFunctionResult ()
#16 0x000000000051d1f3 in ExecProject ()
#17 0x000000000052df13 in ExecResult ()
#18 0x000000000051cc66 in ExecProcNode ()
#19 0x000000000051bedf in ExecutorRun ()
#20 0x00000000005b1481 in ?? ()
#21 0x00000000005b2689 in PortalRun ()
#22 0x00000000005ae3b0 in ?? ()
#23 0x00000000005af038 in PostgresMain ()
#24 0x00000000005856a7 in ?? ()
#25 0x000000000058632b in PostmasterMain ()
#26 0x000000000053eece in main ()

2010-04-29 | core.20832
-----------------------
Core was generated by `postgres: <user> <database_1> <client ip>('.
Program terminated with signal 11, Segmentation fault.
[New process 20832]
#0  0x000000000066acae in pfree ()
(gdb) bt
#0  0x000000000066acae in pfree ()
#1  0x0000000000648c6e in ?? ()
#2  0x0000000000648f34 in ?? ()
#3  0x00000000006493d4 in RelationCacheInvalidateEntry ()
#4  0x0000000000644fcd in ?? ()
#5  0x0000000000644882 in ?? ()
#6  0x00000000006448be in CommandEndInvalidationMessages ()
#7  0x0000000000472993 in CommandCounterIncrement ()
#8  0x00000000005342ea in ?? ()
#9  0x0000000000534543 in SPI_execute_plan ()
#10 0x00002b41879e1148 in ?? () from /usr/lib64/pgsql/plpgsql.so
#11 0x00002b41879e2a26 in ?? () from /usr/lib64/pgsql/plpgsql.so
#12 0x00002b41879e4372 in ?? () from /usr/lib64/pgsql/plpgsql.so
#13 0x00002b41879e4ce5 in plpgsql_exec_function () from /usr/lib64/pgsql/plpgsql.so
#14 0x00002b41879db7be in plpgsql_call_handler () from /usr/lib64/pgsql/plpgsql.so
#15 0x000000000052023e in ExecMakeFunctionResult ()
#16 0x000000000051d1f3 in ExecProject ()
#17 0x000000000052df13 in ExecResult ()
#18 0x000000000051cc66 in ExecProcNode ()
#19 0x000000000051bedf in ExecutorRun ()
#20 0x00000000005b1481 in ?? ()
#21 0x00000000005b2689 in PortalRun ()
#22 0x00000000005ae3b0 in ?? ()
#23 0x00000000005af038 in PostgresMain ()
#24 0x00000000005856a7 in ?? ()
#25 0x000000000058632b in PostmasterMain ()
#26 0x000000000053eece in main ()

2010-04-27 | core.25421
-----------------------
Core was generated by `postgres: <user> <database_1> <client ip>('.
Program terminated with signal 11, Segmentation fault.
[New process 25421]
#0  0x000000000066acae in pfree ()
(gdb) bt
#0  0x000000000066acae in pfree ()
#1  0x0000000000648c6e in ?? ()
#2  0x0000000000648f34 in ?? ()
#3  0x00000000006493d4 in RelationCacheInvalidateEntry ()
#4  0x0000000000644fcd in ?? ()
#5  0x0000000000644882 in ?? ()
#6  0x00000000006448be in CommandEndInvalidationMessages ()
#7  0x0000000000472993 in CommandCounterIncrement ()
#8  0x00000000005342ea in ?? ()
#9  0x0000000000534543 in SPI_execute_plan ()
#10 0x00002b41879e1148 in ?? () from /usr/lib64/pgsql/plpgsql.so
#11 0x00002b41879e2a26 in ?? () from /usr/lib64/pgsql/plpgsql.so
#12 0x00002b41879e4372 in ?? () from /usr/lib64/pgsql/plpgsql.so
#13 0x00002b41879e4ce5 in plpgsql_exec_function () from /usr/lib64/pgsql/plpgsql.so
#14 0x00002b41879db7be in plpgsql_call_handler () from /usr/lib64/pgsql/plpgsql.so
#15 0x000000000052023e in ExecMakeFunctionResult ()
#16 0x000000000051d1f3 in ExecProject ()
#17 0x000000000052df13 in ExecResult ()
#18 0x000000000051cc66 in ExecProcNode ()
#19 0x000000000051bedf in ExecutorRun ()
#20 0x00000000005b1481 in ?? ()
#21 0x00000000005b2689 in PortalRun ()
#22 0x00000000005ae3b0 in ?? ()
#23 0x00000000005af038 in PostgresMain ()
#24 0x00000000005856a7 in ?? ()
#25 0x000000000058632b in PostmasterMain ()
#26 0x000000000053eece in main ()

2010-04-24 | core.23631
-----------------------
Core was generated by `postgres: <user> <database_2> <client ip>('.
Program terminated with signal 11, Segmentation fault.
[New process 23631]
#0  0x000000000066acae in pfree ()
(gdb) bt
#0  0x000000000066acae in pfree ()
#1  0x0000000000648c6e in ?? ()
#2  0x0000000000648f34 in ?? ()
#3  0x00000000006493d4 in RelationCacheInvalidateEntry ()
#4  0x0000000000644fcd in ?? ()
#5  0x0000000000644882 in ?? ()
#6  0x00000000006448be in CommandEndInvalidationMessages ()
#7  0x0000000000472993 in CommandCounterIncrement ()
#8  0x00000000005342ea in ?? ()
#9  0x0000000000534543 in SPI_execute_plan ()
#10 0x00002b41879a0148 in ?? () from /usr/lib64/pgsql/plpgsql.so
#11 0x00002b41879a1a26 in ?? () from /usr/lib64/pgsql/plpgsql.so
#12 0x00002b41879a3372 in ?? () from /usr/lib64/pgsql/plpgsql.so
#13 0x00002b41879a3ce5 in plpgsql_exec_function () from /usr/lib64/pgsql/plpgsql.so
#14 0x00002b418799a7be in plpgsql_call_handler () from /usr/lib64/pgsql/plpgsql.so
#15 0x000000000052023e in ExecMakeFunctionResult ()
#16 0x000000000051d1f3 in ExecProject ()
#17 0x000000000052df13 in ExecResult ()
#18 0x000000000051cc66 in ExecProcNode ()
#19 0x000000000051bedf in ExecutorRun ()
#20 0x00000000005b1481 in ?? ()
#21 0x00000000005b2689 in PortalRun ()
#22 0x00000000005ae3b0 in ?? ()
#23 0x00000000005af038 in PostgresMain ()
#24 0x00000000005856a7 in ?? ()
#25 0x000000000058632b in PostmasterMain ()
#26 0x000000000053eece in main ()

2010-04-23 | core.9419
-----------------------
Core was generated by `postgres: <user> <database_1> <client ip>('.
Program terminated with signal 11, Segmentation fault.
[New process 9419]
#0  0x000000000066acae in pfree ()
(gdb) bt
#0  0x000000000066acae in pfree ()
#1  0x0000000000648c6e in ?? ()
#2  0x0000000000648f34 in ?? ()
#3  0x00000000006493d4 in RelationCacheInvalidateEntry ()
#4  0x0000000000644fcd in ?? ()
#5  0x0000000000644882 in ?? ()
#6  0x00000000006448be in CommandEndInvalidationMessages ()
#7  0x0000000000472993 in CommandCounterIncrement ()
#8  0x00000000005342ea in ?? ()
#9  0x0000000000534543 in SPI_execute_plan ()
#10 0x00002b3acaef4148 in ?? () from /usr/lib64/pgsql/plpgsql.so
#11 0x00002b3acaef5a26 in ?? () from /usr/lib64/pgsql/plpgsql.so
#12 0x00002b3acaef7372 in ?? () from /usr/lib64/pgsql/plpgsql.so
#13 0x00002b3acaef7ce5 in plpgsql_exec_function () from /usr/lib64/pgsql/plpgsql.so
#14 0x00002b3acaeee7be in plpgsql_call_handler () from /usr/lib64/pgsql/plpgsql.so
#15 0x000000000052023e in ExecMakeFunctionResult ()
#16 0x000000000051d1f3 in ExecProject ()
#17 0x000000000052df13 in ExecResult ()
#18 0x000000000051cc66 in ExecProcNode ()
#19 0x000000000051bedf in ExecutorRun ()
#20 0x00000000005b1481 in ?? ()
#21 0x00000000005b2689 in PortalRun ()
#22 0x00000000005ae3b0 in ?? ()
#23 0x00000000005af038 in PostgresMain ()
#24 0x00000000005856a7 in ?? ()
#25 0x000000000058632b in PostmasterMain ()
#26 0x000000000053eece in main ()

2010-04-22 | core.16801
-----------------------
Core was generated by `postgres: <user> <database_2> <client ip>('.
Program terminated with signal 11, Segmentation fault.
[New process 16801]
#0  0x000000000066acae in pfree ()
(gdb) bt
#0  0x000000000066acae in pfree ()
#1  0x0000000000648c6e in ?? ()
#2  0x0000000000648f34 in ?? ()
#3  0x00000000006493d4 in RelationCacheInvalidateEntry ()
#4  0x0000000000644fcd in ?? ()
#5  0x0000000000644882 in ?? ()
#6  0x00000000006448be in CommandEndInvalidationMessages ()
#7  0x0000000000472993 in CommandCounterIncrement ()
#8  0x00000000005342ea in ?? ()
#9  0x0000000000534543 in SPI_execute_plan ()
#10 0x00002b3acaeb3148 in ?? () from /usr/lib64/pgsql/plpgsql.so
#11 0x00002b3acaeb4a26 in ?? () from /usr/lib64/pgsql/plpgsql.so
#12 0x00002b3acaeb6372 in ?? () from /usr/lib64/pgsql/plpgsql.so
#13 0x00002b3acaeb6ce5 in plpgsql_exec_function () from /usr/lib64/pgsql/plpgsql.so
#14 0x00002b3acaead7be in plpgsql_call_handler () from /usr/lib64/pgsql/plpgsql.so
#15 0x000000000052023e in ExecMakeFunctionResult ()
#16 0x000000000051d1f3 in ExecProject ()
#17 0x000000000052df13 in ExecResult ()
#18 0x000000000051cc66 in ExecProcNode ()
#19 0x000000000051bedf in ExecutorRun ()
#20 0x00000000005b1481 in ?? ()
#21 0x00000000005b2689 in PortalRun ()
#22 0x00000000005ae3b0 in ?? ()
#23 0x00000000005af038 in PostgresMain ()
#24 0x00000000005856a7 in ?? ()
#25 0x000000000058632b in PostmasterMain ()
#26 0x000000000053eece in main ()

2010-04-15 | core.32242
-----------------------
Core was generated by `postgres: <user> <database_3> <client ip>('.
Program terminated with signal 11, Segmentation fault.
[New process 32242]
#0  0x000000000066acae in pfree ()
(gdb) bt
#0  0x000000000066acae in pfree ()
#1  0x0000000000648c6e in ?? ()
#2  0x0000000000648f34 in ?? ()
#3  0x00000000006493d4 in RelationCacheInvalidateEntry ()
#4  0x0000000000644fcd in ?? ()
#5  0x0000000000644882 in ?? ()
#6  0x00000000006448be in CommandEndInvalidationMessages ()
#7  0x0000000000472993 in CommandCounterIncrement ()
#8  0x0000000000525c25 in fmgr_sql ()
#9  0x000000000052023e in ExecMakeFunctionResult ()
#10 0x000000000051d1f3 in ExecProject ()
#11 0x000000000052df13 in ExecResult ()
#12 0x000000000051cc66 in ExecProcNode ()
#13 0x000000000051bedf in ExecutorRun ()
#14 0x00000000005b1481 in ?? ()
#15 0x00000000005b2689 in PortalRun ()
#16 0x00000000005ae3b0 in ?? ()
#17 0x00000000005af038 in PostgresMain ()
#18 0x00000000005856a7 in ?? ()
#19 0x000000000058632b in PostmasterMain ()
#20 0x000000000053eece in main ()

2010-04-14 | core.10776
-----------------------
Core was generated by `postgres: <user> <database_1> <client ip>('.
Program terminated with signal 11, Segmentation fault.
[New process 10776]
#0  0x000000000066acae in pfree ()
(gdb) bt
#0  0x000000000066acae in pfree ()
#1  0x0000000000648c6e in ?? ()
#2  0x0000000000648f34 in ?? ()
#3  0x00000000006493d4 in RelationCacheInvalidateEntry ()
#4  0x0000000000644fcd in ?? ()
#5  0x0000000000644882 in ?? ()
#6  0x00000000006448be in CommandEndInvalidationMessages ()
#7  0x0000000000472993 in CommandCounterIncrement ()
#8  0x00000000005342ea in ?? ()
#9  0x0000000000534543 in SPI_execute_plan ()
#10 0x00002b3acaeb3148 in ?? () from /usr/lib64/pgsql/plpgsql.so
#11 0x00002b3acaeb4a26 in ?? () from /usr/lib64/pgsql/plpgsql.so
#12 0x00002b3acaeb6372 in ?? () from /usr/lib64/pgsql/plpgsql.so
#13 0x00002b3acaeb6ce5 in plpgsql_exec_function () from /usr/lib64/pgsql/plpgsql.so
#14 0x00002b3acaead7be in plpgsql_call_handler () from /usr/lib64/pgsql/plpgsql.so
#15 0x000000000052023e in ExecMakeFunctionResult ()
#16 0x000000000051d1f3 in ExecProject ()
#17 0x000000000052df13 in ExecResult ()
#18 0x000000000051cc66 in ExecProcNode ()
#19 0x000000000051bedf in ExecutorRun ()
#20 0x00000000005b1481 in ?? ()
#21 0x00000000005b2689 in PortalRun ()
#22 0x00000000005ae3b0 in ?? ()
#23 0x00000000005af038 in PostgresMain ()
#24 0x00000000005856a7 in ?? ()
#25 0x000000000058632b in PostmasterMain ()
#26 0x000000000053eece in main ()
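
The plpgsql.so addresses differ from core to core, so it is hard to tell by eye whether the traces share one call path. One way to check is to normalise each trace and group the cores by the result. What follows is only a rough Python sketch and not part of the original report. `frame_signature` and `group_cores` are hypothetical helper names, the sample input is a short excerpt of frames #7 onwards from three of the cores, and it assumes 4 KiB pages, so that the low 12 bits of a shared-library address stay the same regardless of where the library was mapped.

```python
import re
from collections import defaultdict

# One gdb frame, e.g. "#10 0x00002ad2863f0148 in ?? () from /usr/lib64/pgsql/plpgsql.so"
FRAME_RE = re.compile(
    r"#(\d+)\s+(0x[0-9a-f]+)\s+in\s+(\S+)\s+\(\)(?:\s+from\s+(\S+))?"
)


def frame_signature(bt_text):
    """Return a tuple describing the call path of one gdb backtrace."""
    # gdb prints frame #0 once on load and again after "bt"; keep only the latter.
    body = bt_text.split("(gdb) bt")[-1]
    signature = []
    for _num, addr, symbol, lib in FRAME_RE.findall(body):
        if symbol != "??":
            # Resolved frame: the symbol name is enough.
            signature.append(symbol)
        elif lib:
            # Unresolved frame in a shared library: the load address varies,
            # so keep only the library name and the offset within the page
            # (assumption: 4 KiB pages).
            lib_name = lib.rsplit("/", 1)[-1]
            signature.append(f"{lib_name}@page+0x{int(addr, 16) & 0xFFF:03x}")
        else:
            # Unresolved frame in the main binary: the address is stable.
            signature.append(addr)
    return tuple(signature)


def group_cores(backtraces):
    """Map each distinct call-path signature to the cores that produced it."""
    groups = defaultdict(list)
    for core_name, text in backtraces.items():
        groups[frame_signature(text)].append(core_name)
    return dict(groups)


if __name__ == "__main__":
    # Excerpts only (frames #7 to #10 or #9), copied from the traces above.
    sample = {
        "core.21207": "#7 0x0000000000472993 in CommandCounterIncrement () "
                      "#8 0x00000000005342ea in ?? () "
                      "#9 0x0000000000534543 in SPI_execute_plan () "
                      "#10 0x00002ad2863f0148 in ?? () from /usr/lib64/pgsql/plpgsql.so",
        "core.20832": "#7 0x0000000000472993 in CommandCounterIncrement () "
                      "#8 0x00000000005342ea in ?? () "
                      "#9 0x0000000000534543 in SPI_execute_plan () "
                      "#10 0x00002b41879e1148 in ?? () from /usr/lib64/pgsql/plpgsql.so",
        "core.32242": "#7 0x0000000000472993 in CommandCounterIncrement () "
                      "#8 0x0000000000525c25 in fmgr_sql () "
                      "#9 0x000000000052023e in ExecMakeFunctionResult ()",
    }
    for signature, cores in group_cores(sample).items():
        print(cores, "->", " <- ".join(signature))
```

On these excerpts the two pl/pgsql cores fall into one group and the `fmgr_sql` core into another. With debuginfo installed the `??` frames should resolve to symbol names and the address handling becomes unnecessary.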