Re: The segmentation fault of Postgresql 9.6.24 - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: The segmentation fault of Postgresql 9.6.24 |
Date | |
Msg-id | 9c8b5912-9290-420c-57a4-c6185397c74a@enterprisedb.com |
In response to | The segmentation fault of Postgresql 9.6.24 (Kevin Wang <kevinpgcloud@gmail.com>) |
Responses | Re: The segmentation fault of Postgresql 9.6.24 |
List | pgsql-hackers |
On 12/28/23 21:09, Kevin Wang wrote:
> Hello hackers,
>
> Our prod databases are still PG 9.6.24. We have one primary plus 3
> stream replications that are all working well for a long time.

Everything is working well until the day it breaks ...

> However, when I promoted one standby database to the primary role,
> we got the below error message from the PG log:
> =======================
> 2023-12-01 06:57:35.541 UTC,,,1553,,6569738f.611,639,,2023-12-01
> 05:47:59 UTC,,0,LOG,00000,"server process (PID 31839) was terminated by
> signal 11: Segmentation fault","Failed process was running: UPDATE xxxx
> SET employee_id = (9489910) WHERE id = (1162120221)",,,,,,,,""
>
> Here is the message from dmesg:
> =======================
> [ 3676.406247] postgres[27789]: segfault at 0 ip 00005618bf79bfe4 sp
> 00007ffcd9a75dc8 error 4 in postgres[5618bf3db000+3f7000]
> [ 3676.406265] Code: ff ff 48 83 c2 40 ff d0 e8 19 9c ff ff e8 44 0f c4
> ff 0f 1f 40 00 f3 0f 1e fa e9 27 be cc ff 0f 1f 80 00 00 00 00 f3 0f 1e
> fa <0f> b6 17 89 d1 83 e1 03 80 f9 02 74 0f 80 fa 01 74 0a 48 89 f8 c3
> [ 3715.937850] postgres[27928]: segfault at 0 ip 00005618bf79bfe4 sp
> 00007ffcd9a75dc8 error 4 in postgres[5618bf3db000+3f7000]
> [ 3715.937858] Code: ff ff 48 83 c2 40 ff d0 e8 19 9c ff ff e8 44 0f c4
> ff 0f 1f 40 00 f3 0f 1e fa e9 27 be cc ff 0f 1f 80 00 00 00 00 f3 0f 1e
> fa <0f> b6 17 89 d1 83 e1 03 80 f9 02 74 0f 80 fa 01 74 0a 48 89 f8 c3
> [ 3732.278367] postgres[28212]: segfault at 0 ip 00005618bf79bfe4 sp
> 00007ffcd9a75dc8 error 4 in postgres[5618bf3db000+3f7000]
> [ 3732.278384] Code: ff ff 48 83 c2 40 ff d0 e8 19 9c ff ff e8 44 0f c4
> ff 0f 1f 40 00 f3 0f 1e fa e9 27 be cc ff 0f 1f 80 00 00 00 00 f3 0f 1e
> fa <0f> b6 17 89 d1 83 e1 03 80 f9 02 74 0f 80 fa 01 74 0a 48 89 f8 c3
>
> Error 4 is the error related to unmapping memory. But the database works
> well for long time as the standby database. After it was promoted to the
> primary role, no memory parameter change at all.
>

Why do you think "4" means unmapping memory? 4 is the error code for
"user-mode access" (i.e. the invalid memory access came from user space,
not from the kernel).

> Could you give us some hint where to fix this issue?
>

This could be pretty much anything, and without seeing where exactly it
fails it's impossible to say. I see you apparently hit the issue
repeatedly, and all the information is *exactly* the same - addresses,
code, etc.

Try decoding the addresses with addr2line, or even better get a proper
backtrace - either from a core file, or using gdb.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
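As a rough illustration of the two points above (what "error 4" encodes, and which address to hand to addr2line), here is a minimal Python sketch. It assumes the usual x86 page-fault error-code bit layout and the common Linux dmesg segfault line format; the helper names `decode_error_code` and `parse_segfault` are made up for this example. It also assumes the value in `postgres[base+size]` can be treated as the load base of a position-independent binary, which may not hold if the text segment is mapped separately.

```python
import re

# One dmesg entry from the report, joined onto a single line.
LINE = ("[ 3676.406247] postgres[27789]: segfault at 0 ip 00005618bf79bfe4 "
        "sp 00007ffcd9a75dc8 error 4 in postgres[5618bf3db000+3f7000]")

# All numeric fields in this message are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_error_code(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
        "instruction_fetch": bool(code & 16),
    }


def parse_segfault(line):
    """Extract the fault details and the ip offset inside the mapping."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "object": m["obj"],
        "fault_address": int(m["addr"], 16),
        "error": decode_error_code(int(m["err"], 16)),
        # Assumption: the mapping start is the load base of the binary.
        "offset": ip - base,
    }


if __name__ == "__main__":
    info = parse_segfault(LINE)
    print(info["error"])
    print(hex(info["offset"]))  # 0x3c0fe4
```

For this report the decoding gives a user-mode read of a page that is not present, at address 0, which is what a NULL pointer dereference looks like. The resulting offset could then be tried with something like `addr2line -f -e /path/to/postgres 0x3c0fe4` (the path is a placeholder, and the binary needs debug symbols); a backtrace from gdb on a core file remains the more reliable option.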