Re: [HACKERS] logical replication launcher crash on buildfarm - Mailing list pgsql-hackers
From | Petr Jelinek |
---|---|
Subject | Re: [HACKERS] logical replication launcher crash on buildfarm |
Date | |
Msg-id | 368e64f6-ee9a-f09f-82b4-a33b61b28d36@2ndquadrant.com |
In response to | Re: [HACKERS] logical replication launcher crash on buildfarm (Andres Freund <andres@anarazel.de>) |
Responses |
Re: [HACKERS] logical replication launcher crash on buildfarm
Re: logical replication launcher crash on buildfarm |
List | pgsql-hackers |
On 16/03/17 09:53, Andres Freund wrote:
> On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote:
>> On 16/03/17 04:42, Andres Freund wrote:
>>> On 2017-03-15 20:28:33 -0700, Andres Freund wrote:
>>>> Hi,
>>>>
>>>> I just unstuck a bunch of my buildfarm animals. That triggered some
>>>> spurious failures (on piculet, calliphoridae, mylodon), but also one
>>>> that doesn't really look like that:
>>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03
>>>>
>>>> with the pertinent point being:
>>>>
>>>> ================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
>>>> [New LWP 1894]
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>>> Core was generated by `postgres: bgworker: logical replication launcher '.
>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>> #0 0x000055e265bff5e3 in ?? ()
>>>> #0 0x000055e265bff5e3 in ?? ()
>>>> #1 0x000055d3ccabed0d in StartBackgroundWorker () at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792
>>>> #2 0x000055d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878
>>>> #3 0x000055d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205
>>>>
>>>> it's possible that me killing things and upgrading caused this, but
>>>> given this is a backend running EXEC_BACKEND, I'm a bit suspicous that
>>>> it's more than that. The machine is a bit backed up at the moment, so
>>>> it'll probably be a while till it's at that animal/branch again,
>>>> otherwise I'd not have mentioned this.
>>>
>>> For some reason it ran again pretty soon. And I'm afraid it's indeed an
>>> issue:
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02
>>>
>>
>> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it
>> seems to work fine on my two machines. I don't see anything else
>> different on culicidae though. Sadly the backtrace is not that
>> informative either. I'll try to investigate more but it will take time...
>
> Worthwhile additional failure:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A55%3A01
>
> Same animal, also EXEC_BACKEND, but 9.6.
>
> A quick look at the relevant line:
>     /*
>      * If bgw_main is set, we use that value as the initial entrypoint.
>      * However, if the library containing the entrypoint wasn't loaded at
>      * postmaster startup time, passing it as a direct function pointer is not
>      * possible. To work around that, we allow callers for whom a function
>      * pointer is not available to pass a library name (which will be loaded,
>      * if necessary) and a function name (which will be looked up in the named
>      * library).
>      */
>     if (worker->bgw_main != NULL)
>         entrypt = worker->bgw_main;
>
> makes the issue clear - we appear to be assuming that bgw_main is
> meaningful across processes. Which it isn't in the EXEC_BACKEND case
> when ASLR is in use...
>
> This kinda sounds familiar, but a quick google search doesn't find
> anything relevant.

Hmm now that you mention it, I remember discussing something similar
with you last year in Dallas in regards to parallel query. IIRC Windows
should not have this problem but other systems with EXEC_BACKEND do.
Don't remember the details though.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
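To illustrate the problem discussed above: a raw function address stored by one process is only meaningful in another process if both map the code at the same place, which a re-exec'd binary under ASLR does not guarantee. The quoted comment already describes the alternative of passing names and resolving them in the worker. The following is a minimal sketch of that idea, not PostgreSQL code: `WorkerSlot`, `register_worker`, `resolve_entrypoint`, `known_entrypoints` and the two stand-in entry points are all hypothetical names invented for this example, and a static table stands in for a real symbol lookup in a loaded library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry point type; not PostgreSQL's actual typedef. */
typedef int (*worker_main_fn)(int arg);

/* Two stand-in worker entry points (illustrative only). */
static int launcher_main(int arg) { return arg + 1; }
static int apply_main(int arg) { return arg * 2; }

/*
 * Registration record as it might live in shared memory. It carries only
 * the entry point's name, never its address, because an address taken in
 * the registering process need not be valid in a freshly exec'd child.
 */
typedef struct WorkerSlot
{
    char function_name[64];
} WorkerSlot;

/* Name-to-function table that each process builds from its own code. */
static const struct
{
    const char    *name;
    worker_main_fn fn;
} known_entrypoints[] = {
    {"launcher_main", launcher_main},
    {"apply_main", apply_main},
};

/* Registering side: record the name, truncating safely if it is too long. */
static void
register_worker(WorkerSlot *slot, const char *function_name)
{
    assert(slot != NULL && function_name != NULL);
    strncpy(slot->function_name, function_name,
            sizeof(slot->function_name) - 1);
    slot->function_name[sizeof(slot->function_name) - 1] = '\0';
}

/*
 * Worker side: turn the name back into an address that is valid in the
 * current process. Returns NULL if the name is unknown.
 */
static worker_main_fn
resolve_entrypoint(const WorkerSlot *slot)
{
    size_t i;

    assert(slot != NULL);
    for (i = 0; i < sizeof(known_entrypoints) / sizeof(known_entrypoints[0]); i++)
    {
        if (strcmp(known_entrypoints[i].name, slot->function_name) == 0)
            return known_entrypoints[i].fn;
    }
    return NULL;
}
```

In this sketch the registering process would call `register_worker(&slot, "launcher_main")`, and the started worker would call `resolve_entrypoint(&slot)` and check for NULL before invoking the result. Because the lookup happens in the worker's own address space, it does not matter where the executable was mapped.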