Re: [HACKERS] Unportable implementation of background worker start - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] Unportable implementation of background worker start |
Date | |
Msg-id | 9205.1492833041@sss.pgh.pa.us Whole thread Raw |
In response to | Re: [HACKERS] Unportable implementation of background worker start (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Unportable implementation of background worker start
Re: [HACKERS] Unportable implementation of background worker start |
List | pgsql-hackers |
I wrote: > Attached is a lightly-tested draft patch that converts the postmaster to > use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions > about whether this is the direction to proceed, though. Attached are a couple of patches that represent a plausible Plan B. The first one changes the postmaster to run its signal handlers without specifying SA_RESTART. I've confirmed that that seems to fix the select_parallel-test-takes-a-long-time problem on gaur/pademelon. The second one uses pselect, if available, to replace the unblock-signals/ select()/block-signals dance in ServerLoop. On platforms where pselect exists and works properly, that should fix the race condition I described previously. On platforms where it doesn't, we're no worse off than before. As mentioned in the comments for the second patch, even if we don't have working pselect(), the only problem is that ServerLoop's response to an interrupt might be delayed by as much as the up-to-1-minute timeout. The only existing case where that's really bad is launching multiple bgworkers. I would therefore advocate also changing maybe_start_bgworker to start up to N bgworkers per call, where N is large enough to pretty much always satisfy simultaneously-arriving requests. I'd pick 100 or so, but am willing to negotiate. I think that these patches represent something we could back-patch without a lot of trepidation, unlike the WaitEventSet-based approach. Therefore, my proposal is to apply and backpatch these changes, and call it good for v10. For v11, we could work on changing the postmaster to not do work in signal handlers, as discussed upthread. That would supersede these two patches completely, though I'd still advocate for keeping the change in maybe_start_bgworker. Note: for testing purposes, these patches are quite independent; just ignore the hunk in the second patch that changes a comment added by the first one. regards, tom lane diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index c382345..52b5e2c 100644 *** a/src/backend/postmaster/postmaster.c --- b/src/backend/postmaster/postmaster.c *************** PostmasterMain(int argc, char *argv[]) *** 610,615 **** --- 610,624 ---- /* * Set up signal handlers for the postmaster process. * + * In the postmaster, we want to install non-ignored handlers *without* + * SA_RESTART. This is because they'll be blocked at all times except + * when ServerLoop is waiting for something to happen, and during that + * window, we want signals to exit the select(2) wait so that ServerLoop + * can respond if anything interesting happened. On some platforms, + * signals marked SA_RESTART would not cause the select() wait to end. + * Child processes will generally want SA_RESTART, but we expect them to + * set up their own handlers before unblocking signals. + * * CAUTION: when changing this list, check for side-effects on the signal * handling setup of child processes. See tcop/postgres.c, * bootstrap/bootstrap.c, postmaster/bgwriter.c, postmaster/walwriter.c, *************** PostmasterMain(int argc, char *argv[]) *** 620,635 **** pqinitmask(); PG_SETMASK(&BlockSig); ! pqsignal(SIGHUP, SIGHUP_handler); /* reread config file and have ! * children do same */ ! pqsignal(SIGINT, pmdie); /* send SIGTERM and shut down */ ! pqsignal(SIGQUIT, pmdie); /* send SIGQUIT and die */ ! pqsignal(SIGTERM, pmdie); /* wait for children and shut down */ pqsignal(SIGALRM, SIG_IGN); /* ignored */ pqsignal(SIGPIPE, SIG_IGN); /* ignored */ ! pqsignal(SIGUSR1, sigusr1_handler); /* message from child process */ ! pqsignal(SIGUSR2, dummy_handler); /* unused, reserve for children */ ! pqsignal(SIGCHLD, reaper); /* handle child termination */ pqsignal(SIGTTIN, SIG_IGN); /* ignored */ pqsignal(SIGTTOU, SIG_IGN); /* ignored */ /* ignore SIGXFSZ, so that ulimit violations work like disk full */ --- 629,648 ---- pqinitmask(); PG_SETMASK(&BlockSig); ! pqsignal_no_restart(SIGHUP, SIGHUP_handler); /* reread config file ! * and have children do ! * same */ ! pqsignal_no_restart(SIGINT, pmdie); /* send SIGTERM and shut down */ ! pqsignal_no_restart(SIGQUIT, pmdie); /* send SIGQUIT and die */ ! pqsignal_no_restart(SIGTERM, pmdie); /* wait for children and shut ! * down */ pqsignal(SIGALRM, SIG_IGN); /* ignored */ pqsignal(SIGPIPE, SIG_IGN); /* ignored */ ! pqsignal_no_restart(SIGUSR1, sigusr1_handler); /* message from child ! * process */ ! pqsignal_no_restart(SIGUSR2, dummy_handler); /* unused, reserve for ! * children */ ! pqsignal_no_restart(SIGCHLD, reaper); /* handle child termination */ pqsignal(SIGTTIN, SIG_IGN); /* ignored */ pqsignal(SIGTTOU, SIG_IGN); /* ignored */ /* ignore SIGXFSZ, so that ulimit violations work like disk full */ diff --git a/src/include/port.h b/src/include/port.h index c6937e5..52910ed 100644 *** a/src/include/port.h --- b/src/include/port.h *************** extern int pg_mkdir_p(char *path, int om *** 469,474 **** --- 469,479 ---- /* port/pqsignal.c */ typedef void (*pqsigfunc) (int signo); extern pqsigfunc pqsignal(int signo, pqsigfunc func); + #ifndef WIN32 + extern pqsigfunc pqsignal_no_restart(int signo, pqsigfunc func); + #else + #define pqsignal_no_restart(signo, func) pqsignal(signo, func) + #endif /* port/quotes.c */ extern char *escape_single_quotes_ascii(const char *src); diff --git a/src/port/pqsignal.c b/src/port/pqsignal.c index 5d366da..e7e4451 100644 *** a/src/port/pqsignal.c --- b/src/port/pqsignal.c *************** *** 32,38 **** #if !defined(WIN32) || defined(FRONTEND) /* ! * Set up a signal handler for signal "signo" * * Returns the previous handler. */ --- 32,38 ---- #if !defined(WIN32) || defined(FRONTEND) /* ! * Set up a signal handler, with SA_RESTART, for signal "signo" * * Returns the previous handler. */ *************** pqsignal(int signo, pqsigfunc func) *** 58,61 **** --- 58,90 ---- #endif } + /* + * Set up a signal handler, without SA_RESTART, for signal "signo" + * + * Returns the previous handler. + * + * On Windows, this would be identical to pqsignal(), so don't bother. + */ + #ifndef WIN32 + + pqsigfunc + pqsignal_no_restart(int signo, pqsigfunc func) + { + struct sigaction act, + oact; + + act.sa_handler = func; + sigemptyset(&act.sa_mask); + act.sa_flags = 0; + #ifdef SA_NOCLDSTOP + if (signo == SIGCHLD) + act.sa_flags |= SA_NOCLDSTOP; + #endif + if (sigaction(signo, &act, &oact) < 0) + return SIG_ERR; + return oact.sa_handler; + } + + #endif /* !WIN32 */ + #endif /* !defined(WIN32) || defined(FRONTEND) */ diff --git a/configure b/configure index 99d05bf..42ff097 100755 *** a/configure --- b/configure *************** fi *** 12892,12898 **** LIBS_including_readline="$LIBS" LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'` ! for ac_func in cbrt clock_gettime dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_npreadlink setproctitle setsid shm_open symlink sync_file_range towlower utime utimes wcstombs wcstombs_l do : as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh` ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var" --- 12892,12898 ---- LIBS_including_readline="$LIBS" LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'` ! for ac_func in cbrt clock_gettime dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pselect pstatpthread_is_threaded_np readlink setproctitle setsid shm_open symlink sync_file_range towlower utime utimes wcstombswcstombs_l do : as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh` ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var" diff --git a/configure.in b/configure.in index c36c503..871edd8 100644 *** a/configure.in --- b/configure.in *************** PGAC_FUNC_WCSTOMBS_L *** 1429,1435 **** LIBS_including_readline="$LIBS" LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'` ! AC_CHECK_FUNCS([cbrt clock_gettime dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_npreadlink setproctitle setsid shm_open symlink sync_file_range towlower utime utimes wcstombs wcstombs_l]) AC_REPLACE_FUNCS(fseeko) case $host_os in --- 1429,1435 ---- LIBS_including_readline="$LIBS" LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'` ! AC_CHECK_FUNCS([cbrt clock_gettime dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pselect pstatpthread_is_threaded_np readlink setproctitle setsid shm_open symlink sync_file_range towlower utime utimes wcstombswcstombs_l]) AC_REPLACE_FUNCS(fseeko) case $host_os in diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 52b5e2c..382e6a8 100644 *** a/src/backend/postmaster/postmaster.c --- b/src/backend/postmaster/postmaster.c *************** PostmasterMain(int argc, char *argv[]) *** 613,621 **** * In the postmaster, we want to install non-ignored handlers *without* * SA_RESTART. This is because they'll be blocked at all times except * when ServerLoop is waiting for something to happen, and during that ! * window, we want signals to exit the select(2) wait so that ServerLoop * can respond if anything interesting happened. On some platforms, ! * signals marked SA_RESTART would not cause the select() wait to end. * Child processes will generally want SA_RESTART, but we expect them to * set up their own handlers before unblocking signals. * --- 613,621 ---- * In the postmaster, we want to install non-ignored handlers *without* * SA_RESTART. This is because they'll be blocked at all times except * when ServerLoop is waiting for something to happen, and during that ! * window, we want signals to exit the pselect(2) wait so that ServerLoop * can respond if anything interesting happened. On some platforms, ! * signals marked SA_RESTART would not cause the pselect() wait to end. * Child processes will generally want SA_RESTART, but we expect them to * set up their own handlers before unblocking signals. * *************** ServerLoop(void) *** 1669,1674 **** --- 1669,1676 ---- for (;;) { fd_set rmask; + fd_set *rmask_p; + struct timeval timeout; int selres; time_t now; *************** ServerLoop(void) *** 1678,1714 **** * We block all signals except while sleeping. That makes it safe for * signal handlers, which again block all signals while executing, to * do nontrivial work. - * - * If we are in PM_WAIT_DEAD_END state, then we don't want to accept - * any new connections, so we don't call select(), and just sleep. */ - memcpy((char *) &rmask, (char *) &readmask, sizeof(fd_set)); - if (pmState == PM_WAIT_DEAD_END) { ! PG_SETMASK(&UnBlockSig); ! ! pg_usleep(100000L); /* 100 msec seems reasonable */ ! selres = 0; ! ! PG_SETMASK(&BlockSig); } else { ! /* must set timeout each time; some OSes change it! */ ! struct timeval timeout; /* Needs to run with blocked signals! */ DetermineSleepTime(&timeout); PG_SETMASK(&UnBlockSig); ! selres = select(nSockets, &rmask, NULL, NULL, &timeout); PG_SETMASK(&BlockSig); } ! /* Now check the select() result */ if (selres < 0) { if (errno != EINTR && errno != EWOULDBLOCK) --- 1680,1743 ---- * We block all signals except while sleeping. That makes it safe for * signal handlers, which again block all signals while executing, to * do nontrivial work. */ if (pmState == PM_WAIT_DEAD_END) { ! /* ! * If we are in PM_WAIT_DEAD_END state, then we don't want to ! * accept any new connections, so pass a null rmask. ! */ ! rmask_p = NULL; ! timeout.tv_sec = 0; ! timeout.tv_usec = 100000; /* 100 msec seems reasonable */ } else { ! /* Normal case: check sockets, and compute a suitable timeout */ ! memcpy(&rmask, &readmask, sizeof(fd_set)); ! rmask_p = &rmask; /* Needs to run with blocked signals! */ DetermineSleepTime(&timeout); + } + /* + * We prefer to wait with pselect(2) if available, as using that, + * together with *not* using SA_RESTART for signals, guarantees that + * we will get kicked off the wait if a signal occurs. + * + * If we lack pselect(2), fake it with select(2). This has a race + * condition: a signal that was already pending will be delivered + * before we reach the select(), and therefore the select() will wait, + * even though we might wish to do something in response. Therefore, + * beware of putting any time-critical signal response logic into + * ServerLoop rather than into the signal handler itself. It will run + * eventually, but maybe not till after a timeout delay. + * + * It is rumored that some implementations of pselect() are not + * atomic, making the first alternative here functionally equivalent + * to the second. Not much we can do about that though. + */ + { + #ifdef HAVE_PSELECT + /* pselect uses a randomly different timeout API, sigh */ + struct timespec ptimeout; + + ptimeout.tv_sec = timeout.tv_sec; + ptimeout.tv_nsec = timeout.tv_usec * 1000; + + selres = pselect(nSockets, rmask_p, NULL, NULL, + &ptimeout, &UnBlockSig); + #else PG_SETMASK(&UnBlockSig); ! selres = select(nSockets, rmask_p, NULL, NULL, &timeout); PG_SETMASK(&BlockSig); + #endif } ! /* Now check the select()/pselect() result */ if (selres < 0) { if (errno != EINTR && errno != EWOULDBLOCK) diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index 03e9803..9bbf1b1 100644 *** a/src/include/pg_config.h.in --- b/src/include/pg_config.h.in *************** *** 400,405 **** --- 400,408 ---- /* Define to 1 if the assembler supports PPC's LWARX mutex hint bit. */ #undef HAVE_PPC_LWARX_MUTEX_HINT + /* Define to 1 if you have the `pselect' function. */ + #undef HAVE_PSELECT + /* Define to 1 if you have the `pstat' function. */ #undef HAVE_PSTAT diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32 index f9588b0..5d36768 100644 *** a/src/include/pg_config.h.win32 --- b/src/include/pg_config.h.win32 *************** *** 267,272 **** --- 267,275 ---- /* Define to 1 if you have the <poll.h> header file. */ /* #undef HAVE_POLL_H */ + /* Define to 1 if you have the `pselect' function. */ + /* #undef HAVE_PSELECT */ + /* Define to 1 if you have the `pstat' function. */ /* #undef HAVE_PSTAT */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: