Re: [HACKERS] parallel.c oblivion of worker-startup failures - Mailing list pgsql-hackers
From | Robert Haas
---|---
Subject | Re: [HACKERS] parallel.c oblivion of worker-startup failures
Date |
Msg-id | CA+TgmoYnBgXgdTu6wk5YPdWhmgabYc9nY_pFLq=tB=FSLYkD8Q@mail.gmail.com
In response to | Re: [HACKERS] parallel.c oblivion of worker-startup failures (Amit Kapila <amit.kapila16@gmail.com>)
Responses | Re: [HACKERS] parallel.c oblivion of worker-startup failures
List | pgsql-hackers
On Mon, Dec 11, 2017 at 10:41 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> In particular, if the worker crashed (say somebody forcefully killed
> the worker), then I think it will lead to a restart of the server. So
> not sure, adding anything about that in the comment will make much
> sense.

I think it's dangerous to rely on that assumption. If somebody stuck
proc_exit(1) into the very beginning of the worker startup sequence,
the worker would exit but the server would not restart.

As I started experimenting with this, I discovered that your patch is
actually unsafe. Here's a test case that shows the breakage:

    set force_parallel_mode = on;
    select 1;

The expected outcome is:

     ?column?
    ----------
            1
    (1 row)

If I hack do_start_bgworker() to make worker startup fail, like this:

    --- a/src/backend/postmaster/postmaster.c
    +++ b/src/backend/postmaster/postmaster.c
    @@ -5701,7 +5701,7 @@ do_start_bgworker(RegisteredBgWorker *rw)
     #ifdef EXEC_BACKEND
        switch ((worker_pid = bgworker_forkexec(rw->rw_shmem_slot)))
     #else
    -   switch ((worker_pid = fork_process()))
    +   switch ((worker_pid = -1))
     #endif
        {
            case -1:

...then it hangs forever. With your patch it no longer hangs forever,
which is good, but it also returns the wrong answer, which is bad:

     ?column?
    ----------
    (0 rows)

The problem is that Gather sees nworkers_launched > 0 and assumes that
it is therefore entitled not to perform a local scan. I'm pretty sure
that there could also be cases where this patch makes us miss an error
reported by a worker; suppose the worker error is reported and the
worker exits after WaitForParallelWorkersToFinish does
CHECK_FOR_INTERRUPTS() but before we reach GetBackgroundWorkerPid().
The code will simply conclude that the worker is dead and stop
waiting, never mind the fact that there's a pending error in the
queue.
This is exactly why I made parallel workers send a Terminate message
to the leader instead of just disappearing -- the leader needs to see
that message to know that it's reached the end of what the worker is
going to send.

So, I think we need to regard fork failure and related failure modes
as ERROR conditions. It's not equivalent to failing to register the
background worker. When we fail to register the background worker, the
parallel.c machinery knows at LaunchParallelWorkers() time that the
worker will never show up and lets the calling code know by setting
nworkers_launched, and the calling code can then choose to proceed
with that number of workers or ERROR out as it chooses. But once we
decide on the value of nworkers_launched to report to callers, it's
really not OK for workers to just silently not ever start, as the test
case above shows.

We could make it the job of code that uses the ParallelContext
machinery to cope with that situation, but I think that's a bad plan,
because every user would have to worry about this obscure case. Right
now, the only in-core use of the ParallelContext machinery is
Gather/Gather Merge, but there are pending patches for parallel index
scan and parallel VACUUM, so we'll just end up adding this handling to
more and more places. And for what? If you've got max_connections
and/or max_parallel_workers set so high that your system can't fork(),
you have a big problem, and forcing things that would have run using
parallel query to run serially instead of erroring out is not going to
fix it. I think that deciding we're going to handle fork() failure by
reporting an ERROR is just fine.

So, here's a patch. This patch keeps track of whether we've ever
received a message from each worker via the error queue. If there are
any remaining workers which have neither exited cleanly nor sent us at
least one message, then it calls GetBackgroundWorkerPid() for each
one.
If any worker reports that it is BGWH_STOPPED, then it double-checks
whether the worker has attached to the error queue. If it did, then it
assumes that we'll detect clean shutdown or an error on the next
iteration and ignores the problem. If not, then the worker died
without ever attaching to the error queue, so it throws an ERROR.

I tested that this causes the test case above to throw a proper ERROR
when fork_process() returns -1. Please review...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment