Re: ExecGather() + nworkers - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: ExecGather() + nworkers
Date:
Msg-id: CA+Tgmoajv7C8vvOajtsBuesP0ge07LFEY4vz+4ciRCd5FCdnGg@mail.gmail.com
In response to: ExecGather() + nworkers (Peter Geoghegan <pg@heroku.com>)
Responses: Re: ExecGather() + nworkers
List: pgsql-hackers
On Sun, Jan 10, 2016 at 12:29 AM, Peter Geoghegan <pg@heroku.com> wrote:
> The Gather node executor function ExecGather() does this:
>
> [ code ]
>
> I'm not sure why the test for nworkers following the
> LaunchParallelWorkers() call doesn't look like this, though:
>
> /* Set up tuple queue readers to read the results. */
> if (pcxt->nworkers_launched > 0)
> {
>     ...
> }

Hmm, yeah, I guess it could do that.

> But going to this additional trouble (detecting no workers launched on
> the basis of !nworkers_launched) suggests that simply testing
> nworkers_launched would be wrong, which AFAICT it isn't. Can't we just
> do that, and in so doing also totally remove the "for" loop shown
> here?

I don't see how the for loop goes away.

> In the case of parallel sequential scan, it looks like one worker can
> be helpful, because then the gather node (leader process) can run the
> plan itself to some degree, and so there are effectively 2 processes
> scanning at a minimum (unless 0 workers could be allocated to begin
> with). How useful is it to have a parallel scan when this happens,
> though?

Empirically, that's really quite useful. When you have 3 or 4 workers,
the leader really doesn't make a significant contribution to the work,
but what I've seen in my testing is that 1 worker often runs almost
twice as fast as 0 workers.

> I guess it isn't obvious to me how to reliably back out of not being
> able to launch at least 2 workers in the case of my parallel index
> build patch, because I suspect 2 workers (plus the leader process) are
> the minimum number that will make index builds faster. Right now, it
> looks like I'll have to check nworkers_launched in the leader (which
> will be the only process to access the ParallelContext, since it's in
> its local memory). Then, having established that there are at least
> the minimum useful number of worker processes launched for sorting,
> the leader can "permit" worker processes to "really" start based on
> changing some state in the TOC/segment in common use. Otherwise, the
> leader must call the whole thing off and do a conventional, serial
> index build, even though technically the main worker process function
> has started execution in worker processes.

I don't really understand why this should be so. I thought the idea of
parallel sort is (roughly) that each worker should read data until it
fills work_mem, sort that data, and write a tape. Repeat until no data
remains. Then, merge the tapes. I don't see any reason at all why this
shouldn't work just fine with a leader and 1 worker.
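To make the shape of that concrete, here is a minimal, standalone sketch of
the run-then-merge pattern described above, using only the C standard library
(integer keys, temporary files standing in for tapes, and a two-way merge).
It is not PostgreSQL's tuplesort code; every function and constant name is
invented for the illustration. Each build_run() call corresponds to what one
worker would do with its share of the input, and the final merge is the
leader's job.

/*
 * Illustration only: "fill a memory budget, sort, spill a sorted run,
 * repeat, then merge the runs".  Plain C, not PostgreSQL code.
 */
#include <stdio.h>
#include <stdlib.h>

#define WORK_MEM_KEYS 4         /* pretend work_mem holds only 4 keys */

static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Read up to WORK_MEM_KEYS keys, sort them, and write one sorted run. */
static FILE *
build_run(const int *input, int nkeys, int *consumed)
{
    int     buf[WORK_MEM_KEYS];
    int     n = (nkeys < WORK_MEM_KEYS) ? nkeys : WORK_MEM_KEYS;
    FILE   *run = tmpfile();

    if (run == NULL)
    {
        perror("tmpfile");
        exit(1);
    }
    for (int i = 0; i < n; i++)
        buf[i] = input[i];
    qsort(buf, n, sizeof(int), cmp_int);
    fwrite(buf, sizeof(int), n, run);
    rewind(run);
    *consumed = n;
    return run;
}

/* Two-way merge of sorted runs a and b onto stdout. */
static void
merge_runs(FILE *a, FILE *b)
{
    int     va, vb;
    int     have_a = fread(&va, sizeof(int), 1, a) == 1;
    int     have_b = fread(&vb, sizeof(int), 1, b) == 1;

    while (have_a || have_b)
    {
        if (have_a && (!have_b || va <= vb))
        {
            printf("%d\n", va);
            have_a = fread(&va, sizeof(int), 1, a) == 1;
        }
        else
        {
            printf("%d\n", vb);
            have_b = fread(&vb, sizeof(int), 1, b) == 1;
        }
    }
}

int
main(void)
{
    int     data[] = {42, 7, 19, 3, 88, 1, 56, 23};
    int     total = sizeof(data) / sizeof(data[0]);
    int     used1, used2;

    /* Each of these calls is what one worker would do with its chunk. */
    FILE   *run1 = build_run(data, total, &used1);
    FILE   *run2 = build_run(data + used1, total - used1, &used2);

    /* The leader then merges the runs into the final sorted output. */
    merge_runs(run1, run2);
    return 0;
}

Nothing in this loop depends on how many runs exist, which is why the
approach works the same with one worker or several.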
> I think what might be better is a general solution to my problem,
> which I imagine will crop up again and again as new clients are added.
> I would like an API that lets callers of LaunchParallelWorkers() only
> actually launch *any* worker on the basis of having been able to
> launch some minimum sensible number (typically 2). Otherwise, indicate
> failure, allowing callers to call the whole thing off in a general
> way, without the postmaster having actually launched anything, and
> without custom "call it all off" code for parallel index builds. This
> would probably involve introducing a distinction between a
> BackgroundWorkerSlot being "reserved" rather than "in_use", lest the
> postmaster accidentally launch 1 worker process before we established
> definitively that launching any is really a good idea.

I think that's probably over-engineered. I mean, it wouldn't be that
hard to have the workers just exit if you decide you don't want them,
and I don't really want to make the signaling here more complicated
than it really needs to be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
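The "just have the workers exit" idea needs very little signaling: the leader
publishes a go/no-go decision in shared state once it knows how many workers
were launched, and each worker checks that decision before doing any real
work. The standalone sketch below is only an analogy, with threads and a C11
atomic standing in for background workers and the shared memory segment; none
of the names (go_ahead, MIN_USEFUL_WORKERS, worker_main) come from PostgreSQL.

/*
 * Analogy for "let the workers start, but have them exit immediately if
 * the leader decides it doesn't want them".  Threads stand in for
 * background workers; an atomic flag stands in for state in the shared
 * segment.  Build with: cc -std=c11 -pthread example.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

#define MIN_USEFUL_WORKERS 2

static atomic_int go_ahead = 0; /* 0 = undecided, 1 = run, -1 = abort */

static void *
worker_main(void *arg)
{
    int     id = (int) (long) arg;
    int     decision;

    /* Wait for the leader's go/no-go decision. */
    while ((decision = atomic_load(&go_ahead)) == 0)
        usleep(1000);

    if (decision < 0)
    {
        printf("worker %d: not wanted, exiting immediately\n", id);
        return NULL;
    }

    printf("worker %d: doing real work\n", id);
    return NULL;
}

int
main(void)
{
    pthread_t   workers[3];
    int         nworkers_launched = 0;

    /* "Launch" workers; in the real case some launches could fail. */
    for (int i = 0; i < 3; i++)
    {
        if (pthread_create(&workers[i], NULL, worker_main,
                           (void *) (long) i) != 0)
            break;              /* stop at the first failed launch */
        nworkers_launched++;
    }

    /* Leader decides whether the parallel plan is worth pursuing. */
    atomic_store(&go_ahead,
                 nworkers_launched >= MIN_USEFUL_WORKERS ? 1 : -1);

    for (int i = 0; i < nworkers_launched; i++)
        pthread_join(workers[i], NULL);
    return 0;
}

The point of the analogy is that the decision is a single write by the leader
and a single read by each worker, which is about as simple as the signaling
can get.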