Re: Parallel Foreign Scans - need advice - Mailing list pgsql-hackers

From Korry Douglas
Subject Re: Parallel Foreign Scans - need advice
Date
Msg-id 598BD6D6-B5CF-450D-A0D1-5886602FF0AA@me.com
In response to Re: Parallel Foreign Scans - need advice  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Parallel Foreign Scans - need advice
List pgsql-hackers
> That's only a superficial problem.  You don't even know if or when the
> workers that are launched will all finish up running your particular
> node, because (for example) they might be sent to different children
> of a Parallel Append node above you (AFAICS there is no way for a
> participant to indicate "I've finished all the work allocated to me,
> but I happen to know that some other worker #3 is needed here" -- as
> soon as any participant reports that it has executed the plan to
> completion, pa_finished[] will prevent new workers from picking that
> node to execute).  Suppose we made a rule that *every* worker must
> visit *every* partial child of a Parallel Append and run it to
> completion (and any similar node in the future must do the same)...
> then I think there is still a higher level design problem: if you do
> allocate work up front rather than on demand, then work could be
> unevenly distributed, and parallel query would be weakened.

What I really need (for the scheme I’m using at the moment) is to know how many workers will be used to execute my
particular Plan.  I understand that some workers will naturally end up idle while the last (busy) worker finishes up.
I’m dividing the workload (the number of row groups to scan) by the number of workers to get an even distribution.

I’m willing to pay that price (at least, I haven’t seen a problem so far… famous last words).

I do plan to switch over to a get-next-chunk allocator as you mentioned below, but I’d like to get the minimized-seek
mechanism working first.

It sounds like there is no reliable way to get the information that I’m looking for, is that right?

> So I think you ideally need a simple get-next-chunk work allocator
> (like Parallel Seq Scan and like the file_fdw patch I posted[1]), or a
> pass-the-baton work allocator when there is a dependency between
> chunks (like Parallel Index Scan for btrees), or a more complicated
> multi-phase system that counts participants arriving and joining in
> (like Parallel Hash) so that participants can coordinate and wait for
> each other in controlled circumstances.

I haven’t looked at Parallel Hash - will try to understand that next.

> If this compressed data doesn't have natural chunks designed for this
> purpose (like, say, ORC stripes), perhaps you could have a dedicated
> workers streaming data (compressed? decompressed?) into shared memory,
> and parallel query participants coordinating to consume chunks of
> that?


I’ll give that some thought.  Thanks for the ideas.

                    — Korry



