Re: Parallelize correlated subqueries that execute within each worker - Mailing list pgsql-hackers
From | James Coleman |
---|---|
Subject | Re: Parallelize correlated subqueries that execute within each worker |
Date | |
Msg-id | CAAaqYe-_TObm5KwmZLYXBJ3BJGh4cUZWM0v1mY1gWTMkRNQXDQ@mail.gmail.com Whole thread Raw |
In response to | Re: Parallelize correlated subqueries that execute within each worker (Richard Guo <guofenglinux@gmail.com>) |
Responses |
Re: Parallelize correlated subqueries that execute within each worker
Re: Parallelize correlated subqueries that execute within each worker |
List | pgsql-hackers |
On Tue, Jun 6, 2023 at 4:36 AM Richard Guo <guofenglinux@gmail.com> wrote: > > > On Mon, Jan 23, 2023 at 10:00 PM James Coleman <jtc331@gmail.com> wrote: >> >> Which this patch we do in fact now see (as expected) rels with >> non-empty lateral_relids showing up in generate_[useful_]gather_paths. >> And the partial paths can now have non-empty required outer rels. I'm >> not able to come up with a plan that would actually be caught by those >> checks; I theorize that because of the few places we actually call >> generate_[useful_]gather_paths we are in practice already excluding >> those, but for now I've left these as a conditional rather than an >> assertion because it seems like the kind of guard we'd want to ensure >> those methods are safe. > > > I'm trying to understand this part. AFAICS we will not create partial > paths for a rel, base or join, if it has lateral references. So it > seems to me that in generate_[useful_]gather_paths after we've checked > that there are partial paths, the checks for lateral_relids are not > necessary because lateral_relids should always be empty in this case. > Maybe I'm missing something. At first I was thinking "isn't the point of the patch to generate partial paths for rels with lateral references" given what I'd written back in January, but I added "Assert(bms_is_empty(required_outer));" to both of those functions and the assertion never fails running the tests (including my newly parallelizable queries). I'm almost positive I'd checked this back in January (not only had I'd explicitly written that I'd confirmed we had non-empty lateral_relids there, but also it was the entire based of the alternate approach to the patch), but...I can't go back to 5 months ago and remember what I'd done. Ah! Your comment about "after we've checked that there are partial paths" triggered a thought. I think originally I'd had the "bms_is_subset(required_outer, rel->relids)" check first in these functions. And indeed if I run the tests with that the assertion moved to above the partial paths check, I get failures in generate_useful_gather_paths specifically. Mystery solved! > And while trying the v9 patch I came across a crash with the query > below. > > set min_parallel_table_scan_size to 0; > set parallel_setup_cost to 0; > set parallel_tuple_cost to 0; > > explain (costs off) > select * from pg_description t1 where objoid in > (select objoid from pg_description t2 where t2.description = t1.description); > QUERY PLAN > -------------------------------------------------------- > Seq Scan on pg_description t1 > Filter: (SubPlan 1) > SubPlan 1 > -> Gather > Workers Planned: 2 > -> Parallel Seq Scan on pg_description t2 > Filter: (description = t1.description) > (7 rows) > > select * from pg_description t1 where objoid in > (select objoid from pg_description t2 where t2.description = t1.description); > WARNING: terminating connection because of crash of another server process > > Seems something is wrong when extracting the argument from the Param in > parallel worker. With what I'm trying to change I don't think this plan should ever be generated since it means we'd have to pass a param from the outer seq scan into the parallel subplan, which we can't do (currently). I've attached the full backtrace to the email, but as you hinted at the parallel worker is trying to get the param (in this case detoasting it), but the param doesn't exist on the worker, so it seg faults. Looking at this further I think there's an existing test case that exposes the misplanning here (the one right under the comment "Parallel Append is not to be used when the subpath depends on the outer param" in select_parallel.sql), but it doesn't seg fault because the param is an integer, doesn't need to be detoasted, and therefore (I think) we skate by (but probably with wrong results in depending on the dataset). Interestingly this is one of the existing test queries my original patch approach didn't change, so this gives me something specific to work with improving the path. Thanks for testing this and bringing this to my attention! BTW are you by any chance testing on ARM macOS? I reproduced the issue there, but for some reason I did not reproduce the error (and the plan wasn't parallelized) when I tested this on linux. Perhaps I missed setting something up; it seems odd. > BTW another rebase is needed as it no longer applies to HEAD. Apologies; I'd rebased, but hadn't updated the thread. See attached for an updated series (albeit still broken on your test query). Thanks, James
Attachment
pgsql-hackers by date: