Thread: print_path is missing GatherMerge and CustomScan support
Hi, While debugging planner I realized that print_path() function is not aware of both GatherMerge path and CustomScan path. Attached small patch fixes it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
On Wed, Jul 18, 2018 at 02:35:23PM +0900, Masahiko Sawada wrote: > Hi, > > While debugging planner I realized that print_path() function is not > aware of both GatherMerge path and CustomScan path. Attached small > patch fixes it. Good catch. Those should be backpatched. While I am looking at this stuff, I have noticed that pathnode.c/reparameterize_path_by_child uses T_MergeAppend and not T_MergeAppendPath. --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -3817,7 +3817,7 @@ do { \ } break; - case T_MergeAppend: + case T_MergeAppendPath: { MergeAppendPath *mapath This is new as of f49842d1 in v11. Robert, Ashutosh, am I missing something? -- Michael
Attachment
On Wed, Jul 18, 2018 at 11:52 AM, Michael Paquier <michael@paquier.xyz> wrote: > On Wed, Jul 18, 2018 at 02:35:23PM +0900, Masahiko Sawada wrote: >> Hi, >> >> While debugging planner I realized that print_path() function is not >> aware of both GatherMerge path and CustomScan path. Attached small >> patch fixes it. > > Good catch. Those should be backpatched. While I am looking at this > stuff, I have noticed that pathnode.c/reparameterize_path_by_child uses > T_MergeAppend and not T_MergeAppendPath. > > --- a/src/backend/optimizer/util/pathnode.c > +++ b/src/backend/optimizer/util/pathnode.c > @@ -3817,7 +3817,7 @@ do { \ > } > break; > > - case T_MergeAppend: > + case T_MergeAppendPath: > { > MergeAppendPath *mapath > > This is new as of f49842d1 in v11. Yes that's right. Thanks for taking care of it. > Robert, Ashutosh, am I missing > something? You used my personal email id by mistake, I think. I have removed it and added by EDB email address. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On Wed, Jul 18, 2018 at 12:15:25PM +0530, Ashutosh Bapat wrote: > On Wed, Jul 18, 2018 at 11:52 AM, Michael Paquier <michael@paquier.xyz> wrote: >> On Wed, Jul 18, 2018 at 02:35:23PM +0900, Masahiko Sawada wrote: >>> Hi, >>> >>> While debugging planner I realized that print_path() function is not >>> aware of both GatherMerge path and CustomScan path. Attached small >>> patch fixes it. >> >> Good catch. Those should be backpatched. While I am looking at this >> stuff, I have noticed that pathnode.c/reparameterize_path_by_child uses >> T_MergeAppend and not T_MergeAppendPath. >> >> This is new as of f49842d1 in v11. > > Yes that's right. Thanks for taking care of it. Thanks for the confirmation. Robert, do you want to take care of this issue or should I? >> Robert, Ashutosh, am I missing >> something? > > You used my personal email id by mistake, I think. I have removed it > and added by EDB email address. My sincere apologies. I have not been careful. -- Michael
Attachment
On Wed, Jul 18, 2018 at 12:15:25PM +0530, Ashutosh Bapat wrote: > Yes that's right. Thanks for taking care of it. Okay, I have pushed a fix for this one as that's wrong and back-patched to v11. The coverage of reparameterize_path_by_child is actually quite poor if you look at the reports: https://coverage.postgresql.org/src/backend/optimizer/util/pathnode.c.gcov.html Could it be possible to stress that more? This way mistakes like this one could have been avoided from the start. -- Michael
Attachment
On Wed, Jul 18, 2018 at 03:22:02PM +0900, Michael Paquier wrote: > Good catch. Those should be backpatched. While I am looking at this > stuff, I have noticed that pathnode.c/reparameterize_path_by_child uses > T_MergeAppend and not T_MergeAppendPath. Okay, I have checked the full list of path nodes and the two ones you mentioned are the only missing. CustomPath has been added in 9.5, so this has been patched down to this version. GatherMergePath is new as of 10. The order of the items in print_path and nodes.h was a bit messed up as well which made unnecessarily harder to check the list, so I fixed the order at the same time to ease future lookups and back-patching effort. -- Michael
Attachment
On Thu, Jul 19, 2018 at 5:37 AM, Michael Paquier <michael@paquier.xyz> wrote: > On Wed, Jul 18, 2018 at 12:15:25PM +0530, Ashutosh Bapat wrote: >> Yes that's right. Thanks for taking care of it. > > Okay, I have pushed a fix for this one as that's wrong and > back-patched to v11. The coverage of reparameterize_path_by_child is > actually quite poor if you look at the reports: > https://coverage.postgresql.org/src/backend/optimizer/util/pathnode.c.gcov.html > > Could it be possible to stress that more? This way mistakes like this > one could have been avoided from the start. I had extensive testcases in my original patch-set to exercise that code but 1. that testset was too extensive; even today partition_join.sql is a separate testcase and it's quite large. 2. that function returns NULL rather than throwing an error, if it can not produce a parameterized path. So, unless we check whether each of those paths get created no test is useful and that can only be done through an EXPLAIN OUTPUT which means that the testcase becomes fragile. I fine if we want to add more tests just to cover the code if those are not as fragile and do not blow up partition_join too much. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On Thu, Jul 19, 2018 at 10:41:10AM +0530, Ashutosh Bapat wrote: > I had extensive testcases in my original patch-set to exercise that > code but 1. that testset was too extensive; even today > partition_join.sql is a separate testcase and it's quite large. 2. > that function returns NULL rather than throwing an error, if it can > not produce a parameterized path. So, unless we check whether each of > those paths get created no test is useful and that can only be done > through an EXPLAIN OUTPUT which means that the testcase becomes > fragile. I fine if we want to add more tests just to cover the code if > those are not as fragile and do not blow up partition_join too much. Did those really check the reparameterization of a MergeAppendPath for a child of the parent? I'd be surprised if this one was stressed as there was no match in the list. -- Michael
Attachment
On Thu, Jul 19, 2018 at 11:56 AM, Michael Paquier <michael@paquier.xyz> wrote: > On Thu, Jul 19, 2018 at 10:41:10AM +0530, Ashutosh Bapat wrote: >> I had extensive testcases in my original patch-set to exercise that >> code but 1. that testset was too extensive; even today >> partition_join.sql is a separate testcase and it's quite large. 2. >> that function returns NULL rather than throwing an error, if it can >> not produce a parameterized path. So, unless we check whether each of >> those paths get created no test is useful and that can only be done >> through an EXPLAIN OUTPUT which means that the testcase becomes >> fragile. I fine if we want to add more tests just to cover the code if >> those are not as fragile and do not blow up partition_join too much. > > Did those really check the reparameterization of a MergeAppendPath for a > child of the parent? I'd be surprised if this one was stressed as there > was no match in the list. I don't remember but given this evident, it was not. But there are some testcases in partition_join which test this code using LATERAL joins. But those do not cover the entire function. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On Thu, Jul 19, 2018 at 9:58 AM, Michael Paquier <michael@paquier.xyz> wrote: > On Wed, Jul 18, 2018 at 03:22:02PM +0900, Michael Paquier wrote: >> Good catch. Those should be backpatched. While I am looking at this >> stuff, I have noticed that pathnode.c/reparameterize_path_by_child uses >> T_MergeAppend and not T_MergeAppendPath. > > Okay, I have checked the full list of path nodes and the two ones you > mentioned are the only missing. CustomPath has been added in 9.5, so > this has been patched down to this version. GatherMergePath is new as > of 10. > > The order of the items in print_path and nodes.h was a bit messed up as > well which made unnecessarily harder to check the list, so I fixed the > order at the same time to ease future lookups and back-patching effort. Thank you for committing the patch! Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
(2018/07/19 14:11), Ashutosh Bapat wrote: > On Thu, Jul 19, 2018 at 5:37 AM, Michael Paquier<michael@paquier.xyz> wrote: >> On Wed, Jul 18, 2018 at 12:15:25PM +0530, Ashutosh Bapat wrote: >>> Yes that's right. Thanks for taking care of it. >> >> Okay, I have pushed a fix for this one as that's wrong and >> back-patched to v11. The coverage of reparameterize_path_by_child is >> actually quite poor if you look at the reports: >> https://coverage.postgresql.org/src/backend/optimizer/util/pathnode.c.gcov.html >> >> Could it be possible to stress that more? This way mistakes like this >> one could have been avoided from the start. > > I had extensive testcases in my original patch-set to exercise that > code but 1. that testset was too extensive; even today > partition_join.sql is a separate testcase and it's quite large. 2. > that function returns NULL rather than throwing an error, if it can > not produce a parameterized path. So, unless we check whether each of > those paths get created no test is useful and that can only be done > through an EXPLAIN OUTPUT which means that the testcase becomes > fragile. I fine if we want to add more tests just to cover the code if > those are not as fragile and do not blow up partition_join too much. It would not be possible to cover these cases: case T_GatherPath: { GatherPath *gpath; FLAT_COPY_PATH(gpath, path, GatherPath); REPARAMETERIZE_CHILD_PATH(gpath->subpath); new_path = (Path *) gpath; } break; case T_GatherMergePath: { GatherMergePath *gmpath; FLAT_COPY_PATH(gmpath, path, GatherMergePath); REPARAMETERIZE_CHILD_PATH(gmpath->subpath); new_path = (Path *) gmpath; } break; because we currently don't consider gathering partial child-scan or child-join paths. I think we might consider that in future, though. Best regards, Etsuro Fujita
On Thu, Jul 26, 2018 at 1:14 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > because we currently don't consider gathering partial child-scan or > child-join paths. I think we might consider that in future, though. You generally want to put the Gather node as high up in the plan tree as possible. I think the only case in which this is beneficial is if you can't put the Gather or Gather Merge node above the Append because only some of the children are parallel-safe. In that case, a separate Gather per child can be better than no parallelism at all. It's a rare case, but it can happen. Actually, I thought we had code for this already: see the end of apply_scanjoin_target_to_paths(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
(2018/07/27 4:50), Robert Haas wrote: > On Thu, Jul 26, 2018 at 1:14 AM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: >> because we currently don't consider gathering partial child-scan or >> child-join paths. I think we might consider that in future, though. > > You generally want to put the Gather node as high up in the plan tree > as possible. I think the only case in which this is beneficial is if > you can't put the Gather or Gather Merge node above the Append because > only some of the children are parallel-safe. In that case, a separate > Gather per child can be better than no parallelism at all. It's a > rare case, but it can happen. Actually, I thought we had code for this > already: see the end of apply_scanjoin_target_to_paths(). Agreed. Thanks for the explanation! (I think that at least currently, there is no need for the Gather and GatherMerge cases in reparameterize_path_by_child, but I don't object to keeping those as-is there.) Best regards, Etsuro Fujita
On Fri, Jul 27, 2018 at 03:40:48PM +0900, Etsuro Fujita wrote: > (I think that at least currently, there is no need for the Gather and > GatherMerge cases in reparameterize_path_by_child, but I don't object to > keeping those as-is there.) Let's keep them. As far as my understanding goes, which is way lower than any of you by the way, those don't hurt and would automatically help. -- Michael