Re: [Proposal] Table partition + join pushdown - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: [Proposal] Table partition + join pushdown |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F8011578A9@BPXM15GP.gisp.nec.co.jp Whole thread Raw |
In response to | Re: [Proposal] Table partition + join pushdown (Taiki Kondo <tai-kondo@yk.jp.nec.com>) |
Responses |
Re: [Proposal] Table partition + join pushdown
|
List | pgsql-hackers |
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Taiki Kondo > Sent: Thursday, October 08, 2015 5:28 PM > To: Kyotaro HORIGUCHI > Cc: Kaigai Kouhei(海外 浩平); Iwaasa Akio(岩浅 晃郎); > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] [Proposal] Table partition + join pushdown > > Hello, Horiguchi-san. > > Thank you for your comment. > > > I got some warning on compilation on unused variables and wrong > > arguemtn type. > > OK, I'll fix it. > > > I failed to have a query that this patch works on. Could you let > > me have some specific example for this patch? > > Please find attached. > And also make sure that setting of work_mem is '64kB' (not 64MB). > > If there is the large work_mem enough to create hash table for > relation after appending, its cost may be better than pushed-down > plan's cost, then planner will not choose pushed-down plan this patch makes. > So, to make this patch working fine, work_mem size must be smaller than > the hash table size for relation after appending. > > > This patch needs more comments. Please put comment about not only > > what it does but also the reason and other things for it. > > OK, I'll put more comments in the code. > But it will take a long time, maybe... > People (including me) can help. Even though your English capability is not enough, it is significant to put intention of the code. > > -- about namings > > > > Names for functions and variables are needed to be more > > appropriate, in other words, needed to be what properly informs > > what they are. The followings are the examples of such names. > > Thank you for your suggestion. > > I also think these names are not good much. > I'll try to make names better , but it maybe take a long time... > Of course, I will use your suggestion as reference. > > > "added_restrictlist"'s widely distributed as many function > > arguemnts and JoinPathExtraData makes me feel > > dizzy.. > > "added_restrictinfo" will be deleted from almost functions > other than try_join_pushdown() in next (v2) patch because > the place of filtering using this info will be changed > from Join node to Scan node and not have to place it > into other than try_join_pushdown(). > This restrictinfo intends to filter out obviously unrelated rows in this join, due to the check constraint of other side of the join. So, correct but redundant name is: restrictlist_to_drop_unrelated_rows_because_of_check_constraint How about 'restrictlist_by_constraint' instead? > > In make_restrictinfos_from_check_constr, the function returns > > modified constraint predicates correspond to vars under > > hashjoinable join clauses. I don't think expression_tree_mutator > > is suitable to do that since it could allow unwanted result when > > constraint predicates or join clauses are not simple OpExpr's. > > Do you have any example of this situation? > I am trying to find unwanted results you mentioned, but I don't have > any results in this timing. I have a hunch that it will allow unwanted > results because I have thought only about very simple situation for > this function. > check_constraint_mutator makes modified restrictlist with relacing Var node only when join clause is hash-joinable. It implies <expr> = <expr> form, thus we can safely replace the expression by the other side. Of course, we still have cases we cannot replace expressions simply. - If function (or function called by operators) has volatile attribute (who use volatile function on CHECK constraint ofpartitioning?) - If it is uncertain whether expression returns always same result. (is it possible to contain SubLink in the constraint?) I'd like to suggest to use white-list approach in this mutator routine. It means that only immutable expression node are allowed to include the modified restrictlist. Things to do is: check_constraint_mutator(...) { if (node == NULL) return NULL; if (IsA(node, Var)) { : } else if (node is not obviously immutable) { context->is_mutated= false; <-- prohibit to make if expression } contains uncertain node.return expression_tree_mutator(...) } > > Otherwise could you give me clear explanation on what it does? > > This function transfers CHECK() constraint to filter expression by following > procedures. > (1) Get outer table's CHECK() constraint by using get_relation_constraints(). > (2) Walk through expression tree got in (1) by using expression_tree_mutator() > with check_constraint_mutator() and change only outer's Var node to > inner's one according to join clause. > > For example, when CHECK() constraint of table A is "num % 4 = 0" and > join clause between table A and B is "A.num = B.data", > then we can get "B.data % 4 = 0" for filtering purpose. > > This also accepts more complex join clause like "A.num = B.data * 2", > then we can get "(B.data * 2) % 4 = 0". > > In procedure (2), to decide whether to use each join clause for changing > Var node or not, I implement check_constraint_mutator() to judge whether > join clause is hash-joinable or not. > > Actually, I want to judge whether OpExpr as top expression tree of > join clause means "=" or not, but I can't find how to do it. > > If you know how to do it, please let me know. > > Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: Kyotaro HORIGUCHI [mailto:horiguchi.kyotaro@lab.ntt.co.jp] > Sent: Tuesday, October 06, 2015 8:35 PM > To: tai-kondo@yk.jp.nec.com > Cc: kaigai@ak.jp.nec.com; aki-iwaasa@vt.jp.nec.com; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] [Proposal] Table partition + join pushdown > > Hello. > > I tried to read this and had some random comments on this. > > -- general > > I got some warning on compilation on unused variables and wrong arguemtn type. > > I failed to have a query that this patch works on. Could you let me have some > specific example for this patch? > > This patch needs more comments. Please put comment about not only what it does > but also the reason and other things for it. > > > -- about namings > > Names for functions and variables are needed to be more appropriate, in other > words, needed to be what properly informs what they are. The followings are the > examples of such names. > > "added_restrictlist"'s widely distributed as many function arguemnts and > JoinPathExtraData makes me feel dizzy.. create_mergejoin_path takes it as > "filtering_clauses", which looks far better. > > try_join_pushdown() is also the name with much wider meaning. This patch tries > to move hashjoins on inheritance parent to under append paths. It could be > generically called 'pushdown' > but this would be better be called such like 'transform appended hashjoin' or > 'hashjoin distribution'. The latter would be better. > (The function name would be try_distribute_hashjoin for the > case.) > > The name make_restrictinfos_from_check_contr() also tells me wrong thing. For > example, > extract_constraints_for_hashjoin_distribution() would inform me about what it > returns. > > > -- about what make_restrictinfos_from_check_constr() does > > In make_restrictinfos_from_check_constr, the function returns modified > constraint predicates correspond to vars under hashjoinable join clauses. I don't > think expression_tree_mutator is suitable to do that since it could allow unwanted > result when constraint predicates or join clauses are not simple OpExpr's. > > Could you try more simple and straight-forward way to do that? > Otherwise could you give me clear explanation on what it does? > > regards, > > -- > Kyotaro Horiguchi > NTT Open Source Software Center > > >
pgsql-hackers by date: