Re: Extend more usecase for planning time partition pruning and init partition pruning. - Mailing list pgsql-hackers
From | Andy Fan |
---|---|
Subject | Re: Extend more usecase for planning time partition pruning and init partition pruning. |
Date | |
Msg-id | CAKU4AWqeHe=ZudHvp=QikBujzfDrhZ8=Azkh4KTmeExFmcbcNA@mail.gmail.com Whole thread Raw |
In response to | Re: Extend more usecase for planning time partition pruning and init partition pruning. (Andy Fan <zhihui.fan1213@gmail.com>) |
Responses |
Re: Extend more usecase for planning time partition pruning and init partition pruning.
|
List | pgsql-hackers |
On Mon, Jan 25, 2021 at 10:21 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Sun, Jan 24, 2021 at 6:34 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:Hi:
I recently found a use case like this. SELECT * FROM p, q WHERE p.partkey =
q.colx AND (q.colx = $1 OR q.colx = $2); Then we can't do either planning time
partition prune or init partition prune. Even though we have run-time
partition pruning work at last, it is too late in some cases since we have
to init all the plan nodes in advance. In my case, there are 10+
partitioned relation in one query and the execution time is short, so the
init plan a lot of plan nodes cares a lot.
The attached patches fix this issue. It just get the "p.partkey = q.colx"
case in root->eq_classes or rel->joinlist (outer join), and then check if there
is some baserestrictinfo in another relation which can be used for partition
pruning. To make the things easier, both partkey and colx must be Var
expression in implementation.
- v1-0001-Make-some-static-functions-as-extern-and-extend-C.patch
Just some existing refactoring and extending ChangeVarNodes to be ableto change var->attno.
- v1-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch
Do the real job.
Thought?--Best RegardsAndy Fan (https://www.aliyun.com/)Some results from this patch.create table p (a int, b int, c character varying(8)) partition by list(c);
create table p1 partition of p for values in ('000001');
create table p2 partition of p for values in ('000002');
create table p3 partition of p for values in ('000003');
create table q (a int, c character varying(8), b int) partition by list(c);
create table q1 partition of q for values in ('000001');
create table q2 partition of q for values in ('000002');
create table q3 partition of q for values in ('000003');Before the patch:
postgres=# explain (costs off) select * from p inner join q on p.c = q.c and q.c > '000002';
QUERY PLAN
----------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Seq Scan on p3 p_3
-> Hash
-> Seq Scan on q3 q
Filter: ((c)::text > '000002'::text)
(9 rows)
After the patch:
QUERY PLAN
----------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Seq Scan on p3 p
-> Hash
-> Seq Scan on q3 q
Filter: ((c)::text > '000002'::text)
(6 rows)
Before the patch:
postgres=# explain (costs off) select * from p inner join q on p.c = q.c and (q.c = '000002' or q.c = '000001');
QUERY PLAN
--------------------------------------------------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Seq Scan on p3 p_3
-> Hash
-> Append
-> Seq Scan on q1 q_1
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
-> Seq Scan on q2 q_2
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
(12 rows)
After the patch:
QUERY PLAN
--------------------------------------------------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Hash
-> Append
-> Seq Scan on q1 q_1
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
-> Seq Scan on q2 q_2
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
(11 rows)
Before the patch:
postgres=# explain (costs off) select * from p left join q on p.c = q.c where (q.c = '000002' or q.c = '000001');
QUERY PLAN
--------------------------------------------------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Seq Scan on p3 p_3
-> Hash
-> Append
-> Seq Scan on q1 q_1
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
-> Seq Scan on q2 q_2
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
(12 rows)
After the patch:
QUERY PLAN
--------------------------------------------------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Hash
-> Append
-> Seq Scan on q1 q_1
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
-> Seq Scan on q2 q_2
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
(11 rows)--Best RegardsAndy Fan (https://www.aliyun.com/)
Here is a performance test regarding this patch. In the following simple case,
we can get 3x faster than before.
create table p (a int, b int, c int) partition by list(c);
select 'create table p_'||i||' partition of p for values in (' || i || ');' from generate_series(1, 100)i; \gexec
insert into p select i, i, i from generate_series(1, 100)i;
create table m as select * from p;
analyze m;
analyze p;
select 'create table p_'||i||' partition of p for values in (' || i || ');' from generate_series(1, 100)i; \gexec
insert into p select i, i, i from generate_series(1, 100)i;
create table m as select * from p;
analyze m;
analyze p;
test sql: select * from m, p where m.c = p.c and m.c in (3, 10);
With this patch: 1.1ms
Without this patch: 3.4ms
I'm happy with the result and the implementation, I have add this into
commitfest https://commitfest.postgresql.org/32/2975/
Thanks.
Best Regards
Andy Fan (https://www.aliyun.com/)
pgsql-hackers by date: