Home > mailing lists

Re: Extend more usecase for planning time partition pruning and init partition pruning. - Mailing list pgsql-hackers

From	Andy Fan
Subject	Re: Extend more usecase for planning time partition pruning and init partition pruning.
Date	February 8, 2021 07:43:47
Msg-id	CAKU4AWqeHe=ZudHvp=QikBujzfDrhZ8=Azkh4KTmeExFmcbcNA@mail.gmail.com Whole thread Raw
In response to	Re: Extend more usecase for planning time partition pruning and init partition pruning. (Andy Fan <zhihui.fan1213@gmail.com>)
Responses	Re: Extend more usecase for planning time partition pruning and init partition pruning.
List	pgsql-hackers

Tree view

On Mon, Jan 25, 2021 at 10:21 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:

On Sun, Jan 24, 2021 at 6:34 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hi:

I recently found a use case like this. SELECT * FROM p, q WHERE p.partkey =
q.colx AND (q.colx = $1 OR q.colx = $2); Then we can't do either planning time
partition prune or init partition prune. Even though we have run-time
partition pruning work at last, it is too late in some cases since we have
to init all the plan nodes in advance. In my case, there are 10+
partitioned relation in one query and the execution time is short, so the
init plan a lot of plan nodes cares a lot.

The attached patches fix this issue. It just get the "p.partkey = q.colx"
case in root->eq_classes or rel->joinlist (outer join), and then check if there
is some baserestrictinfo in another relation which can be used for partition
pruning. To make the things easier, both partkey and colx must be Var
expression in implementation.

- v1-0001-Make-some-static-functions-as-extern-and-extend-C.patch

Just some existing refactoring and extending ChangeVarNodes to be able
to change var->attno.

- v1-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch

Do the real job.

Thought?

--
Best Regards
Andy Fan (https://www.aliyun.com/)

Some results from this patch.

create table p (a int, b int, c character varying(8)) partition by list(c);
create table p1 partition of p for values in ('000001');
create table p2 partition of p for values in ('000002');
create table p3 partition of p for values in ('000003');
create table q (a int, c character varying(8), b int) partition by list(c);
create table q1 partition of q for values in ('000001');
create table q2 partition of q for values in ('000002');
create table q3 partition of q for values in ('000003');

Before the patch:
postgres=# explain (costs off) select * from p inner join q on p.c = q.c and q.c > '000002';
QUERY PLAN
----------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Seq Scan on p3 p_3
-> Hash
-> Seq Scan on q3 q
Filter: ((c)::text > '000002'::text)
(9 rows)

After the patch:

QUERY PLAN
----------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Seq Scan on p3 p
-> Hash
-> Seq Scan on q3 q
Filter: ((c)::text > '000002'::text)
(6 rows)

Before the patch:
postgres=# explain (costs off) select * from p inner join q on p.c = q.c and (q.c = '000002' or q.c = '000001');
QUERY PLAN
--------------------------------------------------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Seq Scan on p3 p_3
-> Hash
-> Append
-> Seq Scan on q1 q_1
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
-> Seq Scan on q2 q_2
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
(12 rows)

After the patch:
QUERY PLAN
--------------------------------------------------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Hash
-> Append
-> Seq Scan on q1 q_1
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
-> Seq Scan on q2 q_2
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
(11 rows)

Before the patch:
postgres=# explain (costs off) select * from p left join q on p.c = q.c where (q.c = '000002' or q.c = '000001');
QUERY PLAN
--------------------------------------------------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Seq Scan on p3 p_3
-> Hash
-> Append
-> Seq Scan on q1 q_1
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
-> Seq Scan on q2 q_2
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
(12 rows)

After the patch:
QUERY PLAN
--------------------------------------------------------------------------------------------
Hash Join
Hash Cond: ((p.c)::text = (q.c)::text)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Hash
-> Append
-> Seq Scan on q1 q_1
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
-> Seq Scan on q2 q_2
Filter: (((c)::text = '000002'::text) OR ((c)::text = '000001'::text))
(11 rows)

--
Best Regards
Andy Fan (https://www.aliyun.com/)

Here is a performance test regarding this patch. In the following simple case,

we can get 3x faster than before.

create table p (a int, b int, c int) partition by list(c);
select 'create table p_'||i||' partition of p for values in (' || i || ');' from generate_series(1, 100)i; \gexec
insert into p select i, i, i from generate_series(1, 100)i;
create table m as select * from p;
analyze m;
analyze p;

test sql: select * from m, p where m.c = p.c and m.c in (3, 10);

With this patch: 1.1ms

Without this patch: 3.4ms

I'm happy with the result and the implementation, I have add this into

commitfest https://commitfest.postgresql.org/32/2975/

Thanks.

Best Regards

Andy Fan (https://www.aliyun.com/)

pgsql-hackers by date:

From: Michael Paquier
Date: 08 February 2021, 07:35:19
Subject: Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

From: Kyotaro Horiguchi
Date: 08 February 2021, 08:02:12
Subject: Re: Support tab completion for upper character inputs in psql

Re: Extend more usecase for planning time partition pruning and init partition pruning. - Mailing list pgsql-hackers

Previous

Next