Thread: PG11 - Multiple Key Range Partition
Hello,



Currently, I'm trying to define a table with a range partition of 3 keys Recent, Deleted and Played At.
There are multiple partitions:
1. scores_2018 (recent - false, deleted - false, played_at in (2018-01-01 / 2019-01-01)
2. scores_2017 (recent - false, deleted - false, played_at in (2017-01-01 / 2018-01-01)
3. scores_recent (recent - true, deleted - false, played_at in (1990-01-01 / 2090-01-01)
4. scores_deleted (recent - false, deleted - true, played_at in (1990-01-01 / 2090-01-01)
When inserting all works fine, but when doing a select statement it does not correctly query over the correct partition, It searches on all partitions.
explain select * from scores where played_at = '2018-03-01'

explain select * from scores where recent = true and deleted = false and played_at = '2018-03-01'

When using the recent key the partition is selected correctly. However, when using the deleted key it does a full search.
explain select * from scores where deleted = true

Note: If I only create the table with only 2 partition keys Recent and Played at, all works as expected.
The 3 key range partition is not supported, or is it a bug? Can you please suggest an alternative?
The main idea is to have multiple partitions, one for each year and an extra 2, for deleted scores, and recent scores.
Thank you and have a great day!
Rares
Attachment
On Mon, 8 Jul 2019 at 21:17, Rares Salcudean <rares.salcudean@takeofflabs.com> wrote: > Currently, I'm trying to define a table with a range partition of 3 keys Recent, Deleted and Played At. > > There are multiple partitions: > > 1. scores_2018 (recent - false, deleted - false, played_at in (2018-01-01 / 2019-01-01) > 2. scores_2017 (recent - false, deleted - false, played_at in (2017-01-01 / 2018-01-01) > 3. scores_recent (recent - true, deleted - false, played_at in (1990-01-01 / 2090-01-01) > 4. scores_deleted (recent - false, deleted - true, played_at in (1990-01-01 / 2090-01-01) > > When inserting all works fine, but when doing a select statement it does not correctly query over the correct partition,It searches on all partitions. > > explain select * from scores where played_at = '2018-03-01' RANGE partitioning pruning works by the planner having knowledge that your WHERE clause cannot yield rows that are within a partition's range. Take your scores_2017 partition as an example, the range there is (false, false, '2017-01-01') to (false, false, '2018-01-01'). The planner cannot match your WHERE clause to that range since it's missing any predicate that matches a prefix of the range. This is similar to how a btree index on (recent, deleted, played_at) couldn't be used efficiently to give you just rows with played_at on any given date. You'd need something like: WHERE NOT recent AND NOT deleted AND played_at = '2018-03-01' for it to know only the scores_2018 partition can match. (There was a bug fixed recently that caused some partitions in a range partitioned table to be pruned accidentally, but you're not complaining about that.) You might want to look into sub-partitioning the table, however, see the note in [1] about that. [1] https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-BEST-PRACTICES -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi David,
Yield the same result:

I'm not sure what is your timeline, but can you provide an example with 3 key Range (2 booleans and 1 date) partition with a working select statement on the date?
First of all, thank you for the quick response, I highly appreciate it!
Currently, I'm trying to test out different query patterns to understand the key constraints and make the planner recognize the RANGE.
The example you suggested:
explain select * from scores where NOT(recent = true) and NOT(deleted = true) and played_at = '2018-03-02'
explain select * from scores where NOT(recent) and NOT(deleted) and played_at = '2018-03-02'
Yield the same result:

It does a search over all partitions, mainly because I think the planner still cannot match the where clause to the range.
I can not find any queries that output the desired results. And for the moment I like our strategy of having a partition for each year, plus the extra two (recent and deleted). In terms of Insert / Update / Deleted / Upsert / etc.. operations all works fine. I populated a DB with around 600 million rows and the partitions are populated correctly + operations except SELECT are working perfectly.
Furthermore, I am a bit confused about the fact that with 2 key range partitions (recent and played_at), all works perfectly as expected (event SELECT).
I saw you recommended sub-partitioning, we are defining each partition as in the example from the documentation.
I'm not sure what is your timeline, but can you provide an example with 3 key Range (2 booleans and 1 date) partition with a working select statement on the date?
Thank you very much and have a great day!
Rares
On Tue, Jul 9, 2019 at 12:49 AM David Rowley <david.rowley@2ndquadrant.com> wrote:
On Mon, 8 Jul 2019 at 21:17, Rares Salcudean
<rares.salcudean@takeofflabs.com> wrote:
> Currently, I'm trying to define a table with a range partition of 3 keys Recent, Deleted and Played At.
>
> There are multiple partitions:
>
> 1. scores_2018 (recent - false, deleted - false, played_at in (2018-01-01 / 2019-01-01)
> 2. scores_2017 (recent - false, deleted - false, played_at in (2017-01-01 / 2018-01-01)
> 3. scores_recent (recent - true, deleted - false, played_at in (1990-01-01 / 2090-01-01)
> 4. scores_deleted (recent - false, deleted - true, played_at in (1990-01-01 / 2090-01-01)
>
> When inserting all works fine, but when doing a select statement it does not correctly query over the correct partition, It searches on all partitions.
>
> explain select * from scores where played_at = '2018-03-01'
RANGE partitioning pruning works by the planner having knowledge that
your WHERE clause cannot yield rows that are within a partition's
range. Take your scores_2017 partition as an example, the range there
is (false, false, '2017-01-01') to (false, false, '2018-01-01'). The
planner cannot match your WHERE clause to that range since it's
missing any predicate that matches a prefix of the range. This is
similar to how a btree index on (recent, deleted, played_at) couldn't
be used efficiently to give you just rows with played_at on any given
date. You'd need something like: WHERE NOT recent AND NOT deleted AND
played_at = '2018-03-01' for it to know only the scores_2018 partition
can match.
(There was a bug fixed recently that caused some partitions in a range
partitioned table to be pruned accidentally, but you're not
complaining about that.)
You might want to look into sub-partitioning the table, however, see
the note in [1] about that.
[1] https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-BEST-PRACTICES
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Tue, 9 Jul 2019 at 18:53, Rares Salcudean <rares.salcudean@takeofflabs.com> wrote: > The example you suggested: > > explain select * from scores where NOT(recent = true) and NOT(deleted = true) and played_at = '2018-03-02' > explain select * from scores where NOT(recent) and NOT(deleted) and played_at = '2018-03-02' > > Yield the same result: Works okay for me with: create table rangep (a bool, b bool, c date) partition by range (a,b,c); create table rangep1 partition of rangep for values from (false, false, '2019-01-01') to (false,false,'2020-01-01'); create table rangep2 partition of rangep for values from (true, true, '2019-01-01') to (true,true,'2020-01-01'); explain select * from rangep where not a and not b and c = '2019-07-10'; QUERY PLAN -------------------------------------------------------------- Seq Scan on rangep1 (cost=0.00..40.00 rows=3 width=6) Filter: ((NOT a) AND (NOT b) AND (c = '2019-07-10'::date)) (2 rows) That's on master, but v10 and v11 still prune away rangep2. It might help if you share which version you're using and a cutdown version of the schema, just enough to show the issue. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi David,

I tested out your example and all works fine for me as well. But I live the main issue lies in the following example:
create table rangep (a bool, b bool, c date) partition by range (a,b,c);
create table rangep1_recent partition of rangep for values from (true,false, '1990-01-01') to (true,false,'2090-01-01');
create table rangep1_deleted partition of rangep for values from (false,true, '1990-01-01') to (false,true,'2090-01-01');
create table rangep1_2019 partition of rangep for values from (false,false, '2019-01-01') to (false,false,'2020-01-01');
create table rangep1_2018 partition of rangep for values from (false,false, '2018-01-01') to (false,false,'2019-01-01');
create table rangep1_2017 partition of rangep for values from (false,false, '2017-01-01') to (false,false,'2018-01-01');
create table rangep1_2019 partition of rangep for values from (false,false, '2019-01-01') to (false,false,'2020-01-01');
create table rangep1_2018 partition of rangep for values from (false,false, '2018-01-01') to (false,false,'2019-01-01');
create table rangep1_2017 partition of rangep for values from (false,false, '2017-01-01') to (false,false,'2018-01-01');
explain select * from rangep where not a and not b and c = '2019-07-10';

When doing:
explain select * from rangep where a and not b an and c = '2019-07-10';
or simply
explain select * from rangep where a and c = '2019-07-10';
It successfully searches only in the 'recent' partition.
I believe the planner gets confused when we introduce the following two combinations: (true, false, '1990-01-01') - (true, false, '2090-01-01') and (false, true, '1990-01-01') - (false, true, '2090-01-01').
My current Postgres version is PostgreSQL 11, 2.2.4. ("PostgreSQL 11.4 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit")
Thank you for the quick response, I’m eager to receive your feedback.
Have a great day!
Rares
On Wed, Jul 10, 2019 at 2:33 AM David Rowley <david.rowley@2ndquadrant.com> wrote:
On Tue, 9 Jul 2019 at 18:53, Rares Salcudean
<rares.salcudean@takeofflabs.com> wrote:
> The example you suggested:
>
> explain select * from scores where NOT(recent = true) and NOT(deleted = true) and played_at = '2018-03-02'
> explain select * from scores where NOT(recent) and NOT(deleted) and played_at = '2018-03-02'
>
> Yield the same result:
Works okay for me with:
create table rangep (a bool, b bool, c date) partition by range (a,b,c);
create table rangep1 partition of rangep for values from (false,
false, '2019-01-01') to (false,false,'2020-01-01');
create table rangep2 partition of rangep for values from (true, true,
'2019-01-01') to (true,true,'2020-01-01');
explain select * from rangep where not a and not b and c = '2019-07-10';
QUERY PLAN
--------------------------------------------------------------
Seq Scan on rangep1 (cost=0.00..40.00 rows=3 width=6)
Filter: ((NOT a) AND (NOT b) AND (c = '2019-07-10'::date))
(2 rows)
That's on master, but v10 and v11 still prune away rangep2.
It might help if you share which version you're using and a cutdown
version of the schema, just enough to show the issue.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Wed, 10 Jul 2019 at 18:39, Rares Salcudean <rares.salcudean@takeofflabs.com> wrote: > I tested out your example and all works fine for me as well. But I live the main issue lies in the following example: > > create table rangep (a bool, b bool, c date) partition by range (a,b,c); > > create table rangep1_recent partition of rangep for values from (true,false, '1990-01-01') to (true,false,'2090-01-01'); > create table rangep1_deleted partition of rangep for values from (false,true, '1990-01-01') to (false,true,'2090-01-01'); > create table rangep1_2019 partition of rangep for values from (false,false, '2019-01-01') to (false,false,'2020-01-01'); > create table rangep1_2018 partition of rangep for values from (false,false, '2018-01-01') to (false,false,'2019-01-01'); > create table rangep1_2017 partition of rangep for values from (false,false, '2017-01-01') to (false,false,'2018-01-01'); > > explain select * from rangep where not a and not b and c = '2019-07-10'; Thanks for making the test case. It is a bug. There's a problem in match_clause_to_partition_key() where because match_boolean_partition_clause() returns false for the "NOT b" condition when comparing to the first partition key this causes the function to return PARTCLAUSE_UNSUPPORTED which causes the calling function to not bother trying to match that qual up to any other partition key. Effectively, for pruning the WHERE clause is just WHERE NOT a AND c = '2019-07-10', so only the rangep1_recent partition is pruned. I'll look into fixing it. Thanks for the report. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Jul 10, 2019 at 5:37 PM David Rowley <david.rowley@2ndquadrant.com> wrote: > On Wed, 10 Jul 2019 at 18:39, Rares Salcudean > <rares.salcudean@takeofflabs.com> wrote: > > I tested out your example and all works fine for me as well. But I live the main issue lies in the following example: > > > > create table rangep (a bool, b bool, c date) partition by range (a,b,c); > > > > create table rangep1_recent partition of rangep for values from (true,false, '1990-01-01') to (true,false,'2090-01-01'); > > create table rangep1_deleted partition of rangep for values from (false,true, '1990-01-01') to (false,true,'2090-01-01'); > > create table rangep1_2019 partition of rangep for values from (false,false, '2019-01-01') to (false,false,'2020-01-01'); > > create table rangep1_2018 partition of rangep for values from (false,false, '2018-01-01') to (false,false,'2019-01-01'); > > create table rangep1_2017 partition of rangep for values from (false,false, '2017-01-01') to (false,false,'2018-01-01'); > > > > explain select * from rangep where not a and not b and c = '2019-07-10'; > > Thanks for making the test case. It is a bug. There's a problem in > match_clause_to_partition_key() where because > match_boolean_partition_clause() returns false for the "NOT b" > condition when comparing to the first partition key this causes the > function to return PARTCLAUSE_UNSUPPORTED which causes the calling > function to not bother trying to match that qual up to any other > partition key. Oops, indeed a bug. Should've returned PARTCLAUSE_NOMATCH there. Thanks for taking it up. Regards, Amit
On Wed, 10 Jul 2019 at 21:02, Amit Langote <amitlangote09@gmail.com> wrote: > > On Wed, Jul 10, 2019 at 5:37 PM David Rowley > > Thanks for making the test case. It is a bug. There's a problem in > > match_clause_to_partition_key() where because > > match_boolean_partition_clause() returns false for the "NOT b" > > condition when comparing to the first partition key this causes the > > function to return PARTCLAUSE_UNSUPPORTED which causes the calling > > function to not bother trying to match that qual up to any other > > partition key. > > Oops, indeed a bug. Should've returned PARTCLAUSE_NOMATCH there. Here's a first draft fix for this, based on master. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Thank you guys for the quick response.
Which would be the simplest way to "import/apply" this fix locally for me? Do I have to wait until the new version is released?
By the way, keep it up, awesome work!
Rares
On Wed, Jul 10, 2019 at 1:07 PM David Rowley <david.rowley@2ndquadrant.com> wrote:
On Wed, 10 Jul 2019 at 21:02, Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Wed, Jul 10, 2019 at 5:37 PM David Rowley
> > Thanks for making the test case. It is a bug. There's a problem in
> > match_clause_to_partition_key() where because
> > match_boolean_partition_clause() returns false for the "NOT b"
> > condition when comparing to the first partition key this causes the
> > function to return PARTCLAUSE_UNSUPPORTED which causes the calling
> > function to not bother trying to match that qual up to any other
> > partition key.
>
> Oops, indeed a bug. Should've returned PARTCLAUSE_NOMATCH there.
Here's a first draft fix for this, based on master.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, 10 Jul 2019 at 23:14, Rares Salcudean <rares.salcudean@takeofflabs.com> wrote: > Which would be the simplest way to "import/apply" this fix locally for me? Do I have to wait until the new version is released? That's really up to you. I'd at least wait for the patch to be committed though. From then, if you're comfortable patching and building yourself then you could get the fix sooner. Otherwise, there should be a release around the end of September. I'm about to take a more serious look at the patch I posted a few days ago. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jul 12, 2019 at 03:54:01PM +1200, David Rowley wrote: > That's really up to you. I'd at least wait for the patch to be > committed though. From then, if you're comfortable patching and > building yourself then you could get the fix sooner. Otherwise, there > should be a release around the end of September. The next minor version release is planned for the beginning of August per the roadmap: https://www.postgresql.org/developer/roadmap/ So assuming that this gets merged by then, the fix will be released in a couple of weeks. -- Michael
Attachment
On Fri, Jul 12, 2019 at 12:54 PM David Rowley <david.rowley@2ndquadrant.com> wrote: > I'm about to take a more serious look at the patch I posted a few days ago. I have looked at the patch. It seems right to make match_boolean_partition_clause() return PartClauseMatchStatus instead of bool. Didn't find anything to complain about. Thanks, Amit
On Fri, 12 Jul 2019 at 18:53, Amit Langote <amitlangote09@gmail.com> wrote: > I have looked at the patch. It seems right to make > match_boolean_partition_clause() return PartClauseMatchStatus instead > of bool. Didn't find anything to complain about. Thanks for having a look. I've just pushed roughly that patch with just some additional comments. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jul 12, 2019 at 4:15 PM David Rowley <david.rowley@2ndquadrant.com> wrote: > On Fri, 12 Jul 2019 at 18:53, Amit Langote <amitlangote09@gmail.com> wrote: > > I have looked at the patch. It seems right to make > > match_boolean_partition_clause() return PartClauseMatchStatus instead > > of bool. Didn't find anything to complain about. > > Thanks for having a look. I've just pushed roughly that patch with > just some additional comments. Thanks for taking care of this. Regards, Amit