Thread: literal vs dynamic partition constraint in plan execution
Ok, the title is a little buzz, however I've got a partitioned table and one "leaf" has a set of checks against a timestamp field to ensure that tuples within such table belongs to the year and month: testdb=# \d respi.y2019m08 ... Partition of: respi.y2019 FOR VALUES IN ('8') Check constraints: "y2019_mis_ora_check" CHECK (date_part('year'::text, mis_ora) = 2019::double precision) "y2019_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2019, 1, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2019, 12, 31, 23, 59, 59::double precision)) "y2019m08_mis_ora_check" CHECK (date_part('month'::text, mis_ora) = 8::double precision) "y2019m08_mis_ora_check1" CHECK (date_part('year'::text, mis_ora) = 2019::double precision) "y2019m08_mis_ora_check2" CHECK (mis_ora >= make_timestamp(2019, 8, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2019, 8, 31, 23, 59, 59::double precision)) "y2019m08_mis_ora_check3" CHECK (mis_ora >= make_timestamp(2019, 8, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2019, 8, 31, 23, 59, 59::double precision)) So y2019m08 accepts only tuples where 'mis_ora' has a timestamp that is contained into the eigth month of the year. Now if I look at the plan for this query everything works as expected (I disabled parallel scans for better see the plan): testdb=# explain select * from respi.root where ts >= '2019-08-28 23:35:00.007245' and mis_ora >= '2019-08-29 16:28:48.711482' order by ts; ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=353986.27..353991.59 rows=2129 width=40) Sort Key: y2019m08.ts -> Append (cost=0.00..353868.58 rows=2129 width=40) -> Seq Scan on y2019m08 (cost=0.00..353409.93 rows=1 width=40) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= '2019-08-29 16:28:48.711482'::timestamp without time zone)) -> Seq Scan on y2019m09 (cost=0.00..28.00 rows=133 width=40) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= '2019-08-29 16:28:48.711482'::timestamp without time zone)) -> Seq Scan on y2019m10 (cost=0.00..28.00 rows=133 width=40) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= '2019-08-29 16:28:48.711482'::timestamp without time zone)) -> Seq Scan on y2019m11 (cost=0.00..28.00 rows=133 width=40) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= '2019-08-29 16:28:48.711482'::timestamp without time zone)) -> Seq Scan on y2019m12 (cost=0.00..28.00 rows=133 width=40) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= '2019-08-29 16:28:48.711482'::timestamp without time zone)) The "as I expected" means that the system starts scanning from y2019m08 and following (in time) tables, and does not scan previous time tables. This works if the mis_ora is compared against a literal timestamp, but if I simply change it with a dynamic timestamp: testdb=# explain select * from respi.root where ts >= '2019-08-28 23:35:00.007245' and mis_ora >= current_timestamp order by ts; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Sort (cost=4654860.37..4654865.25 rows=1952 width=36) Sort Key: r.ts -> Nested Loop (cost=0.00..4654753.69 rows=1952 width=36) Join Filter: (r.sen_id = s.sen_id) -> Append (cost=0.00..4638927.56 rows=3204 width=32) -> Seq Scan on y2018m01 r (cost=0.00..31.00 rows=133 width=32) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= CURRENT_TIMESTAMP)) -> Seq Scan on y2018m02 r_1 (cost=0.00..31.00 rows=133 width=32) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= CURRENT_TIMESTAMP)) -> Seq Scan on y2018m03 r_2 (cost=0.00..31.00 rows=133 width=32) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= CURRENT_TIMESTAMP)) -> Seq Scan on y2018m04 r_3 (cost=0.00..31.00 rows=133 width=32) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (mis_ora >= CURRENT_TIMESTAMP)) also the tables for the past year are scanned. Moreover, the planner thinks I will get 133 rows out of, for instance, y2018m01 which is impossible. So, do I have defined the constraint on each table in a wrong manner? testdb=# select version(); version --------------------------------------------------------------------------------------------------------- PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit (1 row) testdb=# show constraint_exclusion ; constraint_exclusion ---------------------- partition Thanks, Luca
On Thu, Aug 29, 2019 at 4:45 PM Luca Ferrari <fluca1978@gmail.com> wrote: > > Ok, the title is a little buzz, however I've got a partitioned table > and one "leaf" has a set of checks against a timestamp field to ensure > that tuples within such table belongs to the year and month: Of course, all the siblings have similar constraints. So my partition starts at a table named "root", then it it has a level for the year, and each year has subpartitions for months: - root - y2018 - y2018m01, y2018m02, ... - y2019 - y2019m01, y2019m02 .... All partitions have been created equally, and constraints seem fine to me: testdb=# \d respi.y2018m01 ... Partition of: respi.y2018 FOR VALUES IN ('1') Check constraints: "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) = 2018::double precision) "y2018_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018, 1, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018, 12, 31, 23, 59, 59::double precision)) "y2018m01_mis_ora_check" CHECK (date_part('year'::text, mis_ora) = 2018::double precision) "y2018m01_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018, 1, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018, 1, 31, 23, 59, 59::double precision)) testdb=># \d+ respi.y2018 ... Partition of: respi.root FOR VALUES IN ('2018') Partition constraint: ((date_part('year'::text, mis_ora) IS NOT NULL) AND (date_part('year'::text, mis_ora) = '2018'::double precision)) Partition key: LIST (date_part('month'::text, mis_ora)) Check constraints: "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) = 2018::double precision) "y2018_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018, 1, 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018, 12, 31, 23, 59, 59::double precision)) Partitions: respi.y2018m01 FOR VALUES IN ('1'), respi.y2018m02 FOR VALUES IN ('2'), respi.y2018m03 FOR VALUES IN ('3'), respi.y2018m04 FOR VALUES IN ('4'), respi.y2018m05 FOR VALUES IN ('5'), respi.y2018m06 FOR VALUES IN ('6'), respi.y2018m07 FOR VALUES IN ('7'), respi.y2018m08 FOR VALUES IN ('8'), respi.y2018m09 FOR VALUES IN ('9'), ... With the above constraint, all the branch starting at y2018 should be excluded when selecting with mis_ora >= CURRENT_TIMESTAMP (the date of the server is right, of course). Why is instead scanned (as reported by the execution plan in the previous email)? Thanks, Luca
On Fri, Aug 30, 2019 at 8:29 AM Luca Ferrari <fluca1978@gmail.com> wrote: > testdb=># \d+ respi.y2018 > ... > Partition of: respi.root FOR VALUES IN ('2018') > Partition constraint: ((date_part('year'::text, mis_ora) IS NOT NULL) > AND (date_part('year'::text, mis_ora) = '2018'::double precision)) > Partition key: LIST (date_part('month'::text, mis_ora)) > Check constraints: > "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) = > 2018::double precision) > "y2018_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018, 1, > 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018, 12, > 31, 23, 59, 59::double precision)) > Partitions: respi.y2018m01 FOR VALUES IN ('1'), > respi.y2018m02 FOR VALUES IN ('2'), > respi.y2018m03 FOR VALUES IN ('3'), > respi.y2018m04 FOR VALUES IN ('4'), > respi.y2018m05 FOR VALUES IN ('5'), > respi.y2018m06 FOR VALUES IN ('6'), > respi.y2018m07 FOR VALUES IN ('7'), > respi.y2018m08 FOR VALUES IN ('8'), > respi.y2018m09 FOR VALUES IN ('9'), > ... > While the condition mis_ora >= current_timestamp does not cut off the 2018 branch, the following does =# explain select * from respi.root where ts >= '2019-08-28 23:35:00.007245' and extract( year from mis_ora ) = extract( year from current_timestamp ) and extract( month from mis_ora ) >= extract( month from current_timestamp ) order by ts; Sort (cost=7246692.21..7246692.28 rows=26 width=36) Sort Key: r.ts -> Nested Loop (cost=0.00..7246691.60 rows=26 width=36) Join Filter: (r.sen_id = s.sen_id) -> Seq Scan on sensori s (cost=0.00..13.57 rows=329 width=16) Filter: interesting -> Materialize (cost=0.00..7246465.93 rows=43 width=32) -> Append (cost=0.00..7246465.72 rows=43 width=32) Subplans Removed: 31 -> Seq Scan on y2019m08 r (cost=0.00..623008.30 rows=2 width=32) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (date_part('month'::text, mis_ora) >= date_part('month'::text, CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) = date_part('year'::text, CURRENT_TIMESTAMP))) -> Seq Scan on y2019m09 r_1 (cost=0.00..49.00 rows=1 width=32) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (date_part('month'::text, mis_ora) >= date_part('month'::text, CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) = date_part('year'::text, CURRENT_TIMESTAMP))) -> Seq Scan on y2019m10 r_2 (cost=0.00..49.00 rows=1 width=32) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (date_part('month'::text, mis_ora) >= date_part('month'::text, CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) = date_part('year'::text, CURRENT_TIMESTAMP))) -> Seq Scan on y2019m11 r_3 (cost=0.00..49.00 rows=1 width=32) Filter: ((ts >= '2019-08-28 23:35:00.007245'::timestamp without time zone) AND (date_part('month'::text, mis_ora) >= date_part('month'::text, CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) = date_part('year'::text, CURRENT_TIMESTAMP))) The fact that making explicit the condition against the year and the month, which are the top level partition constraint, makes me think that the executor will try to go down all the branches to the leaf if the condition is not filtered at the top level. Even if I don't understand why. Luca
partition by range or by list constraint check (was Re: literal vsdynamic partition constraint in plan execution)
From
Luca Ferrari
Date:
I've done a simple test case, and find out that probably the problem I got was due to the partition schema I'm using. I want a table to be partitioned by a timestamp field with a first level partition by year, and a second level by month. Therefore, I did a BY LIST partitioning, but that produces a wrong constraint check when executing a query. This is a reproducible example. BEGIN; CREATE TABLE root( pk int generated always as identity, v int, ts timestamp default current_timestamp ) PARTITION BY LIST( extract( year from ts ) ); CREATE TABLE y2018 PARTITION OF root FOR VALUES IN ( 2018 ); CREATE TABLE y2019 PARTITION OF root FOR VALUES IN ( 2019 ); ALTER TABLE y2018 ADD CHECK( ts >= make_timestamp( 2018, 1, 1, 0, 0, 0 ) AND ts <= make_timestamp( 2018, 12, 31, 23, 59, 59 ) ); ALTER TABLE y2019 ADD CHECK( ts >= make_timestamp( 2019, 1, 1, 0, 0, 0 ) AND ts <= make_timestamp( 2019, 12, 31, 23, 59, 59 ) ); INSERT INTO root( v ) SELECT generate_series( 1, 100 ); -- same ts here COMMIT; Now if I try to explain a query with the current timestamp (which is of course in 2019): testdb=# explain select * from root where ts = current_timestamp; QUERY PLAN ------------------------------------------------------------- Append (cost=0.00..75.59 rows=18 width=16) -> Seq Scan on y2018 (cost=0.00..37.75 rows=9 width=16) Filter: (ts = CURRENT_TIMESTAMP) -> Seq Scan on y2019 (cost=0.00..37.75 rows=9 width=16) Filter: (ts = CURRENT_TIMESTAMP) (5 rows) I got y2018 scanned too, which of course could not be the case since y2018 cannot contain values that are equal to current_timestamp. However, if I use a literal the query works fine: testdb=# explain select * from root where ts = '2019-09-01 09:00:00.000000'; QUERY PLAN --------------------------------------------------------------------------- Append (cost=0.00..33.17 rows=9 width=16) -> Seq Scan on y2019 (cost=0.00..33.12 rows=9 width=16) Filter: (ts = '2019-09-01 09:00:00'::timestamp without time zone) (3 rows) Now, if I change the partition schema using a range, the query works fine with current_timestamp too: CREATE TABLE root( pk int generated always as identity, v int, ts timestamp default current_timestamp ) PARTITION BY RANGE( ts ); CREATE TABLE y2018 PARTITION OF root FOR VALUES FROM ('2018-01-01 00:00:00.000000') TO ('2018-12-31 23:59:59.000000'); CREATE TABLE y2019 PARTITION OF root FOR VALUES FROM ('2019-01-01 00:00:00.000000') TO ('2019-12-31 23:59:59.000000'); testdb=# explain select * from root where ts = current_timestamp; QUERY PLAN ------------------------------------------------------------- Append (cost=0.00..75.59 rows=18 width=16) Subplans Removed: 1 -> Seq Scan on y2019 (cost=0.00..37.75 rows=9 width=16) Filter: (ts = CURRENT_TIMESTAMP) (4 rows) So my end with this is that: - the list partitioning prevents the current_timestamp to be compared against the list of possible values (extract year from current_timestamp) and therefore the planner has no chance but to get into all the tables, even if the constraints on the ts field explicitly state some tables can be removed; - in range partitioning, since the partition is built on the very range of values, the planner gets the correct path. I still don't get why using a literal in the first case can lead to a "more correct" plan. And I'm curious to know if there's a way to force constraints in the list partitioning to make the planner really aware of tables that can be excluded. Luca
Re: partition by range or by list constraint check (was Re: literal vs dynamic partition constraint in plan execution)
From
Tom Lane
Date:
Luca Ferrari <fluca1978@gmail.com> writes: > I've done a simple test case, and find out that probably the problem I > got was due to the partition schema I'm using. > I want a table to be partitioned by a timestamp field with a first > level partition by year, and a second level by month. Therefore, I did > a BY LIST partitioning, > CREATE TABLE root( pk int generated always as identity, v int, ts > timestamp default current_timestamp ) > PARTITION BY LIST( extract( year from ts ) ); No, that's not going to work, unless your queries all explicitly use "extract( year from ts ) = something" in their WHERE clauses. There is nothing in the system that would derive a constraint like that from a constraint that just mentions ts. In your example, the partition routing logic was entirely ineffective because of this, so you tried to make up for that by adding CHECK constraints. But pruning based on CHECK constraints is done at planning time, so it can't do anything with run-time-mutable values such as CURRENT_TIMESTAMP. > CREATE TABLE root( pk int generated always as identity, v int, ts > timestamp default current_timestamp ) > PARTITION BY RANGE( ts ); > CREATE TABLE y2018 > PARTITION OF root > FOR VALUES FROM ('2018-01-01 00:00:00.000000') > TO ('2018-12-31 23:59:59.000000'); This is mostly the right way to do it; you forgot that range bounds use "low <= variable < high" logic. So the correct way to declare the partitions is like regression=# CREATE TABLE y2018 regression-# PARTITION OF root regression-# FOR VALUES FROM ('2018-01-01') TO ('2019-01-01'); CREATE TABLE regression=# CREATE TABLE y2019 PARTITION OF root FOR VALUES FROM ('2019-01-01') TO ('2020-01-01'); CREATE TABLE That's easier and doesn't leave a gap between partitions. This is already sufficient to allow partition routing based on equality or simple inequality involving "ts", so you don't need extra CHECK constraints. > I still don't get why using a literal in the first case can lead to a > "more correct" plan. With a literal, plan-time pruning based on the CHECK constraints was possible. > And I'm curious to know if there's a way to force constraints in the > list partitioning to make the planner really aware of tables that can > be excluded. No. The short answer here is that your query WHERE clauses have to be things that the planner or partition routing code can relate to the partitioning rules. In the case of LIST partitioning, that means there had better be WHERE constraints on the values specified in the LIST clause, not values that perhaps could be shown to be related to those values given extensive knowledge about the behaviors of certain functions. By and large, the system doesn't have such knowledge. regards, tom lane
I think I would expect this behavior with how you have defined the constraints as the function results rather than just being strict comparisons to the timestamp field.
Instead of this-
Check constraints:
> "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
> 2018::double precision)
> "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
> 2018::double precision)
I would expect this-
Check constraints:
> "y2018_mis_ora_check" CHECK mis_ora) >= make_timestamp(2018, 1,
> 1, 0, 0, 0::double precision) AND < make_timestamp(2019, 1,
> 1, 0, 0, 0::double precision)
> "y2018_mis_ora_check" CHECK mis_ora) >= make_timestamp(2018, 1,
> 1, 0, 0, 0::double precision) AND < make_timestamp(2019, 1,
> 1, 0, 0, 0::double precision)