Re: inconsistent results querying table partitioned by date - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: inconsistent results querying table partitioned by date |
Date | |
Msg-id | 24525.1557444159@sss.pgh.pa.us Whole thread Raw |
In response to | inconsistent results querying table partitioned by date (Alan Jackson <ajax@tvsquared.com>) |
Responses |
Re: inconsistent results querying table partitioned by date
|
List | pgsql-bugs |
Alan Jackson <ajax@tvsquared.com> writes: > Im having a problem with querying a table partitioned by date. > Depending on the sequence of operations on a date parameter used in a where clause, I either get the expected results,or no results. Yeah, this is pretty clearly broken. It looks to me like the partition pruning code is making insupportable assumptions about a comparison to a stable expression. Using your example table: regression=# explain select * from dataid where id=1 and datadatetime < localtimestamp; QUERY PLAN ---------------------------------------------------------------------------------- Bitmap Heap Scan on dataid_default (cost=4.19..11.31 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.19 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP)) (4 rows) It should absolutely not have pruned away the dataid_201902 partition, but it did. It's okay with an immutable expression: regression=# explain select * from dataid where id=1 and datadatetime < '2019-05-09'::timestamp; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Append (cost=4.18..22.63 rows=6 width=12) -> Bitmap Heap Scan on dataid_201902 (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on dataid_201902_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone)) -> Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone)) (9 rows) or a volatile one: regression=# explain select * from dataid where id=1 and datadatetime < '2019-05-09'::timestamp + random()*'1 day'::interval; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Append (cost=4.23..29.80 rows=6 width=12) -> Bitmap Heap Scan on dataid_201902 (cost=4.23..14.88 rows=3 width=12) Recheck Cond: (id = 1) Filter: (datadatetime < ('2019-05-09 00:00:00'::timestamp without time zone + (random() * '1 day'::interval))) -> Bitmap Index Scan on dataid_201902_pkey (cost=0.00..4.23 rows=10 width=0) Index Cond: (id = 1) -> Bitmap Heap Scan on dataid_default (cost=4.23..14.88 rows=3 width=12) Recheck Cond: (id = 1) Filter: (datadatetime < ('2019-05-09 00:00:00'::timestamp without time zone + (random() * '1 day'::interval))) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.23 rows=10 width=0) Index Cond: (id = 1) (11 rows) but somebody's confused about what can be done with stable expressions. While I'm on about it, this behavior is also insupportable: regression=# explain select * from dataid where id=1 and datadatetime < '2018-05-09'::timestamptz ; QUERY PLAN -------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12) Recheck Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone)) -> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0) Index Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone)) (4 rows) because timestamp-against-timestamptz comparison is inherently only stable; the pruning code is way exceeding its authority by supposing that a comparison that holds at plan time will hold at runtime, even with a constant comparison value. The reason for the difference in your results is that one expression is immutable and the other is only stable: regression=# explain verbose select (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' ) at time zone 'UTC' + '2 days'::interval)as workingdate, (('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC') asnotworkingdate; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.00..0.01 rows=1 width=16) Output: '2019-02-28 05:00:00'::timestamp without time zone, timezone('UTC'::text, ('2019-02-26 00:00:00-05'::timestampwith time zone + '2 days'::interval)) (2 rows) the reason being that timestamptz + interval depends on the timezone setting (for some intervals) but timestamp + interval never does. Seems to be equally broken in v11 and HEAD. I didn't try v10. > I hope there is something simple I can change in the partition definitions to work around this. Until we fix the bug, I think the best you can do is not use stable expressions in this context. regards, tom lane
pgsql-bugs by date: