[BUGS] BUG #14780: PostgreSQL 9.6 selects a wrong plan during aggregationagainst timestamp columns - Mailing list pgsql-bugs
| From | sogawa@yandex.ru |
|---|---|
| Subject | [BUGS] BUG #14780: PostgreSQL 9.6 selects a wrong plan during aggregationagainst timestamp columns |
| Date | |
| Msg-id | 20170816014704.27360.35941@wrigleys.postgresql.org Whole thread Raw |
| Responses |
[BUGS] Re: BUG #14780: PostgreSQL 9.6 selects a wrong plan duringaggregation against timestamp columns
|
| List | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 14780
Logged by: sogawa-sps
Email address: sogawa@yandex.ru
PostgreSQL version: 9.6.2
Operating system: Windows 10
Description:
PostgreSQL 9.6 selects a wrong plan during aggregation against indexed
timestamp columns while it's ok for other types.
Given: table “log” that has three columns: user_id, day, hours.
user_id character varying(36) COLLATE pg_catalog."default" NOT NULL, day timestamp without time zone, hours
doubleprecision
All columns have indexes.
The issue is that aggregation against the 'day' field works extremely slow
because makes a full scan filtering the entries that doesn’t relate to
user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf'
select min(day) from log where user_id =
'ab056f5a-390b-41d7-ba56-897c14b679bf'
[ { "Execution Time": 146502.05, "Planning Time": 0.893, "Plan": { "Startup Cost":
789.02, "Actual Rows": 1, "Plans": [ { "Startup Cost": 0.44, "Actual
Rows":1, "Plans": [ { "Index Cond": "(log.day IS NOT NULL)",
"StartupCost": 0.44, "Scan Direction": "Forward", "Plan Width": 8,
"RowsRemoved by Index Recheck": 0, "Actual Rows": 1, "Node Type": "Index Scan",
"Total Cost": 1395792.54, "Plan Rows": 1770, "Relation Name": "log",
"Alias": "log", "Parallel Aware": false, "Actual Total Time": 146502.015,
"Output": [ "log.day" ], "Parent Relationship": "Outer",
"Actual Startup Time": 146502.015, "Schema": "public", "Filter":
"((log.user_id)::text=
'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)", "Actual Loops": 1, "Rows Removed by
Filter":12665610, "Index Name": "index_log_day" } ], "Node Type":
"Limit", "Plan Rows": 1, "Parallel Aware": false, "Actual Total Time": 146502.016,
"Output": [ "log.day" ], "Parent Relationship": "InitPlan",
"ActualStartup Time": 146502.016, "Plan Width": 8, "Subplan Name": "InitPlan 1 (returns $0)",
"Actual Loops": 1, "Total Cost": 789.02 } ], "Node Type": "Result",
"Plan Rows": 1, "Parallel Aware": false, "Actual Total Time": 146502.019, "Output": [
"$0" ], "Actual Startup Time": 146502.019, "Plan Width": 8, "Actual Loops": 1,
"TotalCost": 789.03 }, "Triggers": [] } ]
However the almost similar query but for the double type has a correct.
select min(hours) from log where user_id =
'ab056f5a-390b-41d7-ba56-897c14b679bf'
Server selects entries for user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf'
first and then aggregates among them what is correct
[ { "Execution Time": 5.989, "Planning Time": 1.186, "Plan": { "Partial Mode":
"Simple", "Startup Cost": 6842.66, "Actual Rows": 1, "Plans": [ { "Startup
Cost":66.28, "Plan Width": 8, "Rows Removed by Index Recheck": 0, "Actual Rows":
745, "Plans": [ { "Startup Cost": 0, "Plan Width": 0,
"Actual Rows": 745, "Node Type": "Bitmap Index Scan", "Index Cond":
"((log.user_id)::text=
'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)", "Plan Rows": 1770, "Parallel Aware":
false, "Actual Total Time": 0.25, "Parent Relationship": "Outer",
"ActualStartup Time": 0.25, "Total Cost": 65.84, "Actual Loops": 1,
"IndexName": "index_log_user_id" } ], "Recheck Cond": "((log.user_id)::text =
'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)", "Exact Heap Blocks": 742, "Node Type": "Bitmap
HeapScan", "Plan Rows": 1770, "Relation Name": "log", "Alias": "log",
"ParallelAware": false, "Actual Total Time": 5.793, "Output": [ "day",
"hours", "user_id" ], "Lossy Heap Blocks": 0, "Parent
Relationship":"Outer", "Actual Startup Time": 0.357, "Total Cost": 6838.23, "Actual
Loops":1, "Schema": "public" } ], "Node Type": "Aggregate", "Strategy":
"Plain", "Plan Rows": 1, "Parallel Aware": false, "Actual Total Time": 5.946, "Output":
[ "min(hours)" ], "Actual Startup Time": 5.946, "Plan Width": 8, "Actual
Loops":1, "Total Cost": 6842.67 }, "Triggers": [] } ]
Optimizer have to select correct plan for the timestamp fields like it does
for double.
WA: Rewrite query into:
select user_id, min(day) from log where user_id =
'ac43a155-4fbb-49eb-a670-02c307eb3d4f' group by user_id
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
pgsql-bugs by date: