Optimizer misses big in 10.4 with BRIN index - Mailing list pgsql-hackers
From | Arcadiy Ivanov |
---|---|
Subject | Optimizer misses big in 10.4 with BRIN index |
Date | |
Msg-id | d25d6e62-6f9c-5d8f-f264-797fa1f8cf3b@gmail.com Whole thread Raw |
Responses |
Re: Optimizer misses big in 10.4 with BRIN index
|
List | pgsql-hackers |
Hi, Before spamming the list with reproduction examples I want to make sure the issue isn't already known. Moving to 10.4 from 9.2 (AWS RDS but repro on local laptop as well) we've discovered that an optimizer prefers a seq scan to fully analyzed consistent BRIN index, ending up with a query that is 4.8s long on seqscan vs 56ms when forcing use of BRIN (85 times difference). The size of the dataset is millions of rows and with extremely high probability the rows are naturally clustered on BRIN index column. Anybody observed anything like that? ======================================================== schema0=# SET enable_seqscan=false; SET schema0=# EXPLAIN (analyze, verbose, costs, buffers) SELECT data, count(*) OVER() AS full_count FROM schema0_lab.data_table WHERE segment = 'pb1' AND (data->>'tradeDate')::numeric >= '1531267200' AND (data->>'tradeDate')::numeric <= '1531353600' AND data->>'tradeStatus' = 'Replaced' ORDER BY (data->>'tradeDate')::numeric DESC; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=617851.03..617851.24 rows=84 width=1219) (actual time=55.765..55.794 rows=611 loops=1) Output: data, (count(*) OVER (?)), (((data ->> 'tradeDate'::text))::numeric) Sort Key: (((data_table.data ->> 'tradeDate'::text))::numeric) DESC Sort Method: quicksort Memory: 1256kB Buffers: shared hit=824 -> WindowAgg (cost=1231.98..617848.34 rows=84 width=1219) (actual time=52.688..55.068 rows=611 loops=1) Output: data, count(*) OVER (?), ((data ->> 'tradeDate'::text))::numeric Buffers: shared hit=824 -> Gather (cost=1231.98..617846.66 rows=84 width=1179) (actual time=8.247..51.804 rows=611 loops=1) Output: data Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=824 -> Parallel Bitmap Heap Scan on schema0_lab.data_table (cost=231.98..616838.26 rows=35 width=1179) (actual time=3.850..46.704 rows=204 loops=3) Output: data Recheck Cond: ((((data_table.data ->> 'tradeDate'::text))::numeric >= '1531267200'::numeric) AND (((data_table.data ->> 'tradeDate'::text))::numeric <= '1531353600'::numeric)) Rows Removed by Index Recheck: 4404 Filter: (((data_table.segment)::text = 'pb1'::text) AND ((data_table.data ->> 'tradeStatus'::text) = 'Replaced'::text)) Heap Blocks: lossy=794 Buffers: shared hit=2334 Worker 0: actual time=3.572..44.145 rows=236 loops=1 Buffers: shared hit=749 Worker 1: actual time=0.326..45.184 rows=212 loops=1 Buffers: shared hit=761 -> Bitmap Index Scan on tradedate_idx (cost=0.00..231.96 rows=3377106 width=0) (actual time=4.500..4.500 rows=23040 loops=1) Index Cond: ((((data_table.data ->> 'tradeDate'::text))::numeric >= '1531267200'::numeric) AND (((data_table.data ->> 'tradeDate'::text))::numeric <= '1531353600'::numeric)) Buffers: shared hit=30 Planning time: 0.246 ms Execution time: 56.209 ms (29 rows) schema0=# SET enable_seqscan=true; SET schema0=# EXPLAIN (analyze, verbose, costs, buffers) SELECT data, count(*) OVER() AS full_count FROM schema0_lab.data_table WHERE segment = 'pb1' AND (data->>'tradeDate')::numeric >= '1531267200' AND (data->>'tradeDate')::numeric <= '1531353600' AND data->>'tradeStatus' = 'Replaced' ORDER BY (data->>'tradeDate')::numeric DESC; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------- Sort (cost=617619.05..617619.26 rows=84 width=1219) (actual time=4823.081..4823.106 rows=611 loops=1) Output: data, (count(*) OVER (?)), (((data ->> 'tradeDate'::text))::numeric) Sort Key: (((data_table.data ->> 'tradeDate'::text))::numeric) DESC Sort Method: quicksort Memory: 1256kB Buffers: shared hit=839 read=187353 -> WindowAgg (cost=1000.00..617616.36 rows=84 width=1219) (actual time=4820.005..4822.390 rows=611 loops=1) Output: data, count(*) OVER (?), ((data ->> 'tradeDate'::text))::numeric Buffers: shared hit=839 read=187353 -> Gather (cost=1000.00..617614.68 rows=84 width=1179) (actual time=3.262..4819.362 rows=611 loops=1) Output: data Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=839 read=187353 -> Parallel Seq Scan on schema0_lab.data_table (cost=0.00..616606.28 rows=35 width=1179) (actual time=6.135..4814.826 rows=204 loops=3) Output: data Filter: (((data_table.segment)::text = 'pb1'::text) AND ((data_table.data ->> 'tradeStatus'::text) = 'Replaced'::text) AND (((data_table.data ->> 'tradeDate'::text))::numeric >= '1531267200'::numeric) AND (((data_table.data ->> 'tradeDate'::text))::numeric <= '1531353600'::numeric)) Rows Removed by Filter: 1125498 Buffers: shared hit=2400 read=561439 Worker 0: actual time=11.414..4812.744 rows=220 loops=1 Buffers: shared hit=775 read=187333 Worker 1: actual time=4.249..4813.264 rows=220 loops=1 Buffers: shared hit=786 read=186753 Planning time: 0.232 ms Execution time: 4823.412 ms (24 rows) schema0=# \d schema0_lab.data_table Table "schema0_lab.data_table" Column | Type | Collation | Nullable | Default ------------+------------------------+-----------+----------+-------------------------------------------------- id | integer | | not null | nextval('schema0_lab.data_table_id_seq'::regclass) address | character varying(128) | | not null | segment | character varying(128) | | not null | data | jsonb | | not null | Indexes: "data_table_pkey" PRIMARY KEY, btree (id) "tradedate_idx" brin (((data ->> 'tradeDate'::text)::numeric)) WITH (autosummarize='true') -- Arcadiy Ivanov arcadiy@gmail.com | @arcivanov | https://ivanov.biz https://github.com/arcivanov
pgsql-hackers by date: