Re: querying with index on jsonb slower than standard column. Why? - Mailing list pgsql-sql
From | Adrian Klaver |
---|---|
Subject | Re: querying with index on jsonb slower than standard column. Why? |
Date | |
Msg-id | 5485C449.4020204@aklaver.com Whole thread Raw |
In response to | Re: querying with index on jsonb slower than standard column. Why? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: querying with index on jsonb slower than standard column. Why?
|
List | pgsql-sql |
On 12/07/2014 05:28 PM, Tom Lane wrote: > Tim Dudgeon <tdudgeon.ml@gmail.com> writes: >> The index created is not a gin index. Its a standard btree index on the >> data extracted from the json. So the indexes on the standard columns and >> the ones on the 'fields' extracted from the json seem to be equivalent. >> But perform differently. > > I don't see any particular difference ... > > regression=# explain analyze select count(*) from json_test where (data->>'assay1_ic50')::float > 90 > and (data->>'assay2_ic50')::float < 10; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=341613.79..341613.80 rows=1 width=0) (actual time=901.207..901.208 rows=1 loops=1) > -> Bitmap Heap Scan on json_test (cost=123684.69..338836.02 rows=1111111 width=0) (actual time=497.982..887.128 rows=100690loops=1) > Recheck Cond: ((((data ->> 'assay2_ic50'::text))::double precision < 10::double precision) AND (((data ->> 'assay1_ic50'::text))::doubleprecision > 90::double precision)) > Heap Blocks: exact=77578 > -> BitmapAnd (cost=123684.69..123684.69 rows=1111111 width=0) (actual time=476.585..476.585 rows=0 loops=1) > -> Bitmap Index Scan on idx_data_json_assay2_ic50 (cost=0.00..61564.44 rows=3333333 width=0) (actualtime=219.287..219.287 rows=999795 loops=1) > Index Cond: (((data ->> 'assay2_ic50'::text))::double precision < 10::double precision) > -> Bitmap Index Scan on idx_data_json_assay1_ic50 (cost=0.00..61564.44 rows=3333333 width=0) (actualtime=208.197..208.197 rows=1000231 loops=1) > Index Cond: (((data ->> 'assay1_ic50'::text))::double precision > 90::double precision) > Planning time: 0.128 ms > Execution time: 904.196 ms > (11 rows) > > regression=# explain analyze select count(*) from json_test where assay1_ic50 > 90 and assay2_ic50 < 10; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=197251.24..197251.25 rows=1 width=0) (actual time=895.238..895.238 rows=1 loops=1) > -> Bitmap Heap Scan on json_test (cost=36847.25..197003.24 rows=99197 width=0) (actual time=495.427..881.033 rows=100690loops=1) > Recheck Cond: ((assay2_ic50 < 10::double precision) AND (assay1_ic50 > 90::double precision)) > Heap Blocks: exact=77578 > -> BitmapAnd (cost=36847.25..36847.25 rows=99197 width=0) (actual time=474.201..474.201 rows=0 loops=1) > -> Bitmap Index Scan on idx_data_col_assay2_ic50 (cost=0.00..18203.19 rows=985434 width=0) (actual time=219.060..219.060rows=999795 loops=1) > Index Cond: (assay2_ic50 < 10::double precision) > -> Bitmap Index Scan on idx_data_col_assay1_ic50 (cost=0.00..18594.21 rows=1006637 width=0) (actual time=206.066..206.066rows=1000231 loops=1) > Index Cond: (assay1_ic50 > 90::double precision) > Planning time: 0.129 ms > Execution time: 898.237 ms > (11 rows) > > regression=# \timing > Timing is on. > regression=# select count(*) from json_test where (data->>'assay1_ic50')::float > 90 > and (data->>'assay2_ic50')::float < 10; > count > -------- > 100690 > (1 row) > > Time: 882.607 ms > regression=# select count(*) from json_test where assay1_ic50 > 90 and assay2_ic50 < 10; > count > -------- > 100690 > (1 row) > > Time: 881.071 ms > > regards, tom lane > > Running the above on my machine I do see the slow down the OP reports. I ran it several times and it stayed around 3.5x. It might be interesting to get the OS and architecture information from the OP. test=# select version(); version ------------------------------------------------------------------------------------------------------------------------------ PostgreSQL 9.4rc1 on i686-pc-linux-gnu, compiled by gcc (SUSE Linux)4.8.1 20130909 [gcc-4_8-branch revision 202388], 32-bit (1 row) test=# \timing Timing is on. test=# select count(*) from json_test where (data->>'assay1_ic50')::float > 90 test-# and (data->>'assay2_ic50')::float < 10;count -------99288 (1 row) Time: 9092.966 ms test=# select count(*) from json_test where assay1_ic50 > 90 and assay2_ic50 < 10;count -------99288 (1 row) Time: 2542.294 ms explain analyze select count(*) from json_test where (data->>'assay1_ic50')::float > 90 and (data->>'assay2_ic50')::float < 10; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Aggregate (cost=332209.79..332209.80 rows=1 width=0) (actual time=8980.009..8980.009 rows=1 loops=1) -> Bitmap Heap Scan on json_test (cost=123684.69..329432.02 rows=1111111 width=0) (actual time=538.688..8960.308 rows=99288 loops=1) RecheckCond: ((((data ->> 'assay2_ic50'::text))::double precision < 10::double precision) AND (((data ->> 'assay1_ic50'::text))::doubleprecision > 90::double precision)) Rows Removed by Index Recheck: 7588045 HeapBlocks: exact=20894 lossy=131886 -> BitmapAnd (cost=123684.69..123684.69 rows=1111111 width=0) (actual time=531.066..531.066rows=0 loops=1) -> Bitmap Index Scan on idx_data_json_assay2_ic50 (cost=0.00..61564.44rows=3333333 width=0) (actual time=258.717..258.717 rows=998690 loops=1) Index Cond:(((data ->> 'assay2_ic50'::text))::double precision < 10::double precision) -> Bitmap Index Scan on idx_data_json_assay1_ic50 (cost=0.00..61564.44 rows=3333333 width=0) (actual time=251.664..251.664 rows=997880 loops=1) Index Cond: (((data ->> 'assay1_ic50'::text))::double precision > 90::double precision)Planning time: 0.391msExecution time: 8980.391 ms (12 rows) test=# explain analyze select count(*) from json_test where assay1_ic50 > 90 and assay2_ic50 < 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------Aggregate (cost=196566.38..196566.39 rows=1 width=0) (actual time=2609.545..2609.545 rows=1 loops=1) -> Bitmap Heap Scan on json_test (cost=37869.00..196304.39 rows=104796 width=0) (actual time=550.273..2590.093 rows=99288 loops=1) RecheckCond: ((assay2_ic50 < 10::double precision) AND (assay1_ic50 > 90::double precision)) Rows Removed by IndexRecheck: 7588045 Heap Blocks: exact=20894 lossy=131886 -> BitmapAnd (cost=37869.00..37869.00 rows=104796width=0) (actual time=542.666..542.666 rows=0 loops=1) -> Bitmap Index Scan on idx_data_col_assay2_ic50 (cost=0.00..18871.73 rows=1021773 width=0) (actual time=263.959..263.959 rows=998690 loops=1) Index Cond: (assay2_ic50 < 10::double precision) -> Bitmap Index Scan on idx_data_col_assay1_ic50 (cost=0.00..18944.62 rows=1025624 width=0) (actual time=257.912..257.912 rows=997880 loops=1) Index Cond: (assay1_ic50 > 90::double precision)Planning time: 0.834 msExecution time: 2609.960 ms (12 rows) -- Adrian Klaver adrian.klaver@aklaver.com