Re: Indexing on JSONB field not working - Mailing list pgsql-bugs
| From | Zhihong Zhang |
|---|---|
| Subject | Re: Indexing on JSONB field not working |
| Date | |
| Msg-id | 0AA828F3-5BA5-48CD-B229-8351ACB308D3@gmail.com Whole thread Raw |
| In response to | Re: Indexing on JSONB field not working (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses |
RE: Indexing on JSONB field not working
|
| List | pgsql-bugs |
Just came back from the long break and I couldn’t quite follow the threads. Is this a bug or something I am doing
wrong?
If it’s a bug, can I have a tracking or ticket number?
Thanks!
Zhihong
> On Dec 21, 2019, at 10:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Fri, Dec 20, 2019 at 08:15:39PM -0500, Zhihong Zhang wrote:
>>> "Limit (cost=0.43..313.25 rows=100 width=53) (actual time=0.092..0.236 rows=7 loops=1)"
>>> " -> Index Scan using assets_floatvalue_idx on assets (cost=0.43..2714072.57 rows=867607 width=53) (actual
time=0.089..0.230rows=7 loops=1)"
>>> " Index Cond: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)"
>
>> Well, this confirms what I suspected before - the optimizer believes the
>> seqscan plan is a bit cheaper compared to index scan (107 vs. 313) but
>> that mismatches the actual performance. The question is why ...
>
> The planner evidently believes that 867607 rows will match the query
> condition, so it expects that the scan will stop (after collecting
> 100 rows) very quickly. In reality only 7 rows match, so the scan
> has to run to completion. This is what's bollixing the plan choice.
>
> I suspect that 867607 is just a default estimate, but if ANALYZE has
> been run then there should be stats for the index column, so why isn't
> it doing better? When I try a similar case here, I get good estimates:
>
> regression=# select version();
> version
> ---------------------------------------------------------------------------------------------------------
> PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23), 64-bit
> (1 row)
>
> regression=# create table public.assets(_doc jsonb);
> CREATE TABLE
> regression=# CREATE INDEX on assets (((_doc #> '{floatValue}'::text[])::double precision) ASC NULLS LAST);
> CREATE INDEX
> regression=# insert into assets select ('{"floatValue": ' || x || '}')::jsonb from generate_series(1,10000) x;
> INSERT 0 10000
> regression=# explain select _doc->>'floatValue' from assets where (_doc #> '{floatValue}'::text[])::double precision
<3.0 limit 100;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------
> Limit (cost=0.00..7.71 rows=100 width=32)
> -> Seq Scan on assets (cost=0.00..293.44 rows=3808 width=32)
> Filter: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)
> (3 rows)
>
> regression=# analyze assets;
> ANALYZE
> regression=# explain select _doc->>'floatValue' from assets where (_doc #> '{floatValue}'::text[])::double precision
<3.0 limit 100;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------
> Limit (cost=0.29..8.33 rows=2 width=32)
> -> Index Scan using assets_float8_idx on assets (cost=0.29..8.33 rows=2 width=32)
> Index Cond: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)
> (3 rows)
>
> The "3808" estimate is just a default for '<' with no stats, but
> with stats I get a dead-on estimate.
>
> regards, tom lane
pgsql-bugs by date: