Thread: jsonb Indexing

jsonb Indexing

From

ramikvl@gmail.com

Date:

17 September 2021, 13:55:29

Hello,

I'm new to jsonb feature of postgres. I have PostgreSQL 13.2 running 
locally in docker.

I've been playing a little bit with jsonb indexes described 
https://www.postgresql.org/docs/13/datatype-json.html#JSON-INDEXING and 
I'm not able make the GIN index work on 'tags'.

I was wondering what I'm doing wrong. There are steps what I've tried:

CREATE TABLE api (
     jdoc jsonb
);

INSERT INTO api (jdoc)
     VALUES ('{
     "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
     "name": "Angela Barton",
     "is_active": true,
     "company": "Magnafone",
     "address": "178 Howard Place, Gulf, Washington, 702",
     "registered": "2009-11-07T08:53:22 +08:00",
     "latitude": 19.793713,
     "longitude": 86.513373,
     "tags": [
         "enim",
         "aliquip",
         "qui"
     ]
}');

CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags'));

EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> 
'tags' ? 'qui';

And the result is

Seq Scan on api  (cost=0.00..1.02 rows=1 width=64) (actual 
time=0.019..0.021 rows=1 loops=1)
   Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)

Planning Time: 0.115 ms

Execution Time: 0.047 ms

Do you know why Index Scan on idxgintag is not used?

Thank you,

Vlasta

Re: jsonb Indexing

From

Julien Rouhaud

Date:

17 September 2021, 14:00:15

Hi,

On Fri, Sep 17, 2021 at 9:55 PM <ramikvl@gmail.com> wrote:
>
> I was wondering what I'm doing wrong. There are steps what I've tried:
>
> CREATE TABLE api (
>      jdoc jsonb
> );
>
> INSERT INTO api (jdoc)
>      VALUES ('{
>      "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
>      "name": "Angela Barton",
>      "is_active": true,
>      "company": "Magnafone",
>      "address": "178 Howard Place, Gulf, Washington, 702",
>      "registered": "2009-11-07T08:53:22 +08:00",
>      "latitude": 19.793713,
>      "longitude": 86.513373,
>      "tags": [
>          "enim",
>          "aliquip",
>          "qui"
>      ]
> }');
>
> CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags'));
>
> EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc ->
> 'tags' ? 'qui';
>
> And the result is
>
> Seq Scan on api  (cost=0.00..1.02 rows=1 width=64) (actual
> time=0.019..0.021 rows=1 loops=1)
>    Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
>
> Planning Time: 0.115 ms
>
> Execution Time: 0.047 ms
>
> Do you know why Index Scan on idxgintag is not used?

Yes, because doing an index scan on a table containing a single row is
an order or magnitude less efficient than simply doing a sequential
scan.  You should try to simulate something close to your production
data to see something interesting.

Re: jsonb Indexing

From

ramikvl@gmail.com

Date:

20 September 2021, 10:52:54

Hello Julien,

On 9/17/21 4:00 PM, Julien Rouhaud wrote:
> Hi,
>
> On Fri, Sep 17, 2021 at 9:55 PM <ramikvl@gmail.com> wrote:
>> I was wondering what I'm doing wrong. There are steps what I've tried:
>>
>> CREATE TABLE api (
>>       jdoc jsonb
>> );
>>
>> INSERT INTO api (jdoc)
>>       VALUES ('{
>>       "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
>>       "name": "Angela Barton",
>>       "is_active": true,
>>       "company": "Magnafone",
>>       "address": "178 Howard Place, Gulf, Washington, 702",
>>       "registered": "2009-11-07T08:53:22 +08:00",
>>       "latitude": 19.793713,
>>       "longitude": 86.513373,
>>       "tags": [
>>           "enim",
>>           "aliquip",
>>           "qui"
>>       ]
>> }');
>>
>> CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags'));
>>
>> EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc ->
>> 'tags' ? 'qui';
>>
>> And the result is
>>
>> Seq Scan on api  (cost=0.00..1.02 rows=1 width=64) (actual
>> time=0.019..0.021 rows=1 loops=1)
>>     Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
>>
>> Planning Time: 0.115 ms
>>
>> Execution Time: 0.047 ms
>>
>> Do you know why Index Scan on idxgintag is not used?
> Yes, because doing an index scan on a table containing a single row is
> an order or magnitude less efficient than simply doing a sequential
> scan.  You should try to simulate something close to your production
> data to see something interesting.

Thank you for the tip. I've tried to generate more data. I have 2000 
rows in the table but the query still uses sequential scan.

Seq Scan on api  (cost=0.00..131.00 rows=2000 width=64) (actual 
time=0.005..0.959 rows=2000 loops=1)
   Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
Planning Time: 0.064 ms
Execution Time: 1.027 ms

Any thoughts?

Re: jsonb Indexing

From

Ilya Anfimov

Date:

20 September 2021, 11:24:43

On Mon, Sep 20, 2021 at 12:52:54PM +0200, ramikvl@gmail.com wrote:
> Hello Julien,
> 
> On 9/17/21 4:00 PM, Julien Rouhaud wrote:
> > Hi,
> > 
> > On Fri, Sep 17, 2021 at 9:55 PM <ramikvl@gmail.com> wrote:
> > > I was wondering what I'm doing wrong. There are steps what I've tried:
> > > 
> > > CREATE TABLE api (
> > >       jdoc jsonb
> > > );
> > > 
> > > INSERT INTO api (jdoc)
> > >       VALUES ('{
> > >       "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
> > >       "name": "Angela Barton",
> > >       "is_active": true,
> > >       "company": "Magnafone",
> > >       "address": "178 Howard Place, Gulf, Washington, 702",
> > >       "registered": "2009-11-07T08:53:22 +08:00",
> > >       "latitude": 19.793713,
> > >       "longitude": 86.513373,
> > >       "tags": [
> > >           "enim",
> > >           "aliquip",
> > >           "qui"
> > >       ]
> > > }');
> > > 
> > > CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags'));
> > > 
> > > EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc ->
> > > 'tags' ? 'qui';
> > > 
> > > And the result is
> > > 
> > > Seq Scan on api  (cost=0.00..1.02 rows=1 width=64) (actual
> > > time=0.019..0.021 rows=1 loops=1)
> > >     Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
> > > 
> > > Planning Time: 0.115 ms
> > > 
> > > Execution Time: 0.047 ms
> > > 
> > > Do you know why Index Scan on idxgintag is not used?
> > Yes, because doing an index scan on a table containing a single row is
> > an order or magnitude less efficient than simply doing a sequential
> > scan.  You should try to simulate something close to your production
> > data to see something interesting.
> 
> Thank you for the tip. I've tried to generate more data. I have 2000 rows in
> the table but the query still uses sequential scan.
> 
> Seq Scan on api  (cost=0.00..131.00 rows=2000 width=64) (actual
> time=0.005..0.959 rows=2000 loops=1)
>   Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
> Planning Time: 0.064 ms
> Execution Time: 1.027 ms
> 
> Any thoughts?

 The planner expects index selectivity around 1 (all the rows to be selected).
 btw, it was right (all the rows were selected).

 So, trying to select something by the index is just wasting time,
compared to seq scan.

> 
>

Re: jsonb Indexing

From

ramikvl@gmail.com

Date:

20 September 2021, 11:46:07

On 9/20/21 12:52 PM, ramikvl@gmail.com wrote:
> Hello Julien,
>
> On 9/17/21 4:00 PM, Julien Rouhaud wrote:
>> Hi,
>>
>> On Fri, Sep 17, 2021 at 9:55 PM <ramikvl@gmail.com> wrote:
>>> I was wondering what I'm doing wrong. There are steps what I've tried:
>>>
>>> CREATE TABLE api (
>>>       jdoc jsonb
>>> );
>>>
>>> INSERT INTO api (jdoc)
>>>       VALUES ('{
>>>       "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
>>>       "name": "Angela Barton",
>>>       "is_active": true,
>>>       "company": "Magnafone",
>>>       "address": "178 Howard Place, Gulf, Washington, 702",
>>>       "registered": "2009-11-07T08:53:22 +08:00",
>>>       "latitude": 19.793713,
>>>       "longitude": 86.513373,
>>>       "tags": [
>>>           "enim",
>>>           "aliquip",
>>>           "qui"
>>>       ]
>>> }');
>>>
>>> CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags'));
>>>
>>> EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE 
>>> jdoc ->
>>> 'tags' ? 'qui';
>>>
>>> And the result is
>>>
>>> Seq Scan on api  (cost=0.00..1.02 rows=1 width=64) (actual
>>> time=0.019..0.021 rows=1 loops=1)
>>>     Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
>>>
>>> Planning Time: 0.115 ms
>>>
>>> Execution Time: 0.047 ms
>>>
>>> Do you know why Index Scan on idxgintag is not used?
>> Yes, because doing an index scan on a table containing a single row is
>> an order or magnitude less efficient than simply doing a sequential
>> scan.  You should try to simulate something close to your production
>> data to see something interesting.
>
> Thank you for the tip. I've tried to generate more data. I have 2000 
> rows in the table but the query still uses sequential scan.
>
> Seq Scan on api  (cost=0.00..131.00 rows=2000 width=64) (actual 
> time=0.005..0.959 rows=2000 loops=1)
>   Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
> Planning Time: 0.064 ms
> Execution Time: 1.027 ms
>
> Any thoughts?

Strangely enough when I re-created the index it's working, now. I 
probably made a mistake.

Thank you.