Thread: jsonb Indexing
Hello, I'm new to jsonb feature of postgres. I have PostgreSQL 13.2 running locally in docker. I've been playing a little bit with jsonb indexes described https://www.postgresql.org/docs/13/datatype-json.html#JSON-INDEXING and I'm not able make the GIN index work on 'tags'. I was wondering what I'm doing wrong. There are steps what I've tried: CREATE TABLE api ( jdoc jsonb ); INSERT INTO api (jdoc) VALUES ('{ "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a", "name": "Angela Barton", "is_active": true, "company": "Magnafone", "address": "178 Howard Place, Gulf, Washington, 702", "registered": "2009-11-07T08:53:22 +08:00", "latitude": 19.793713, "longitude": 86.513373, "tags": [ "enim", "aliquip", "qui" ] }'); CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags')); EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> 'tags' ? 'qui'; And the result is Seq Scan on api (cost=0.00..1.02 rows=1 width=64) (actual time=0.019..0.021 rows=1 loops=1) Filter: ((jdoc -> 'tags'::text) ? 'qui'::text) Planning Time: 0.115 ms Execution Time: 0.047 ms Do you know why Index Scan on idxgintag is not used? Thank you, Vlasta
Hi, On Fri, Sep 17, 2021 at 9:55 PM <ramikvl@gmail.com> wrote: > > I was wondering what I'm doing wrong. There are steps what I've tried: > > CREATE TABLE api ( > jdoc jsonb > ); > > INSERT INTO api (jdoc) > VALUES ('{ > "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a", > "name": "Angela Barton", > "is_active": true, > "company": "Magnafone", > "address": "178 Howard Place, Gulf, Washington, 702", > "registered": "2009-11-07T08:53:22 +08:00", > "latitude": 19.793713, > "longitude": 86.513373, > "tags": [ > "enim", > "aliquip", > "qui" > ] > }'); > > CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags')); > > EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> > 'tags' ? 'qui'; > > And the result is > > Seq Scan on api (cost=0.00..1.02 rows=1 width=64) (actual > time=0.019..0.021 rows=1 loops=1) > Filter: ((jdoc -> 'tags'::text) ? 'qui'::text) > > Planning Time: 0.115 ms > > Execution Time: 0.047 ms > > Do you know why Index Scan on idxgintag is not used? Yes, because doing an index scan on a table containing a single row is an order or magnitude less efficient than simply doing a sequential scan. You should try to simulate something close to your production data to see something interesting.
Hello Julien, On 9/17/21 4:00 PM, Julien Rouhaud wrote: > Hi, > > On Fri, Sep 17, 2021 at 9:55 PM <ramikvl@gmail.com> wrote: >> I was wondering what I'm doing wrong. There are steps what I've tried: >> >> CREATE TABLE api ( >> jdoc jsonb >> ); >> >> INSERT INTO api (jdoc) >> VALUES ('{ >> "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a", >> "name": "Angela Barton", >> "is_active": true, >> "company": "Magnafone", >> "address": "178 Howard Place, Gulf, Washington, 702", >> "registered": "2009-11-07T08:53:22 +08:00", >> "latitude": 19.793713, >> "longitude": 86.513373, >> "tags": [ >> "enim", >> "aliquip", >> "qui" >> ] >> }'); >> >> CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags')); >> >> EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> >> 'tags' ? 'qui'; >> >> And the result is >> >> Seq Scan on api (cost=0.00..1.02 rows=1 width=64) (actual >> time=0.019..0.021 rows=1 loops=1) >> Filter: ((jdoc -> 'tags'::text) ? 'qui'::text) >> >> Planning Time: 0.115 ms >> >> Execution Time: 0.047 ms >> >> Do you know why Index Scan on idxgintag is not used? > Yes, because doing an index scan on a table containing a single row is > an order or magnitude less efficient than simply doing a sequential > scan. You should try to simulate something close to your production > data to see something interesting. Thank you for the tip. I've tried to generate more data. I have 2000 rows in the table but the query still uses sequential scan. Seq Scan on api (cost=0.00..131.00 rows=2000 width=64) (actual time=0.005..0.959 rows=2000 loops=1) Filter: ((jdoc -> 'tags'::text) ? 'qui'::text) Planning Time: 0.064 ms Execution Time: 1.027 ms Any thoughts?
On Mon, Sep 20, 2021 at 12:52:54PM +0200, ramikvl@gmail.com wrote: > Hello Julien, > > On 9/17/21 4:00 PM, Julien Rouhaud wrote: > > Hi, > > > > On Fri, Sep 17, 2021 at 9:55 PM <ramikvl@gmail.com> wrote: > > > I was wondering what I'm doing wrong. There are steps what I've tried: > > > > > > CREATE TABLE api ( > > > jdoc jsonb > > > ); > > > > > > INSERT INTO api (jdoc) > > > VALUES ('{ > > > "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a", > > > "name": "Angela Barton", > > > "is_active": true, > > > "company": "Magnafone", > > > "address": "178 Howard Place, Gulf, Washington, 702", > > > "registered": "2009-11-07T08:53:22 +08:00", > > > "latitude": 19.793713, > > > "longitude": 86.513373, > > > "tags": [ > > > "enim", > > > "aliquip", > > > "qui" > > > ] > > > }'); > > > > > > CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags')); > > > > > > EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> > > > 'tags' ? 'qui'; > > > > > > And the result is > > > > > > Seq Scan on api (cost=0.00..1.02 rows=1 width=64) (actual > > > time=0.019..0.021 rows=1 loops=1) > > > Filter: ((jdoc -> 'tags'::text) ? 'qui'::text) > > > > > > Planning Time: 0.115 ms > > > > > > Execution Time: 0.047 ms > > > > > > Do you know why Index Scan on idxgintag is not used? > > Yes, because doing an index scan on a table containing a single row is > > an order or magnitude less efficient than simply doing a sequential > > scan. You should try to simulate something close to your production > > data to see something interesting. > > Thank you for the tip. I've tried to generate more data. I have 2000 rows in > the table but the query still uses sequential scan. > > Seq Scan on api (cost=0.00..131.00 rows=2000 width=64) (actual > time=0.005..0.959 rows=2000 loops=1) > Filter: ((jdoc -> 'tags'::text) ? 'qui'::text) > Planning Time: 0.064 ms > Execution Time: 1.027 ms > > Any thoughts? The planner expects index selectivity around 1 (all the rows to be selected). btw, it was right (all the rows were selected). So, trying to select something by the index is just wasting time, compared to seq scan. > >
On 9/20/21 12:52 PM, ramikvl@gmail.com wrote: > Hello Julien, > > On 9/17/21 4:00 PM, Julien Rouhaud wrote: >> Hi, >> >> On Fri, Sep 17, 2021 at 9:55 PM <ramikvl@gmail.com> wrote: >>> I was wondering what I'm doing wrong. There are steps what I've tried: >>> >>> CREATE TABLE api ( >>> jdoc jsonb >>> ); >>> >>> INSERT INTO api (jdoc) >>> VALUES ('{ >>> "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a", >>> "name": "Angela Barton", >>> "is_active": true, >>> "company": "Magnafone", >>> "address": "178 Howard Place, Gulf, Washington, 702", >>> "registered": "2009-11-07T08:53:22 +08:00", >>> "latitude": 19.793713, >>> "longitude": 86.513373, >>> "tags": [ >>> "enim", >>> "aliquip", >>> "qui" >>> ] >>> }'); >>> >>> CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags')); >>> >>> EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE >>> jdoc -> >>> 'tags' ? 'qui'; >>> >>> And the result is >>> >>> Seq Scan on api (cost=0.00..1.02 rows=1 width=64) (actual >>> time=0.019..0.021 rows=1 loops=1) >>> Filter: ((jdoc -> 'tags'::text) ? 'qui'::text) >>> >>> Planning Time: 0.115 ms >>> >>> Execution Time: 0.047 ms >>> >>> Do you know why Index Scan on idxgintag is not used? >> Yes, because doing an index scan on a table containing a single row is >> an order or magnitude less efficient than simply doing a sequential >> scan. You should try to simulate something close to your production >> data to see something interesting. > > Thank you for the tip. I've tried to generate more data. I have 2000 > rows in the table but the query still uses sequential scan. > > Seq Scan on api (cost=0.00..131.00 rows=2000 width=64) (actual > time=0.005..0.959 rows=2000 loops=1) > Filter: ((jdoc -> 'tags'::text) ? 'qui'::text) > Planning Time: 0.064 ms > Execution Time: 1.027 ms > > Any thoughts? Strangely enough when I re-created the index it's working, now. I probably made a mistake. Thank you.