Re: Unable to make use of "deep" JSONB index - Mailing list pgsql-bugs
From | Shaheed Haque |
---|---|
Subject | Re: Unable to make use of "deep" JSONB index |
Date | |
Msg-id | CAHAc2jfokgv5GZ-eLrAJ8G52gLqSRy_VR-s0R90uJxui-YSQsQ@mail.gmail.com Whole thread Raw |
In response to | Re: Unable to make use of "deep" JSONB index (Shaheed Haque <shaheedhaque@gmail.com>) |
Responses |
Re: Unable to make use of "deep" JSONB index
Re: Unable to make use of "deep" JSONB index |
List | pgsql-bugs |
OK, I have corrected and simplified the test case (including switching to a btree index). The WHERE clause and the inex now look like this: ...WHERE ((snapshot -> 'employee' -> '999' ->> 'pay_graph')::integer != 0); ...USING btree (((snapshot -> 'employee' -> '$.*' ->> 'pay_graph')::integer != 0)); But the index is still not being used (test case below). I have confirmed that the equality operator is listed for "search" (I assume inequality is the same as equality, but I tried both): # \dAo btree jsonb* List of operators of operator families AM | Operator family | Operator | Strategy | Purpose -------+-----------------+-----------------+----------+--------- btree | jsonb_ops | <(jsonb,jsonb) | 1 | search btree | jsonb_ops | <=(jsonb,jsonb) | 2 | search btree | jsonb_ops | =(jsonb,jsonb) | 3 | search btree | jsonb_ops | >=(jsonb,jsonb) | 4 | search btree | jsonb_ops | >(jsonb,jsonb) | 5 | search (5 rows) If this is not a bug, then how should the query or the index be changed to make this work? === begin test case === CREATE TABLE payrun ( id serial primary key, snapshot JSONB ); INSERT INTO payrun(snapshot) VALUES ('{"employee": {"999": {"id": 999, "state": {"employment": [["1920-01-01", null, 3]]}, "pay_graph": 0, "last_run_of_employment": false}, "111": {"id": 111, "state": {"employment": [["1920-01-01", null, 5]]}, "pay_graph": 3, "last_run_of_employment": true}}}'), ('{"employee": {"999": {"id": 999, "state": {"employment": [["1970-01-01", null, 3]]}, "pay_graph": 6, "last_run_of_employment": true}, "222": {"id": 222, "state": {"employment": [["1920-01-01", null, 5]]}, "pay_graph": 5, "last_run_of_employment": true}}}'), ('{"employee": {"998": {"id": 998, "state": {"employment": [["1980-01-01", null, 3]]}, "pay_graph": 7, "last_run_of_employment": false}, "333": {"id": 333, "state": {"employment": [["1920-01-01", null, 5]]}, "pay_graph": 3, "last_run_of_employment": true}}}') ; SELECT id,snapshot #>'{employee,999,state,employment}' FROM "payrun" WHERE ((snapshot -> 'employee' -> '999' ->> 'pay_graph')::integer != 0); -- -- Create index designed to match the query. -- CREATE INDEX idx1 ON payrun USING btree (((snapshot -> 'employee' -> '$.*' ->> 'pay_graph')::integer != 0)); set enable_seqscan = OFF; -- -- EXPLAIN ANALYZE ...query above... -- explain analyze SELECT id,snapshot #>'{employee,999,state,employment}' FROM "payrun" WHERE ((snapshot -> 'employee' -> '999' ->> 'pay_graph')::integer != 0); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Seq Scan on payrun (cost=10000000000.00..10000000001.08 rows=2 width=36) (actual time=70.051..70.052 rows=1 loops=1) Filter: (((((snapshot -> 'employee'::text) -> '999'::text) ->> 'pay_graph'::text))::integer <> 0) Rows Removed by Filter: 2 Planning Time: 0.147 ms JIT: Functions: 4 Options: Inlining true, Optimization true, Expressions true, Deforming true Timing: Generation 0.354 ms, Inlining 8.305 ms, Optimization 49.237 ms, Emission 12.499 ms, Total 70.395 ms Execution Time: 70.428 ms (9 rows) === end test case === Thanks, Shaheed On Thu, 2 Jun 2022 at 16:51, Shaheed Haque <shaheedhaque@gmail.com> wrote: > > On Thu, 2 Jun 2022 at 15:31, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Shaheed Haque <shaheedhaque@gmail.com> writes: > > > -- Create index designed to match the query. > > > -- > > > create index idx1 on payrun using gin ((snapshot->'$.employee.* ? > > > (@.pay_graph <> 0 || @.last_run_of_employment == true || > > > @.state.employment[last][2] == 0)')); > > > > But that doesn't match the query; it's not even the same topmost > > operator: > > > > > explain analyse SELECT id,snapshot #>'{employee,999,state,employment}' FROM > > > "payrun" WHERE (snapshot @? '$.employee."999" ? (@.pay_graph <> 0 || > > > @.last_run_of_employment > > > == true || @.state.employment[last][2] == 0)'); > > I assume you are referring to the difference between "snapshot @?" and > "snapshot ->"? If so, apologies: too much cutting and pasting from too > many experiments. I did in fact also try the "using btree ((snapshot > @?" form but it gave the same results. > > > In general you seem to have much too high an opinion of what PG's > > index machinery can cope with. The general pattern is that it can > > use a query WHERE clause with an index if the clause is of the form > > "indexed-column indexable-operator constant". There's a small number > > of special cases where it can transform things that don't initially > > look like that into the right form, but AFAIR we don't have any > > such special cases for any json-related operators. > > LOL. I'm pretty much a noob here, so that's very possible. > > > The one saving grace is that "indexed-column" can be an expression > > appearing in an index, so in some cases you can finesse things > > that way. But you won't find any deep knowledge of jsonpath > > expressions in there. > > I was basing my efforts on this statement in the docs > https://www.postgresql.org/docs/14/datatype-json.html#JSON-INDEXING: > > GIN index extracts statements of following form out of jsonpath: > accessors_chain = const. Accessors chain may consist of .key, [*], and > [index] accessors. jsonb_ops additionally supports .* and .** > accessors. > > Did I mis-implement, misunderstand or read too much into this? > > > Having said that, @? is reported as an indexable operator in v14: > > > > regression=# \dAo gin jsonb* > > List of operators of operator families > > AM | Operator family | Operator | Strategy | Purpose > > -----+-----------------+--------------------+----------+--------- > > gin | jsonb_ops | @>(jsonb,jsonb) | 7 | search > > gin | jsonb_ops | @?(jsonb,jsonpath) | 15 | search > > gin | jsonb_ops | @@(jsonb,jsonpath) | 16 | search > > gin | jsonb_ops | ?(jsonb,text) | 9 | search > > gin | jsonb_ops | ?|(jsonb,text[]) | 10 | search > > gin | jsonb_ops | ?&(jsonb,text[]) | 11 | search > > gin | jsonb_path_ops | @>(jsonb,jsonb) | 7 | search > > gin | jsonb_path_ops | @?(jsonb,jsonpath) | 15 | search > > gin | jsonb_path_ops | @@(jsonb,jsonpath) | 16 | search > > (9 rows) > > > > so it seems like you ought to get some benefit for this query > > from just a plain GIN index on "snapshot". > > Interesting. I'm pretty sure I started there a few days ago without > any luck but I'll give it another spin (having learnt quite a bit > since then). > > > > > regards, tom lane
pgsql-bugs by date: