Re: Unable to make use of "deep" JSONB index - Mailing list pgsql-bugs
From | Shaheed Haque |
---|---|
Subject | Re: Unable to make use of "deep" JSONB index |
Date | |
Msg-id | CAHAc2jfQvVMUKQ4rZ16G3Nsn+Rn-5_NFabzXnJ+q41OXmH+pOg@mail.gmail.com Whole thread Raw |
In response to | Re: Unable to make use of "deep" JSONB index (Erik Rijkers <er@xs4all.nl>) |
Responses |
Re: Unable to make use of "deep" JSONB index
Re: Unable to make use of "deep" JSONB index |
List | pgsql-bugs |
(Resend, wrong version was sent before) Thanks Erik. Is the point that the index has to be on the JSON field as a whole (i.e. "snapshot") rather than deep inside it (e.g. "snapshot.something.further[down]")? In my case, the snapshot is several MB in size (perhaps 10MB or even 20MB), dominated by the snapshot.employee (cardinality 10k, each sized as a dict 1-2kB as text). My expectation/guess is that an index of "snapshot" will itself be of a size of similar order. However the design as-is works very well except for this one case where to speed it up, in principle, the index need contain no more than one boolean per employee. So that's what I'd like to achieve, if possible. I've seen no hint in the documentation that creating the index on "snapshot.something.further[down]" should not work, and PG certainly allows it to be created. Also, Tom has suggested that I should not look to some magical ability to infer the use of the index from a differently structured query, and I've taken that on board with the updated query + index. AFAIK, there are 3 possibilities: - I've not done things right, in which case I'd love to know my mistake. - It is not supposed to work, in which case it would be good to have that stated, and maybe have PG not allow useless indices to be created. - It is a bug. All input much appreciated, Thanks, Shaheed On Sun, 12 Jun 2022 at 11:53, Erik Rijkers <er@xs4all.nl> wrote: > > Op 12-06-2022 om 11:34 schreef Shaheed Haque: > > OK, I have corrected and simplified the test case (including switching > > to a btree index). The WHERE clause and the inex now look like this: > > > > ...WHERE ((snapshot -> 'employee' -> '999' ->> > > 'pay_graph')::integer != 0); > > ...USING btree (((snapshot -> 'employee' -> '$.*' ->> > > 'pay_graph')::integer != 0)); > > > > But the index is still not being used (test case below). I have > > confirmed that the equality operator is listed for "search" (I assume > > inequality is the same as equality, but I tried both): > > > > # \dAo btree jsonb* > > List of operators of operator families > > AM | Operator family | Operator | Strategy | Purpose > > -------+-----------------+-----------------+----------+--------- > > btree | jsonb_ops | <(jsonb,jsonb) | 1 | search > > btree | jsonb_ops | <=(jsonb,jsonb) | 2 | search > > btree | jsonb_ops | =(jsonb,jsonb) | 3 | search > > btree | jsonb_ops | >=(jsonb,jsonb) | 4 | search > > btree | jsonb_ops | >(jsonb,jsonb) | 5 | search > > (5 rows) > > > > If this is not a bug, then how should the query or the index be > > changed to make this work? > > > > > > === begin test case === > > > > CREATE TABLE payrun ( > > id serial primary key, > > snapshot JSONB > > ); > > > > INSERT INTO payrun(snapshot) > > VALUES > > ('{"employee": {"999": {"id": 999, "state": {"employment": > > [["1920-01-01", null, 3]]}, "pay_graph": 0, "last_run_of_employment": > > false}, "111": {"id": 111, "state": {"employment": [["1920-01-01", > > null, 5]]}, "pay_graph": 3, "last_run_of_employment": true}}}'), > > ('{"employee": {"999": {"id": 999, "state": {"employment": > > [["1970-01-01", null, 3]]}, "pay_graph": 6, "last_run_of_employment": > > true}, "222": {"id": 222, "state": {"employment": [["1920-01-01", > > null, 5]]}, "pay_graph": 5, "last_run_of_employment": true}}}'), > > ('{"employee": {"998": {"id": 998, "state": {"employment": > > [["1980-01-01", null, 3]]}, "pay_graph": 7, "last_run_of_employment": > > false}, "333": {"id": 333, "state": {"employment": [["1920-01-01", > > null, 5]]}, "pay_graph": 3, "last_run_of_employment": true}}}') > > ; > > > > SELECT id,snapshot #>'{employee,999,state,employment}' FROM "payrun" > > WHERE ((snapshot -> 'employee' -> '999' ->> 'pay_graph')::integer != > > 0); > > > > -- > > -- Create index designed to match the query. > > -- > > CREATE INDEX idx1 ON payrun USING btree (((snapshot -> 'employee' -> > > '$.*' ->> 'pay_graph')::integer != 0)); > > > > set enable_seqscan = OFF; > > > > How is this? > > I took the triple condition from your earlier email. > I did not use your index. > I added one index using gin jsonb_path_ops. > > create index payrun_jspathop_idx ON payrun using gin (snapshot > jsonb_path_ops); > set enable_seqscan = OFF; > \timing on > select id, snapshot #>'{employee,999,state,employment}' from payrun > where snapshot @? '$."employee"."999" ? > ( @.pay_graph <> 0 > || @.last_run_of_employment == true > || @.state.employment[last][2] == 0 > )'; > id | ?column? > ----+--------------------------- > 2 | [["1970-01-01", null, 3]] > (1 row) > > Time: 0.897 ms > explain analyze select id, snapshot #>'{employee,999,state,employment}' > from payrun > where snapshot @? '$."employee"."999" ? > ( @.pay_graph <> 0 > || @.last_run_of_employment == true > || @.state.employment[last][2] == 0 > )'; > > QUERY PLAN > > ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on payrun (cost=136.00..140.02 rows=1 width=36) > (actual time=0.018..0.019 rows=1 loops=1) > Recheck Cond: (snapshot @? '$."employee"."999"?((@."pay_graph" != 0 > || @."last_run_of_employment" == true) || > @."state"."employment"[last][2] == 0)'::jsonpath) > Rows Removed by Index Recheck: 2 > Heap Blocks: exact=1 > -> Bitmap Index Scan on payrun_jspathop_idx (cost=0.00..136.00 > rows=1 width=0) (actual time=0.007..0.007 rows=3 loops=1) > Index Cond: (snapshot @? '$."employee"."999"?((@."pay_graph" > != 0 || @."last_run_of_employment" == true) || > @."state"."employment"[last][2] == 0)'::jsonpath) > Planning Time: 0.034 ms > Execution Time: 0.033 ms > (8 rows) > > Time: 0.284 ms > > > hope that helps. > > Erik Rijkers
pgsql-bugs by date: