Re: jsonb concatenate operator's semantics seem questionable - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: jsonb concatenate operator's semantics seem questionable |
Date | |
Msg-id | 55589FD6.8050304@dunslane.net Whole thread Raw |
In response to | jsonb concatenate operator's semantics seem questionable (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: jsonb concatenate operator's semantics seem questionable
|
List | pgsql-hackers |
On 05/16/2015 10:56 PM, Peter Geoghegan wrote: > Another thing that I noticed about the new jsonb stuff is that the > concatenate operator is based on the hstore one. This works as > expected: > > postgres=# select '{"a":1}'::jsonb || '{"a":2}'; > ?column? > ---------- > {"a": 2} > (1 row) > > However, the nesting doesn't "match up" -- containers are not merged > beyond the least-nested level: > > postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also nested":2}}'; > ?column? > --------------------------- > {"a": {"also nested": 2}} > (1 row) > > This feels wrong to me. When jsonb was initially introduced, we took > inspiration for the *containment* ("operator @> jsonb") semantics from > hstore, but since jsonb is nested it worked in a nested fashion. At > the top level and with no nested containers there was no real > difference, but we had to consider the behavior of more nested levels > carefully (the containment operator is clearly the most important > jsonb operator). I had envisaged that with the concatenation of jsonb, > concatenation would similarly behave in a nested fashion. Under this > scheme, the above query would perform nested concatenation as follows: > > postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also > nested":2}}'; -- does not match actual current behavior > ?column? > --------------------------- > {"a": {"nested":1, "also nested": 2}} > (1 row) > > Now, I think it's good that the minus operator ("operator - text" and > friends) discussed on the nearby thread accepts a text (or int) > argument and remove string elements/pairs at the top level only. This > works exactly the same as existence (I happen to think that removing > elements/pairs at a nested level is likely to be more trouble than > it's worth, and so I don't really like the new "jsonb - text[]" > operator much, because it accepts a Postgres (not JSON) array of texts > that constitute a path, which feels odd). So I have no issue with at > least the plain minus operators' semantics. But I think that the > concatenate operator's current semantics are significantly less useful > than they could be, and are not consistent with the overall design of > jsonb. > > I'm particularly concerned about a table containing many homogeneously > structured, deeply nested jsonb datums (think of the delicious URLs > dataset that jsonb was originally tested using for a good example of > that -- this is quite representative of how people use jsonb in the > real world). It would be almost impossible to perform insert-or-update > type operations to these deeply nested elements using hstore style > concatenation. You'd almost invariably end up removing a bunch of > irrelevant nested values of the documents, when you only intended to > update one deeply nested value. > > Looking back at the discussion of the new jsonb stuff, a concern was > raised along these lines by Ilya Ashchepkov [1], but this was > dismissed. I feel pretty strongly that this should be revisited. I'm > willing to concede that we might not want to always merge containers > that are found in the same position during concatenation, but I think > it's more likely that we do. As with containment, my sense is that > there should be nothing special about the nesting level -- it should > not influence whether we merge rather than overwrite the operator's > lhs container (with or into the rhs container). Not everyone will > agree with this [2]. > > I'm sorry that I didn't get to this sooner, but I was rather busy when > it was being discussed. > > [1] http://www.postgresql.org/message-id/55006879.2050601@dunslane.net > [2] http://www.postgresql.org/message-id/54EF61DD.7040208@agliodbs.com Historical note: I think it's based on the nested hstore work, not on current hstore, but Dmitry can answer on that. I didn't dismiss this because it was a bad idea, but because it was too late in the process. If there is a consensus that we need to address this now then I'm happy to reopen that, but given the recent amount of angst about process I'm certainly not going to make such a decision unilaterally. Personally, I think there is plenty of room for both operations, and I can see use cases for both. If I were designing I'd leave || as it is now and add a + operation to do a recursive merge. I'm not sure how much work that would be. Not huge but not trivial either. cheers andrew
pgsql-hackers by date: