Re: jsonb status - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: jsonb status |
Date | |
Msg-id | 5327355D.4020509@dunslane.net |
In response to | Re: jsonb status (Peter Geoghegan <pg@heroku.com>) |
Responses | Re: jsonb status |
List | pgsql-hackers |
On 03/16/2014 04:10 AM, Peter Geoghegan wrote:
> On Thu, Mar 13, 2014 at 2:00 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> I'll be travelling a good bit of tomorrow (Friday), but I hope Peter has
>> finished by the time I am back on deck late tomorrow and that I am able to
>> commit this on Saturday.
> I asked Andrew to hold off on committing this today. It was agreed
> that we weren't quite ready, because there were one or two remaining
> bugs (since fixed), but also because I felt that it would be useful to
> first hear the opinions of more people before proceeding. I think that
> we're not that far from having something committed. Obviously I hope
> to get this into 9.4, and attach a lot of strategic importance to
> having the feature, which is why I made a large effort to help land
> it.
>
> Attached patch has a number of notable revisions. Throughout, it has
> been possible for anyone to follow our progress here:
> https://github.com/feodor/postgres/commits/jsonb_and_hstore
>
> * In general, the file jsonb_support.c (renamed to jsonb_utils.c) is
> vastly better commented, and has a much clearer structure. This was
> not something I did much with in the previous revision, and so it has
> been a definite focus of this one.
>
> * Hashing is refactored to not use CRC32 anymore. I felt this was a
> questionable method of hashing, both within jsonb_hash(), as well as
> the jsonb_hash_ops GIN operator class.
>
> * Dead code elimination.
>
> * I got around to fixing the memory leaks in B-Tree support function one.
>
> * Andrew added hstore_to_jsonb, hstore_to_jsonb_loose functions and a
> cast. One goal of this effort is to preserve a parallel set of
> facilities for the json and jsonb types, and that includes
> hstore-related features.
>
> * A fix from Alexander for the jsonb_hash_ops @>operator issue I
> complained about during the last submission was merged.
>
> * There is no longer any GiST opclass. That just leaves B-Tree, hash,
> GIN (default) and GIN jsonb_hash_ops opclasses.
>
> My outstanding concerns are:
>
> * Have we got things right with GIN indexing, containment semantics,
> etc? See my remarks in the patch, by grepping "contain" within
> jsonb_util.c. Is the GIN text storage serialization format appropriate
> and correct?
>
> * General design concerns. By far the largest source of these is the
> file jsonb_util.c.
>
> * Is the on-disk format that we propose to tie Postgres to as good as
> it could be?
>

I've been working through all the changes and fixes that Peter and others have made, and they look pretty good to me. There are a few mostly cosmetic changes I want to make, but nothing that would be worth holding up committing this for. I'm fairly keen to get this committed, get some buildfarm coverage and get more people playing with it and testing it. Like Peter, I would like to see more comments from people on the GIN support, especially.

The one outstanding significant question of substance I have is this: given the commit 5 days ago of provision for triConsistent functions for GIN opclasses, should we be adding these to the two GIN opclasses we are providing, and what should they look like? Again, this isn't an issue that I think needs to hold up committing what we have now.

Regarding Peter's last question, if we're not satisfied with the on-disk format proposed it would mean throwing the whole effort out and starting again. The only thing I have thought of as an alternative would be to store the structure and values separately rather than with values inline with the structure. That way you could have a hash of values more or less, which would eliminate redundancy of storage of things like object field names. But such a structure might well involve at least as much computational overhead as the current structure. And nobody's been saying all along "hold on, we can do better than this." So I'm pretty inclined to go with what we have.

cheers andrew
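
The alternative layout mentioned in the last paragraph (structure stored separately from values, with repeated field names kept only once) can be illustrated with a small sketch. This is not the jsonb on-disk format and not anything from the patch; the function names `split` and `join` and the `"o"`/`"a"` tags are hypothetical, chosen only to show the idea of a shared value table.

```python
def split(doc):
    """Separate a JSON-like document into (structure, value table).

    Hypothetical illustration only: every key and scalar is stored once
    in `table`, and the structure refers to it by position.
    """
    table, index = [], {}

    def intern(value):
        # Include the type name so that 1, 1.0 and True stay distinct.
        key = (type(value).__name__, value)
        if key not in index:
            index[key] = len(table)
            table.append(value)
        return index[key]

    def walk(node):
        if isinstance(node, dict):
            return {"o": [[intern(k), walk(v)] for k, v in node.items()]}
        if isinstance(node, list):
            return {"a": [walk(v) for v in node]}
        return intern(node)

    return walk(doc), table


def join(structure, table):
    """Rebuild the document; every key and scalar costs a table lookup."""
    if isinstance(structure, dict):
        if "o" in structure:
            return {table[k]: join(v, table) for k, v in structure["o"]}
        return [join(v, table) for v in structure["a"]]
    return table[structure]


doc = {"items": [{"name": "a", "qty": 1}, {"name": "b", "qty": 1}]}
structure, table = split(doc)
```

In this toy version the field names `name` and `qty` appear once in `table` even though the document uses them twice, which is the redundancy saving described above. The cost is also visible: `join` has to go through the table for every key and value, which is the kind of extra computational overhead the message is wary of.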