Re: Proposal to use JSON for Postgres Parser format - Mailing list pgsql-hackers

From: Matthias van de Meent
Subject: Re: Proposal to use JSON for Postgres Parser format
Msg-id: CAEze2WgeznFH0yQVcLa+-oV6Uiq4MMGsrZ6vmr-Q0nyZg3dn-w@mail.gmail.com
In response to: Re: Proposal to use JSON for Postgres Parser format (Michel Pelletier <pelletier.michel@gmail.com>)
Responses: Re: Proposal to use JSON for Postgres Parser format
List: pgsql-hackers
On Mon, 31 Oct 2022 at 15:56, Michel Pelletier <pelletier.michel@gmail.com> wrote:
> On Mon, Oct 31, 2022 at 6:15 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>> On Mon, 31 Oct 2022 at 13:46, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>>> On Fri, Oct 28, 2022 at 4:27 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>>>> On 2022-10-27 Th 19:38, Andres Freund wrote:
>>>>> Hi,
>>>>>
>>>>> On 2022-09-19 22:29:15 -0400, Tom Lane wrote:
>>>>>> Maybe a compromise could be found whereby we provide a conversion function
>>>>>> that converts whatever the catalog storage format is to some JSON
>>>>>> equivalent. That would address the needs of external code that doesn't want
>>>>>> to write a custom parser, while not tying us directly to JSON.
>>>>> +1
>>>>
>>>> Agreed.
>>>
>>> +1
>>>
>>> Michel, it seems that you now have a green light to implement node to
>>> json function.
>>
>> I think that Tom's proposal that we +1 is on a pg_node_tree to json
>> SQL function / cast; which is tangentially related to the "nodeToJson
>> / changing the storage format of pg_node_tree to json" proposal, but
>> not the same.
>
> I agree.
>
>> I will add my +1 to Tom's proposal for that function/cast, but I'm not
>> sure on changing the storage format of pg_node_tree to json.
>
> I'm going to spike on this function and will get back to the thread with any updates.

Michel, did you get a result from this spike? I'm asking because I spiked most of my ideas on updating the node text format, and am working on wrapping them up into a patch (or patchset) later this week.

The ideas for this are:

1. Don't write fields with default values for their types, such as NULL for Node* fields;
2. Reset location fields before transforming the node tree to text when we don't have a copy of the original query, which removes location fields from the serialization with step 1;
3. Add default() node labels to struct fields that do not share the field type's default, allowing more fields to be omitted with step 1;
4. Add a special default_ref() pg_node_attr for node fields that default to another node field's value, used in Var's varnosyn/varattnosyn as referring to varno/varattno; and
5. Truncate trailing 0s in Const's outDatum notation of by-ref types, so that e.g. Consts with `name` data don't waste so much space with 0s.

Currently, this reduces the pg_total_relation_size metric of pg_rewrite after TOAST compression by 35% vs pg16, down to 483328 bytes / 59 pages, from 753664 bytes / 92 pages. The raw size of the ev_action column's data (that is, before compression) is reduced by 55% to 1.18MB (from 2.80MB), and the largest default-shipped row in that table (the information_schema.columns view) is reduced to 'only' 78kB raw, from 193kB.

Read/write performance hasn't been tested yet, so that is still to be determined.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)