From: Joe Conway
Subject: Re: Emitting JSON to file using COPY TO
Msg-id: 4d5688f4-9582-4093-8448-e1867bc9e2bc@joeconway.com
In response to: Re: Emitting JSON to file using COPY TO (Sehrope Sarkuni <sehrope@jackdb.com>)
Responses: Re: Emitting JSON to file using COPY TO
List: pgsql-hackers
On 12/6/23 11:28, Sehrope Sarkuni wrote:
> Big +1 to this overall feature.

Cool!

> Regarding the defaults for the output, I think JSON lines (rather than a
> JSON array of objects) would be preferred. It's more natural to combine
> them and generate that type of data on the fly rather than forcing
> aggregation into a single object.

So that is +2 (Sehrope and me) for the status quo (JSON lines), and +2 (Andrew and Davin) for defaulting to JSON arrays. Anyone else want to weigh in on that issue?

> A couple more features / use cases come to mind as well. Even if they're
> not part of a first round of this feature I think it'd be helpful to
> document them now as it might give some ideas for what does make that
> first cut:
>
> 1. Outputting a top level JSON object without the additional column
> keys. IIUC, the top level keys are always the column names. A common use
> case would be a single json/jsonb column that is already formatted
> exactly as the user would like for output. Rather than enveloping it in
> an object with a dedicated key, it would be nice to be able to output it
> directly. This would allow non-object results to be output as well
> (e.g., lines of JSON arrays, numbers, or strings). Due to how JSON is
> structured, I think this would play nicely with the JSON lines vs. array
> concept.
>
> COPY (SELECT json_build_object('foo', x) AS i_am_ignored FROM
> generate_series(1, 3) x) TO STDOUT WITH (FORMAT JSON,
> SOME_OPTION_TO_NOT_ENVELOPE)
> {"foo":1}
> {"foo":2}
> {"foo":3}

Your example does not match what you describe, or do I misunderstand? I thought your goal was to eliminate the repeated "foo" from each row...

> 2. An option to ignore null fields so they are excluded from the output.
> This would not be a default but would allow shrinking the total size of
> the output data in many situations. This would be recursive to allow
> nested objects to be shrunk down (not just the top level). This might be
> worthwhile as a standalone JSON function though handling it during
> output would be more efficient as it'd only be read once.
>
> COPY (SELECT json_build_object('foo', CASE WHEN x > 1 THEN x END) FROM
> generate_series(1, 3) x) TO STDOUT WITH (FORMAT JSON,
> SOME_OPTION_TO_NOT_ENVELOPE, JSON_SKIP_NULLS)
> {}
> {"foo":2}
> {"foo":3}

Clear enough, I think.

> 3. Reverse of #2 when copying data in to allow defaulting missing fields
> to NULL.

Good to record the ask, but it applies to a different feature (COPY FROM instead of COPY TO).

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
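
For illustration, a minimal sketch of the two candidate default output shapes discussed above, using the FORMAT JSON syntax proposed in this thread (not a released COPY option); the exact array layout shown is an assumption, not the patch's verbatim output:

    COPY (SELECT x AS foo FROM generate_series(1, 3) x)
      TO STDOUT WITH (FORMAT JSON);

    -- JSON lines (the status quo): one object per row, keyed by column name
    {"foo":1}
    {"foo":2}
    {"foo":3}

    -- Single JSON array of objects (the alternative default), roughly:
    [{"foo":1},{"foo":2},{"foo":3}]

The first form can be generated and consumed row by row and concatenated freely; the second is one valid JSON document that generic parsers can load directly, at the cost of aggregating everything into a single object.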