> From: pbj@cmicdo.com <pbj@cmicdo.com>
> Sent: 22 November 2022 15:30
> To: Alastair McKinley <a.mckinley@analyticsengines.com>; pgsql-general@lists.postgresql.org
<pgsql-general@lists.postgresql.org>;Erik Wienhold <ewie@ewie.name>
> Subject: Re: copying json data and backslashes
>
> >
> > On Tuesday, November 22, 2022 at 10:16:11 AM EST, Erik Wienhold <ewie@ewie.name> wrote:
> >
> >
> > > On 22/11/2022 15:23 CET Alastair McKinley <a.mckinley@analyticsengines.com> wrote:
> > >
> > > Hi all,
> > >
> > > I have come across this apparently common issue COPY-ing json and wondering if
> > > there is potentially a better solution.
> > >
> > > I am copying data into a jsonb column originating from a 3rd party API. The
> > > data may have literal \r,\t,\n and also double backslashes.
> > >
> > > I discovered that I can cast this data to a jsonb value directly but I can't
> > > COPY the data without pre-processing.
> >
> >
> > > Is there any other solution with COPY that doesn't require manual
> > > implementation of search/replace to handle these edge cases?
> > > Why does ::jsonb work but COPY doesn't? It seems a bit inconsistent.
> >
> > COPY handles special backslash sequences[1]. The \r in your sample JSON,
> > although properly escaped according to JSON, is replaced with an actual
> > carriage return by COPY before casting to jsonb. The error results from JSON
> > prohibiting unescaped control characters in strings[2].
> >
> > You must double escape to pass those characters through COPY.
> >
> > See how COPY outputs backslash sequences:
> >
> > -- Actual carriage return:
> > copy (select e'\r') to stdout;
> > \r
> >
> > -- Backslash sequence for carriage return:
> > copy (select '\r') to stdout;
> >
> > \\r
>
> I have been able to get around this problem by using the following method:
>
> \copy footable from 'input.json' (format csv, escape '^B', delimieter '^C', quote '^E')
>
> where the control characters are the actual control char, not the caret-letter, and it requires no escaping escapes.
Irealize this won't work for all
> situations.
>
> PJ
Hi PJ,
Thanks for the suggestion, this is interesting to me to try but I am not quite sure how this works.
As far as I understand, escape/quote/delimiter have to be a single character, and CTRL-C etc. are multiple characters.
What way do you input each of the escape/quote/delimiter characters?
Best regards,
Alastair
>
> >
> > [1]
> > https://www.postgresql.org/docs/current/sql-copy.html#id-1.9.3.55.9.2
> > [2] https://www.json.org/json-en.html
> >
> > --
> > Erik
>