Re: confusing / inefficient "need_transcoding" handling in copy - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: confusing / inefficient "need_transcoding" handling in copy
Date
Msg-id ZcVzjGWFobGpNrxs@paquier.xyz
In response to Re: confusing / inefficient "need_transcoding" handling in copy  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: confusing / inefficient "need_transcoding" handling in copy
List pgsql-hackers
On Thu, Feb 08, 2024 at 10:25:07AM +0200, Heikki Linnakangas wrote:
> There's no validation, just conversion. I'd suggest:
>
> "Set up encoding conversion info if the file and server encodings differ
> (see also pg_server_to_any)."
>
> Other than that, +1

Cool.  I've used your wording and applied that on HEAD.

> BTW, I can see an optimization opportunity even if the encodings differ:
> Currently, CopyAttributeOutText() calls pg_server_to_any(), and then grovels
> through the string to find any characters that need to be quoted. You could
> do it the other way round and handle quoting before the conversion. That has
> two benefits:
>
> 1. You don't need the strlen() call, because you just scanned through the
> string so you already know its length.
> 2. You don't need to worry about 'encoding_embeds_ascii' when you operate on
> the server encoding.

That sounds right.  Still, it looks like there would be cases where
you'd need the strlen() call if !encoding_embeds_ascii.
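
To make the quote-before-convert idea concrete, here is a minimal
standalone sketch.  It is not the CopyAttributeOutText() code: the
function name escape_copy_text() and its buffer contract are
hypothetical, and it assumes the input is in a server encoding, where
no multibyte character contains an ASCII byte, so a plain byte-wise
scan is safe.  The point is only that the scan returns the output
length, which could then be passed to the conversion step.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not PostgreSQL code: escape a NUL-terminated
 * string for COPY text-format output while it is still in the server
 * encoding.
 *
 * Returns the number of bytes written to dst, excluding the
 * terminating NUL, so the caller already knows the length and does
 * not need a separate strlen() before converting the result.
 *
 * dst must have room for 2 * (length of src) + 1 bytes.
 */
static size_t
escape_copy_text(const char *src, char delim, char *dst)
{
    size_t      n = 0;

    assert(src != NULL && dst != NULL);

    for (const char *p = src; *p != '\0'; p++)
    {
        char        c = *p;

        switch (c)
        {
            case '\\':
                dst[n++] = '\\';
                dst[n++] = '\\';
                break;
            case '\n':
                dst[n++] = '\\';
                dst[n++] = 'n';
                break;
            case '\r':
                dst[n++] = '\\';
                dst[n++] = 'r';
                break;
            case '\t':
                dst[n++] = '\\';
                dst[n++] = 't';
                break;
            default:
                /* Backslash-escape the delimiter, copy anything else. */
                if (c == delim)
                    dst[n++] = '\\';
                dst[n++] = c;
                break;
        }
    }

    dst[n] = '\0';
    return n;
}
```

A caller would escape into a scratch buffer, then hand the buffer and
the returned length to the conversion routine, with no check of
encoding_embeds_ascii in the scan itself.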
--
Michael
