Thread: COPY manual is ambiguous about column list
The following documentation comment has been logged on the website: Page: https://www.postgresql.org/docs/12/sql-copy.html Description: "If a list of columns is specified, COPY will only copy the data in the specified columns to or from the file. If there are any columns in the table that are not in the column list, COPY FROM will insert the default values for those columns." I clearly see "to or from" file here. But if one tries to COPY FROM filename with more columns than list specified, the process fails: ``` pasha@PG480 MINGW64 ~ $ echo "col1,col2,col3" > test.csv pasha@PG480 MINGW64 ~ $ cat test.csv col1,col2,col3 pasha@PG480 MINGW64 ~ $ psql -d migrator migrator=# CREATE TABLE copy_test(col1 text, col2 text); CREATE TABLE migrator=# COPY copy_test(col1, col2) FROM 'test.csv' (FORMAT csv); ERROR: extra data after last expected column CONTEXT: COPY copy_test, line 1: "col1,col2,col3" ``` I believe this statement should be rewritten, e.g. "If a list of columns is specified, COPY will only copy the data in the specified columns to the file. The input file should contain the same number of columns as the list specified, otherwise COPY FROM will fail. One may use `awk` to preprocess the input file and remove extra columns. If there are any columns in the table that are not in the column list, COPY FROM will insert the default values for those columns." https://www.postgresql.org/docs/12/sql-copy.html
On Wed, 2019-11-27 at 17:54 +0000, PG Doc comments form wrote: > "If a list of columns is specified, COPY will only copy the data in the > specified columns to or from the file. If there are any columns in the table > that are not in the column list, COPY FROM will insert the default values > for those columns." > > I clearly see "to or from" file here. But if one tries to COPY FROM filename > with more columns than list specified, the process fails: I think you misunderstood the documentation. Nowhere in that sentence is the documentation talking about columns in the file, only columns in the table. But if you got it wrong, maybe a clarification would be a good idea. Yours, Laurenz Albe
On Thu, 28 Nov 2019 at 10:47, Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Wed, 2019-11-27 at 17:54 +0000, PG Doc comments form wrote: > > "If a list of columns is specified, COPY will only copy the data in the > > specified columns to or from the file. If there are any columns in the table > > that are not in the column list, COPY FROM will insert the default values > > for those columns." > > > > I clearly see "to or from" file here. But if one tries to COPY FROM filename > > with more columns than list specified, the process fails: > > I think you misunderstood the documentation. Yeap, I did. Even two of us. Laetitia was helping me :) > > Nowhere in that sentence is the documentation talking about columns in > the file, only columns in the table. > > But if you got it wrong, maybe a clarification would be a good idea. I think it better to have more details to avoid confusion. > > Yours, > Laurenz Albe > -- -- Hoc est vivere bis, vita posse priore frui. Nullus est in vitae sensus, ipsa vera est sensus.
On Thu, Nov 28, 2019 at 02:53:03PM +0100, Pavlo Golub wrote: > On Thu, 28 Nov 2019 at 10:47, Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > > > On Wed, 2019-11-27 at 17:54 +0000, PG Doc comments form wrote: > > > "If a list of columns is specified, COPY will only copy the data in the > > > specified columns to or from the file. If there are any columns in the table > > > that are not in the column list, COPY FROM will insert the default values > > > for those columns." > > > > > > I clearly see "to or from" file here. But if one tries to COPY FROM filename > > > with more columns than list specified, the process fails: > > > > I think you misunderstood the documentation. > > Yeap, I did. Even two of us. Laetitia was helping me :) > > > > > Nowhere in that sentence is the documentation talking about columns in > > the file, only columns in the table. > > > > But if you got it wrong, maybe a clarification would be a good idea. > > I think it better to have more details to avoid confusion. How is the attached patch? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Attachment
On Thu, 19 Dec 2019 at 17:53, Bruce Momjian <bruce@momjian.us> wrote: > > On Thu, Nov 28, 2019 at 02:53:03PM +0100, Pavlo Golub wrote: > > On Thu, 28 Nov 2019 at 10:47, Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > > > > > On Wed, 2019-11-27 at 17:54 +0000, PG Doc comments form wrote: > > > > "If a list of columns is specified, COPY will only copy the data in the > > > > specified columns to or from the file. If there are any columns in the table > > > > that are not in the column list, COPY FROM will insert the default values > > > > for those columns." > > > > > > > > I clearly see "to or from" file here. But if one tries to COPY FROM filename > > > > with more columns than list specified, the process fails: > > > > > > I think you misunderstood the documentation. > > > > Yeap, I did. Even two of us. Laetitia was helping me :) > > > > > > > > Nowhere in that sentence is the documentation talking about columns in > > > the file, only columns in the table. > > > > > > But if you got it wrong, maybe a clarification would be a good idea. > > > > I think it better to have more details to avoid confusion. > > How is the attached patch? Sounds great to me! Thanks! > > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + As you are, so once was I. As I am, so you will be. + > + Ancient Roman grave inscription + -- -- Hoc est vivere bis, vita posse priore frui. Nullus est in vitae sensus, ipsa vera est sensus.
Bruce Momjian <bruce@momjian.us> writes: > How is the attached patch? ! In <command>COPY FROM</command>, the number of columns specified ! must match the number of fields in the file; any table columns not ! specified in the <command>COPY FROM</command> column list will insert ! their default values. Phrasing seems a bit weird to me: you've run two nearly independent observations into one sentence. But s/; any/. Any/ might be enough to fix it. regards, tom lane
On Thu, Dec 19, 2019 at 9:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
> How is the attached patch?
! In <command>COPY FROM</command>, the number of columns specified
! must match the number of fields in the file; any table columns not
! specified in the <command>COPY FROM</command> column list will insert
! their default values.
Phrasing seems a bit weird to me: you've run two nearly independent
observations into one sentence. But s/; any/. Any/ might be enough
to fix it.
Seems better to simply talk about each form independently:
If a list of columns is specified, <command>COPY TO</command> will only copy the data in the specified columns to the file. <command>COPY FROM</command> will, by position (i.e., the number of columns listed must match the number of columns in the file), match each file column to the named column in the table. Any additional columns present in the table will receive their default value.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > Seems better to simply talk about each form independently: > If a list of columns is specified, <command>COPY TO</command> will only > copy the data in the specified columns to the file. <command>COPY > FROM</command> will, by position (i.e., the number of columns listed must > match the number of columns in the file), match each file column to the > named column in the table. Any additional columns present in the table > will receive their default value. +1 for separating it like that, but your middle sentence reads awkwardly to me. How about something like <command>COPY FROM</command> will match each listed column in the table to a file column by position (so the number of columns listed must match the number of columns in the file). regards, tom lane
On Thursday, December 19, 2019, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Seems better to simply talk about each form independently:
> If a list of columns is specified, <command>COPY TO</command> will only
> copy the data in the specified columns to the file. <command>COPY
> FROM</command> will, by position (i.e., the number of columns listed must
> match the number of columns in the file), match each file column to the
> named column in the table. Any additional columns present in the table
> will receive their default value.
+1 for separating it like that, but your middle sentence reads awkwardly
to me. How about something like
<command>COPY FROM</command> will match each listed column in the table
to a file column by position (so the number of columns listed must match
the number of columns in the file).
+1
David J.
On Thu, Dec 19, 2019 at 10:33:48PM -0700, David G. Johnston wrote: > On Thursday, December 19, 2019, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > "David G. Johnston" <david.g.johnston@gmail.com> writes: > > Seems better to simply talk about each form independently: > > > If a list of columns is specified, <command>COPY TO</command> will only > > copy the data in the specified columns to the file. <command>COPY > > FROM</command> will, by position (i.e., the number of columns listed must > > match the number of columns in the file), match each file column to the > > named column in the table. Any additional columns present in the table > > will receive their default value. > > +1 for separating it like that, but your middle sentence reads awkwardly > to me. How about something like > > <command>COPY FROM</command> will match each listed column in the table > to a file column by position (so the number of columns listed must match > the number of columns in the file). > > > > > +1 OK, how is this? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Attachment
Bruce Momjian <bruce@momjian.us> writes: > OK, how is this? I do not like the phrasing of If a column list is specified in <command>COPY FROM</command>, the number and position of the columns must match those fields in the file. I'm not quite sure why, but that reads to me as implying that the COPY will identify columns of the file by name and match them up. Or at least it could be read that way by somebody who was predisposed to believe it. I'd go with something more like If a column list is specified in <command>COPY FROM</command>, only the listed columns are read from the file (whose fields must match the column list). regards, tom lane
On Fri, Dec 20, 2019 at 7:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
> OK, how is this?
I do not like the phrasing of
If a column list is
specified in <command>COPY FROM</command>, the number and position of
the columns must match those fields in the file.
I'm not quite sure why, but that reads to me as implying that the COPY
will identify columns of the file by name and match them up. Or at
least it could be read that way by somebody who was predisposed to
believe it. I'd go with something more like
If a column list is
specified in <command>COPY FROM</command>, only the listed columns
are read from the file (whose fields must match the column list).
This has the same "match by name" possibility when I read it...
I don't see the benefit of repeating "If a column list is specified" a second time in the three sentence paragraph. To a lesser extent the same goes for the second repetition of COPY FROM.
I agree with the general goal of trying to get rid of the parenthetical; to that end:
<command>COPY FROM</command> will match all listed columns in the table to *all* file columns by position.
That sufficiently implies the "must have equal numbers" and, frankly, if someone gets that part wrong the immediate failure combined with re-reading the word "all" should provide sufficient clarity.
So I'm still for Tom and mine combined proposal with cleaning up the middle sentence even more.
The other point of note is the difference between:
will receive their default value
and
will insert the default values
The system is inserting default values but the columns are receiving them. The sentence is relative to the table columns though so "receive" seems like the better fit. Minor point overall though.
David J.
On Fri, Dec 20, 2019 at 07:29:33AM -0700, David G. Johnston wrote: > This has the same "match by name" possibility when I read it... > > I don't see the benefit of repeating "If a column list is specified" a second > time in the three sentence paragraph. To a lesser extent the same goes for the > second repetition of COPY FROM. > > I agree with the general goal of trying to get rid of the parenthetical; to > that end: > <command>COPY FROM</command> will match all listed columns in the table to > *all* file columns by position. > > That sufficiently implies the "must have equal numbers" and, frankly, if > someone gets that part wrong the immediate failure combined with re-reading the > word "all" should provide sufficient clarity. > > So I'm still for Tom and mine combined proposal with cleaning up the middle > sentence even more. > > The other point of note is the difference between: > > will receive their default value > and > will insert the default values > > The system is inserting default values but the columns are receiving them. The > sentence is relative to the table columns though so "receive" seems like the > better fit. Minor point overall though. OK, this wording is obviously harder than I thought. Updated patch attached. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Attachment
Bruce Momjian <bruce@momjian.us> writes: > OK, this wording is obviously harder than I thought. Updated patch > attached. That one works for me. regards, tom lane
On Fri, Dec 20, 2019 at 9:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
> OK, this wording is obviously harder than I thought. Updated patch
> attached.
That one works for me.
Me too.
David J.
On Fri, Dec 20, 2019 at 09:45:35AM -0700, David G. Johnston wrote: > On Fri, Dec 20, 2019 at 9:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Bruce Momjian <bruce@momjian.us> writes: > > OK, this wording is obviously harder than I thought. Updated patch > > attached. > > That one works for me. > > > Me too. Patch applied to all supported releases. Thanks for the feedback. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +