Thread: [GENERAL] COPY: row is too big
Hi everyone,
My PostgreSQL version is 9.6.1.
When I COPY data from a csv file that has very long values across many columns (about 1100 columns), this error appears:
ERROR: row is too big: size 11808, maximum size 8160
CONTEXT: COPY rius, line 2
rius is the table.
I have searched the mailing list archives, but it seems no solution has been found.
Thanks.
I looked in the source code. That message _seems_ to be coming from the file src/backend/access/heap/hio.c and relates to MaxHeapTupleSize. This is set, indirectly, from the BLCKSZ value chosen by "configure" when PostgreSQL was originally compiled. That is, it is a "hard coded" limit which can only be overridden by building PostgreSQL yourself from source. Apparently whoever did the PostgreSQL compilation setup took the default BLCKSZ of 8192. So there is no solution other than "do it yourself" by getting the PostgreSQL source code and configuring it yourself. I can give you the first step. You can get the PostgreSQL source in one of two ways. You can go to https://www.postgresql.org/ftp/source/v9.6.1/ and download the proper file. Or, if you have and know "git", you can enter the command: git clone git://git.postgresql.org/git/postgresql.git
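Before rebuilding anything, it is worth confirming the page size your server was actually compiled with. A quick read-only check from psql:

SHOW block_size;   -- reports the compiled-in page size in bytes, 8192 by default

If you do rebuild, configure takes the block size in kilobytes via --with-blocksize (for example --with-blocksize=16 for 16 kB pages). Bear in mind that a non-default block size makes the binaries incompatible with stock packages and with data directories created at the default size, so treat this as a last resort.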
Oh, I assumed (bad me!) that you're running on Linux. I know _nothing_ about how to do the above on Windows.
I am not a PostgreSQL guru. Perhaps I made a stupid mistake in my analysis and the truly knowledgeable will have a better answer for you.
-- There’s no obfuscated Perl contest because it’s pointless.
—Jeff Polk
Maranatha! <><
John McKown
On 01/02/2017 03:11 AM, vod vos wrote:
> My postgresql is 9.61.
>
> When I copy data from csv file, a very long values for many columns
> (about 1100 columns). The errors appears:

My guess is you are tripping this:

https://www.postgresql.org/about/
Maximum Columns per Table   250 - 1600 depending on column types

So what are you storing in table rius and can you give a general idea of its schema? Not all 1100 columns, just a sampling of the data types involved.

Also what is the COPY command you are using?

> ERROR: row is too big: size 11808, maximum size 8160
> CONTEXT: COPY rius, line 2
>
> rius is the table.

--
Adrian Klaver
adrian.klaver@aklaver.com
Most of the data types are text or varchar, and I use:
COPY rius FROM '/var/www/test/aa.csv' WITH DELIMITER ';' ;
And some of the values in the csv file are null; do these null values matter?
Thanks.
---- On Monday, 02 January 2017 03:11:14 -0800 vod vos <vodvos@zoho.com> wrote ----
vod vos <vodvos@zoho.com> writes:
> When I copy data from csv file, a very long values for many columns (about 1100 columns). The errors appears:
> ERROR: row is too big: size 11808, maximum size 8160

You need to rethink your table schema so you have fewer columns. Perhaps you can combine some of them into arrays, for example. JSON might be a useful option, too.

			regards, tom lane
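To make that concrete, here is a minimal sketch of what the array or JSON route could look like. The table and column names are invented for illustration, not taken from the OP's schema:

-- one jsonb column instead of ~1100 individual varchar columns
CREATE TABLE rius_doc (
    id         serial PRIMARY KEY,
    attributes jsonb      -- one key per original CSV column
);

-- or, if the columns are positional and all of the same kind, one array
CREATE TABLE rius_arr (
    id     serial PRIMARY KEY,
    fields text[]         -- fields[1] .. fields[1100]
);

Large jsonb or array values are TOASTed (compressed and/or stored out of line), so the 8 kB limit on the heap row itself stops being the constraint.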
You know, the csv file was exported from another database on another machine, so I really don't want to break it up; that would be hard work. Every csv file contains headers and values. If I redesign the table, then I have to cut all the csv files into pieces one by one.
---- On Monday, 02 January 2017 08:21:29 -0800 Tom Lane <tgl@sss.pgh.pa.us> wrote ----
On 01/02/2017 09:03 AM, vod vos wrote:
> You know, the csv file was exported from other database of a machine, so
> I really dont want to break it for it is a hard work. Every csv file
> contains headers and values. If I redesign the table, then I have to cut
> all the csv files into pieces one by one.

If it helps:

http://csvkit.readthedocs.io/en/latest/tutorial/1_getting_started.html#csvcut-data-scalpel

--
Adrian Klaver
adrian.klaver@aklaver.com
> On Jan 2, 2017, at 10:13 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> On 01/02/2017 09:03 AM, vod vos wrote:
>> You know, the csv file was exported from other database of a machine, so
>> I really dont want to break it for it is a hard work. Every csv file
>> contains headers and values. If I redesign the table, then I have to cut
>> all the csv files into pieces one by one.

Perhaps this is your opportunity to correct someone else's mistake. You need to show the table definition to convince us that it cannot be improved. That it may be hard work really doesn't mean it's not the right path.
This may not be possible. The data might be coming in from an external source. I imagine you've run into the old "well, _we_ don't have any problems, so it must be on your end!" scenario.
Example: we receive CSV files from an external source. These files are _supposed_ to be validated. But we have often received files where NOT NULL fields have "nothing" in them. E.g. a customer bill which has _everything_ in it _except_ the customer number (or an invalid one such as "123{"); or missing some other vital piece of information.
In this particular case, the OP might want to do what we did in a similar case. We had way too many columns in a table. The performance was horrible. We did an analysis and, as usual, the majority of the selects were for a subset of the columns, about 15% of the total. We "split" the table into the "high use" columns table & the "low use" columns table. We then used triggers to make sure that if we added a new / deleted an old row from one table, the corresponding row in the other was created / deleted.
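A rough sketch of that kind of split, with invented names and covering only the insert/delete mirroring described above (updates and the real column lists are left out):

CREATE TABLE riu_hot (
    id   bigint PRIMARY KEY,
    col1 text,
    col2 text               -- ... the ~15% of columns that are read often
);

CREATE TABLE riu_cold (
    id   bigint PRIMARY KEY REFERENCES riu_hot (id) ON DELETE CASCADE,
    col3 text               -- ... the rarely used remainder
);

-- create the matching "cold" row whenever a "hot" row appears;
-- deletes are mirrored by the ON DELETE CASCADE foreign key
CREATE FUNCTION riu_hot_sync() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO riu_cold (id) VALUES (NEW.id);
    RETURN NEW;
END;
$$;

CREATE TRIGGER riu_hot_sync AFTER INSERT ON riu_hot
    FOR EACH ROW EXECUTE PROCEDURE riu_hot_sync();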
--
There’s no obfuscated Perl contest because it’s pointless.
—Jeff Polk
Maranatha! <><
John McKown
Now I am confused: I can create 1100 columns in a table in PostgreSQL, but I can't copy 1100 values into the table. And I really don't want to split the csv file into pieces, to avoid mistakes after this action.
I created a table with 1100 columns of data type varchar, hoping the COPY command would automatically convert the csv data, which contains some character and date values but is mostly numeric.
I use the command: COPY rius FROM '/var/www/test/test.csv' WITH DELIMITER ';' ;
Then it shows:
ERROR: row is too big: size 11808, maximum size 8160
---- On Tuesday, 03 January 2017 05:24:18 -0800 John McKown <john.archie.mckown@gmail.com> wrote ----
On 01/04/2017 05:00 AM, vod vos wrote:
> Now I am confused about I can create 1100 columns in a table in
> postgresql, but I can't copy 1100 values into the table.

As pointed out previously:

https://www.postgresql.org/about/
Maximum Columns per Table   250 - 1600 depending on column types

That being dependent on both the number of columns and the actual data in the columns. Empty columns are not the problem, it is when you start filling them that you get the error.

> And I really dont want to split the csv file to pieces to avoid mistakes
> after this action.
>
> I create a table with 1100 columns with data type of varchar, and hope
> the COPY command will auto transfer the csv data that contains some

I am afraid the solution is going to require more than hope. You are going to need to break the data up. I suspect that just splitting it into half would do the trick. So:

Table 1
column 1 for a primary key (assuming first column of your present data)
columns 2-550

Table 2
column 1 for a primary key (assuming first column of your present data)
columns 551-1100

Using the program I mentioned previously:

http://csvkit.readthedocs.io/en/1.0.1/scripts/csvcut.html

That translates into:

csvcut -c 1,2-550 your_big.csv > table_1.csv
csvcut -c 1,551-1100 your_big.csv > table_2.csv

> character and date, most of which are numeric.

Is this a different data set? Previously you said:

"The most of the data type are text or varhcar, ..."

> I use the command: COPY rius FROM '/var/www/test/test.csv' WITH
> DELIMITER ';' ;
>
> Then it shows:
>
> ERROR: row is too big: size 11808, maximum size 8160

--
Adrian Klaver
adrian.klaver@aklaver.com
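To round that out, loading the two halves and stitching them back together could look roughly like the sketch below. The column lists are elided, the table, view and file names are only for illustration (the paths reuse the OP's /var/www/test directory), and it follows the assumption above that column 1 can serve as the key. Note that csvcut may need -d ';' to read the semicolon-delimited input; its output is standard comma-separated CSV.

CREATE TABLE table_1 (id text PRIMARY KEY /* , c2 ... c550   */);
CREATE TABLE table_2 (id text PRIMARY KEY /* , c551 ... c1100 */);

COPY table_1 FROM '/var/www/test/table_1.csv' WITH (FORMAT csv, HEADER);
COPY table_2 FROM '/var/www/test/table_2.csv' WITH (FORMAT csv, HEADER);

-- present the original shape for queries without a 1100-column heap row
CREATE VIEW rius_all AS
SELECT *
FROM table_1
JOIN table_2 USING (id);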
Hi
2017-01-04 14:00 GMT+01:00 vod vos <vodvos@zoho.com>:
Now I am confused about I can create 1100 columns in a table in postgresql, but I can't copy 1100 values into the table. And I really dont want to split the csv file to pieces to avoid mistakes after this action.
The PostgreSQL limit is "Maximum Columns per Table: 250 - 1600 depending on column types" - this limit is related to placing the values, or pointers to values, into one page (8KB).
You can hit this limit not at CREATE TABLE time, but at INSERT time.
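A throwaway demo of exactly that: the CREATE succeeds, the fully populated INSERT does not. Nothing here refers to the OP's real schema.

DO $$
BEGIN
    -- 1100 varchar columns is fine; the hard cap is 1600 columns
    EXECUTE (SELECT 'CREATE TABLE wide_demo (' ||
                    string_agg(format('c%s varchar', i), ', ') || ')'
             FROM generate_series(1, 1100) AS i);

    -- filling every column with a 10-character value overflows the 8 kB page:
    -- ERROR:  row is too big: ... maximum size 8160
    EXECUTE (SELECT 'INSERT INTO wide_demo VALUES (' ||
                    string_agg(quote_literal(repeat('x', 10)), ', ') || ')'
             FROM generate_series(1, 1100) AS i);
END;
$$;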
I create a table with 1100 columns with data type of varchar, and hope the COPY command will auto transfer the csv data that contains some character and date, most of which are numeric.
Numeric is an expensive type - try to use float instead, maybe double precision.
Regards
Pavel
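To put rough numbers on the type-size point, a throwaway query (exact byte counts for numeric vary with the value and version):

SELECT pg_column_size(12345.678::numeric) AS numeric_bytes,  -- varlena header + digit array
       pg_column_size(12345.678::real)    AS real_bytes,     -- always 4
       pg_column_size(12345.678::float8)  AS float8_bytes;   -- always 8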
On 01/04/2017 06:54 AM, Pavel Stehule wrote:
> The PostgreSQL limit is "Maximum Columns per Table 250 - 1600 depending
> on column types" - this limit is related to placing values or pointers
> to values to one page (8KB).
>
> You can hit this limit not in CREATE TABLE time, but in INSERT time.
>
> Numeric is expensive type - try to use float instead, maybe double.

If I am following the OP correctly the table itself has all the columns declared as varchar. The data in the CSV file is a mix of text, date and numeric, presumably cast to text on entry into the table.

--
Adrian Klaver
adrian.klaver@aklaver.com
2017-01-04 16:11 GMT+01:00 Adrian Klaver <adrian.klaver@aklaver.com>:

> If I am following the OP correctly the table itself has all the columns
> declared as varchar. The data in the CSV file is a mix of text, date and
> numeric, presumably cast to text on entry into the table.

Table column types are important - Postgres enforces the necessary transformations.

Regards

Pavel
OK, maybe the final solution is to split it in half.
---- On Wednesday, 04 January 2017 06:53:31 -0800 Adrian Klaver <adrian.klaver@aklaver.com> wrote ----
On 2017-01-04 06:53:31 -0800, Adrian Klaver wrote:
> As pointed out previously:
>
> https://www.postgresql.org/about/
> Maximum Columns per Table 250 - 1600 depending on column types
>
> That being dependent on both the number of columns and the actual data in
> the columns.

I think this is confusingly phrased. In my mind "column type" is static - the type is the same, independent of the values which are stored. So "250 - 1600 depending on column types" implies to me that there is some type A of which I can have only 250 columns and another type B of which I can have 1600 columns. But it doesn't imply to me that the number of columns depends on the values which are put into those columns.

May I suggest these improvements?

In https://www.postgresql.org/about/, instead of:

| 250 - 1600 depending on column types

write:

| 250 - 1600 depending on column types and data

In https://www.postgresql.org/docs/9.6/static/ddl-basics.html, replace the sentence:

| Depending on the column types, it is between 250 and 1600.

with:

| For all columns in a row, some information (either the data itself or
| a pointer to the data) must be stored in a single block (8 kB).
| Because for some types this data is itself of variable length, the
| maximum number of columns depends not only on the types of the columns
| but also on the data (e.g., a NULL uses less space than a non-NULL
| value). Therefore there is no simple way to compute the maximum number
| of columns, and it is possible to declare a table with more columns
| than can be filled. Keeping all this in mind, the limit is between 250
| and 1600.

        hp

--
   _  | Peter J. Holzer    | A coding theorist is someone who doesn't
|_|_) |                    | think Alice is crazy.
| |   | hjp@hjp.at         | -- John Gordon
__/   | http://www.hjp.at/ |    http://downlode.org/Etext/alicebob.html
Hello,

On Wed, 2017-01-04 at 07:11 -0800, Adrian Klaver wrote:
> > ERROR: row is too big: size 11808, maximum size 8160

Assuming this is a brand new database instance and not an existing application, could the OP not compile from source and specify --with-blocksize=16 (16 kB pages) so as to overcome the 8k default page size limit?

My 2 cents.
Rob
On 01/04/2017 08:00 AM, rob stone wrote:
> Assuming this is a brand new database instance and not an existing
> application, could the OP not compile from source and specify
> --with-blocksize=16 so as to overcome the 8k default page size limit?

Well I was thinking along those lines also, then I did a search on BLCKSZ in the docs and saw all the configuration parameters that are keyed off it. I know I would have to do a lot more homework to understand the implications to the database instance as a whole and whether it was worth it to accommodate a single table.

> My 2 cents.
> Rob

--
Adrian Klaver
adrian.klaver@aklaver.com
> ...
>
>> Numeric is expensive type - try to use float instead, maybe double.
>
> If I am following the OP correctly the table itself has all the columns
> declared as varchar. The data in the CSV file is a mix of text, date and
> numeric, presumably cast to text on entry into the table.

But a CSV *is* purely text - no casting to text is needed. Conversion is only needed when the strings in the CSV are text representations of *non*-text data.

I'm guessing that the OP is using all text fields to deal with possibly flawed input data and then validating and migrating the data in subsequent steps. In that case, an ETL solution may be a better approach. Many options, both open- closed- and hybrid-source exist.

Cheers,
Steve
On 01/04/2017 08:32 AM, Steve Crawford wrote:
> But a CSV *is* purely text - no casting to text is needed. Conversion is
> only needed when the strings in the CSV are text representations of
> *non*-text data.

Yeah, muddled thinking.

> I'm guessing that the OP is using all text fields to deal with possibly
> flawed input data and then validating and migrating the data in
> subsequent steps. In that case, an ETL solution may be a better
> approach. Many options, both open- closed- and hybrid-source exist.

--
Adrian Klaver
adrian.klaver@aklaver.com
I finally figured it out as follows:
1. I changed the column data types to match the corresponding data in the csv file.
2. Where null values existed, I defined the column as varchar. The null values caused problems too.
So 1100 columns work well now.
This problem wasted me three days. I have lots of csv data to COPY.
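For anyone who finds this thread later: matching the column types to the data helps because properly typed values are usually smaller than their text form. An illustrative comparison (the text sizes depend on the string lengths):

SELECT pg_column_size('2017-01-04'::text)  AS date_as_text,
       pg_column_size('2017-01-04'::date)  AS date_as_date,     -- always 4 bytes
       pg_column_size('123456.789'::text)  AS number_as_text,
       pg_column_size(123456.789::float8)  AS number_as_float8; -- always 8 bytes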
---- On Wednesday, 04 January 2017 08:39:42 -0800 Adrian Klaver <adrian.klaver@aklaver.com> wrote ----
2017-01-05 13:44 GMT+01:00 vod vos <vodvos@zoho.com>:
> I finally figured it out as follows:
> 1. modified the corresponding data type of the columns to the csv file
> 2. if null values existed, defined the data type to varchar. The null values cause problem too.
int, float, double can be null too - a null needs the same space (one bit in the row's null bitmap) for all types.
Regards
Pavel
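A small illustration of that point, using a throwaway temp table (exact byte counts differ slightly by platform):

CREATE TEMP TABLE null_demo (a int, b float8, c numeric, d text);

INSERT INTO null_demo VALUES (1, 2.5, 3.14, 'abc');     -- fully populated
INSERT INTO null_demo VALUES (NULL, NULL, NULL, NULL);  -- every column NULL

-- the all-NULL row needs only the row header plus a small null bitmap,
-- regardless of the declared column types
SELECT a, pg_column_size(null_demo.*) AS row_bytes FROM null_demo;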
On 01/05/2017 04:44 AM, vod vos wrote:
> I finally figured it out as follows:
>
> 1. modified the corresponding data type of the columns to the csv file
>
> 2. if null values existed, defined the data type to varchar. The null
> values cause problem too.

Did you change the NULLs to something else? As Pavel said, the type does not really matter for NULL:

https://www.postgresql.org/docs/9.6/static/storage-page-layout.html

See the marked (<***>) up part:

"All table rows are structured in the same way. There is a fixed-size header (occupying 23 bytes on most machines), followed by an optional null bitmap, an optional object ID field, and the user data. The header is detailed in Table 65-4. The actual user data (columns of the row) begins at the offset indicated by t_hoff, which must always be a multiple of the MAXALIGN distance for the platform. <***>The null bitmap is only present if the HEAP_HASNULL bit is set in t_infomask. If it is present it begins just after the fixed header and occupies enough bytes to have one bit per data column (that is, t_natts bits altogether). In this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When the bitmap is not present, all columns are assumed not-null.<***> The object ID is only present if the HEAP_HASOID bit is set in t_infomask. If present, it appears just before the t_hoff boundary. Any padding needed to make t_hoff a MAXALIGN multiple will appear between the null bitmap and the object ID. (This in turn ensures that the object ID is suitably aligned.)"

In this post:

https://www.postgresql.org/message-id/1595fd48444.ba3ec57e13739.3837934651947496063%40zoho.com

you said:

"And some the values in the csv file contain nulls, do this null values matter?"

It looks like there are a good deal of NULLs in a row. In your original post COPY failed on the second line, so assuming the same data, what is the NULL count in that line? Or can you provide some estimate of the highest count of NULLs in your data rows?

> so 1100 culumns work well now.
>
> This problem wasted me three days. I have lots of csv data to COPY.

You solved the problem, so it was not entirely wasted, and it provided information for future reference when folks hit this list with a similar issue.

--
Adrian Klaver
adrian.klaver@aklaver.com
On 01/05/2017 05:44 AM, vod vos wrote:
> I finally figured it out as follows:
>
> 1. modified the corresponding data type of the columns to the csv file
>
> 2. if null values existed, defined the data type to varchar. The null
> values cause problem too.
>
> so 1100 culumns work well now.
>
> This problem wasted me three days. I have lots of csv data to COPY.

Yes, you cost yourself a lot of time by not showing the original table definition into which you were trying to insert data.
On 01/05/2017 08:31 AM, Rob Sargent wrote:
> Yes, you cost yourself a lot of time by not showing the original table
> definition into which you were trying to insert data.

Given that the table had 1100 columns I am not sure I wanted to see it :)

Still, the OP did give it to us in description:

https://www.postgresql.org/message-id/15969913dd3.ea2ff58529997.7460368287916683127%40zoho.com

"I create a table with 1100 columns with data type of varchar, and hope the COPY command will auto transfer the csv data that contains some character and date, most of which are numeric."

In retrospect I should have pressed for a more complete description of the data. I underestimated this description:

"And some the values in the csv file contain nulls, do this null values matter?"

--
Adrian Klaver
adrian.klaver@aklaver.com
On 01/05/2017 11:46 AM, Adrian Klaver wrote:
> Given that the table had 1100 columns I am not sure I wanted to see it :)
>
> Still, the OP did give it to us in description:
>
> "I create a table with 1100 columns with data type of varchar, and
> hope the COPY command will auto transfer the csv data that contains
> some character and date, most of which are numeric."
>
> In retrospect I should have pressed for a more complete
> description of the data.

My apologies for missing that. I was sure there would be room for some normalization, but so be it: the OP's happy, I'm happy.
I am piggy-backing on this thread because I have the same issue as well. I need to import a csv file that is 672 columns long, and each column consists of 12 alpha-numeric characters. Such as:

SA03ARE1015D  SA03ARE1S15N  SB03ARE1015D  ...
356412        275812        43106         ...

I am aware this is not normalized; however, we keep (or try to keep) the source data intact and normalize after importing into our system.

While trying to import all columns as type `text` I get this error:

[54000] ERROR: row is too big: size 8760, maximum size 8160
Where: COPY temp_table, line 3
SQL statement "copy temp_table from '/home/edgleweb/data/raw/TX/TAPR/2015/ADV/SSTAAR1ADV.csv' with delimiter ',' quote '"' csv "

I tried varchar(12) also; nothing changed. My questions are: 1) I have 672x12=8,064 characters in the first row (which are actually the headers), so why does it complain that the row is 8760? I am assuming here that type `text` occupies 1 byte per character. 2) Is there anything I can do to work around this situation?

Thanks in advance.
BTW, we have pg 9.5 running on Ubuntu.
On 05/26/2017 05:07 AM, doganmeh wrote:
> I tried varchar(12) also, nothing changed. My questions is 1) I have
> 672x12=8,064 characters in the first row (which are actually the headers),
> why would it complain that it is 8760. I am assuming here type `text`
> occupies 1 byte for a character.

https://www.postgresql.org/docs/9.6/static/datatype-character.html

"The storage requirement for a short string (up to 126 bytes) is 1 byte plus the actual string, which includes the space padding in the case of character."

> 2) Is there anything I can do to work around this situation?

Use csvkit's csvcut tool to split the file?:

http://csvkit.readthedocs.io/en/1.0.2/scripts/csvcut.html

--
Adrian Klaver
adrian.klaver@aklaver.com
Hello

> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of doganmeh
> Sent: Friday, 26 May 2017 14:08
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] COPY: row is too big
>
> SA03ARE1015D SA03ARE1S15N SB03ARE1015D ...
> 356412 275812 43106 ...
>
> SQL statement "copy temp_table from
> '/home/edgleweb/data/raw/TX/TAPR/2015/ADV/SSTAAR1ADV.csv' with delimiter ','
> quote '"' csv "

Is the delimiter really ','? In the lines above it seems to be different. Did you check line 3?

Regards,
Charles
On 26.05.2017 14:07, doganmeh wrote:
> I tried varchar(12) also, nothing changed. My questions is 1) I have
> 672x12=8,064 characters in the first row (which are actually the headers),
> why would it complain that it is 8760. I am assuming here type `text`
> occupies 1 byte for a character.

Please consider special chars; a little example:

test=*# create table demo(id int, t text);
CREATE TABLE
test=*# insert into demo values (1, '123');
INSERT 0 1
test=*# insert into demo values (2, '€€€');
INSERT 0 1
test=*# select id, t, length(t), pg_column_size(t) from demo;
 id |  t  | length | pg_column_size
----+-----+--------+----------------
  1 | 123 |      3 |              4
  2 | €€€ |      3 |             10
(2 rows)

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
doganmeh <mehmet@edgle.com> writes:
> I tried varchar(12) also, nothing changed. My questions is 1) I have
> 672x12=8,064 characters in the first row (which are actually the headers),
> why would it complain that it is 8760.

No, you have 672*13, because each varchar value will require a length word (which is only 1 byte for short values like these). Adding the 24-byte row header comes to 8760.

> 2) Is there anything I can do to work around this situation?

Maybe you could combine the strings into an array? A large array would be subject to compression and/or out-of-line storage, but 12-byte fields are too small to benefit from either.

			regards, tom lane
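One possible way to act on the array idea without re-cutting the file is sketched below. It bypasses CSV parsing entirely, so it assumes the values contain no tabs, backslashes, or quoted commas (which matches the 12-character alpha-numeric codes described above); the staging table name is made up.

CREATE TABLE temp_raw (line text);

-- text format with its default tab delimiter loads each CSV line as one value;
-- the header line comes in as an ordinary row, so filter or delete it afterwards
COPY temp_raw FROM '/home/edgleweb/data/raw/TX/TAPR/2015/ADV/SSTAAR1ADV.csv'
    WITH (FORMAT text);

CREATE TABLE temp_table AS
SELECT string_to_array(line, ',') AS fields    -- fields[1] .. fields[672]
FROM temp_raw;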
Yes, the delimiter was indeed ','. I fixed my original post; it seems I carelessly copy/pasted from Excel.
Yes, csvkit is what I decided to go with. Thank you all!