Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers
From | jian he |
---|---|
Subject | Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) |
Date | |
Msg-id | CACJufxH3KGOGhnS-J2bYnA-mpABg4HG9Xddfj9zBBgoK+iGSsQ@mail.gmail.com Whole thread Raw |
In response to | Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) (Alena Rybakina <lena.ribackina@yandex.ru>) |
Responses |
Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
|
List | pgsql-hackers |
On Tue, Dec 5, 2023 at 6:07 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote: > > Hi! > > Thank you for your contribution to this thread. > > > I reviewed it and have a few questions. > > 1. I have seen that you delete a table before creating it, to which you want to add errors due to a failed "copy from"operation. I think this is wrong because this table can save useful data for the user. > At a minimum, we should warn the user about this, but I think we can just add some number at the end of the name, suchas name_table1, name_table_2. Sorry. I don't understand this part. Currently, if the error table name already exists, then the copy will fail, an error will be reported. I try to first create a table, if no error then the error table will be dropped. Can you demo the expected behavior? > 2. I noticed that you are forming a table name using the type of errors that prevent rows from being added during 'copyfrom' operation. > I think it would be better to use the name of the source file that was used while 'copy from' was running. > In addition, there may be several such files, it is also worth considering. > Another column added. now it looks like: SELECT * FROM save_error_csv_error; filename | lineno | line | field | source | err_message | err_detail | errorcode ----------+--------+----------------------------------------------------+-------+--------+---------------------------------------------+------------+----------- STDIN | 1 | 2002 232 40 50 60 70 80 | NULL | NULL | extra data after last expected column | NULL | 22P04 STDIN | 1 | 2000 230 23 | d | NULL | missing data for column "d" | NULL | 22P04 STDIN | 1 | z,,"" | a | z | invalid input syntax for type integer: "z" | NULL | 22P02 STDIN | 2 | \0,, | a | \0 | invalid input syntax for type integer: "\0" | NULL | 22P02 > 3. I found spelling: > > /* no err_nsp.error_rel table then crete one. for holding error. */ > fixed. > 4. Maybe rewrite this comment > > these info need, no error will drop err_nsp.error_rel table > to: > this information is necessary, no error will lead to the deletion of the err_sp.error_rel table. > fixed. > 5. Is this part of the comment needed? I think it duplicates the information below when we form the query. > > * . column list(order by attnum, begin from ctid) = > * {ctid, lineno,line,field,source,err_message,err_detail,errorcode} > * . data types (from attnum = -1) ={tid, int8,text,text,text,text,text,text} > > I'm not sure if we need to order the rows by number. It might be easier to work with these lines in the order they appear. > Simplified the comment. "order by attnum" is to make sure that if there is a table already existing, and the column name is like X and the data type like Y, then we consider this table is good for holding potential error info. COPY FROM, main entry point is NextCopyFrom. Now for non-binary mode, if you specified save_error then it will not fail at NextCopyFrom. all these three errors will be tolerated: extra data after last expected column, missing data for column, data type conversion.
Attachment
pgsql-hackers by date: