Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers
| From | torikoshia |
|---|---|
| Subject | Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) |
| Date | |
| Msg-id | 6076bebc06624713efda7421f8ed1ad0@oss.nttdata.com Whole thread Raw |
| In response to | Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) (jian he <jian.universality@gmail.com>) |
| Responses |
Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
|
| List | pgsql-hackers |
On Wed, Jan 10, 2024 at 4:42 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
> Yeah, I'm still thinking it's better to implement this feature
> incrementally. Given we're closing to feature freeze, I think it's
> unlikely to get the whole feature into PG17 since there are still many
> design discussions we need in addition to what Torikoshi-san pointed
> out. The feature like "ignore errors" or "logging errors" would have
> higher possibilities. Even if we get only these parts of the whole
> "error table" feature into PG17, it will make it much easier to
implement "error tables" feature.
+1.
I'm also going to make patch for "logging errors", since this
functionality is isolated from v7 patch.
> Seems promising. I'll look at the patch.
Thanks a lot!
Sorry to attach v2 if you already reviewed v1..
On 2024-01-11 12:13, jian he wrote:
> On Tue, Jan 9, 2024 at 10:36 PM torikoshia <torikoshia@oss.nttdata.com>
> wrote:
>>
>> On Tue, Dec 19, 2023 at 10:14 AM Masahiko Sawada
>> <sawada.mshk@gmail.com>
>> wrote:
>> > If we want only such a feature we need to implement it together (the
>> > patch could be split, though). But if some parts of the feature are
>> > useful for users as well, I'd recommend implementing it incrementally.
>> > That way, the patches can get small and it would be easy for reviewers
>> > and committers to review/commit them.
>>
>> Jian, how do you think this comment?
>>
>> Looking back at the discussion so far, it seems that not everyone
>> thinks
>> saving table information is the best idea[1] and some people think
>> just
>> skipping error data is useful.[2]
>>
>> Since there are issues to be considered from the design such as
>> physical/logical replication treatment, putting error information to
>> table is likely to take time for consensus building and development.
>>
>> Wouldn't it be better to follow the following advice and develop the
>> functionality incrementally?
>>
>> On Fri, Dec 15, 2023 at 4:49 AM Masahiko Sawada
>> <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> > So I'm thinking we may be able to implement this
>> > feature incrementally. The first step would be something like an
>> > option to ignore all errors or an option to specify the maximum number
>> > of errors to tolerate before raising an ERROR. The second step would
>> > be to support logging destinations such as server logs and tables.
>>
>>
>> Attached a patch for this "first step" with reference to v7 patch,
>> which
>> logged errors and simpler than latest one.
>> - This patch adds new option SAVE_ERROR_TO, but currently only
>> supports
>> 'none', which means just skips error data. It is expected to support
>> 'log' and 'table'.
>> - This patch Skips just soft errors and don't handle other errors such
>> as missing column data.
>
> Hi.
> I made the following change based on your patch
> (v1-0001-Add-new-COPY-option-SAVE_ERROR_TO.patch)
>
> * when specified SAVE_ERROR_TO, move the initialization of
> ErrorSaveContext to the function BeginCopyFrom.
> I think that's the right place to initialize struct CopyFromState
> field.
> * I think your patch when there are N rows have malformed data, then it
> will initialize N ErrorSaveContext.
> In the struct CopyFromStateData, I changed it to ErrorSaveContext
> *escontext.
> So if an error occurred, you can just set the escontext accordingly.
> * doc: mention "If this option is omitted, <command>COPY</command>
> stops operation at the first error."
> * Since we only support 'none' for now, 'none' means we don't want
> ErrorSaveContext metadata,
> so we should set cstate->escontext->details_wanted to false.
>
>> BTW I have question and comment about v15 patch:
>>
>> > + {
>> > + /*
>> > + *
>> > + * InputFunctionCall is more faster than
>> > InputFunctionCallSafe.
>> > + *
>> > + */
>>
>> Have you measured this?
>> When I tested it in an older patch, there were no big difference[3].
> Thanks for pointing it out, I probably was over thinking.
>
>> > - SAVEPOINT SCALAR SCHEMA SCHEMAS SCROLL SEARCH SECOND_P
>> SECURITY
>> SELECT
>> > + SAVEPOINT SAVE_ERROR SCALAR SCHEMA SCHEMAS SCROLL SEARCH
>> SECOND_P
>> SECURITY SELECT
>>
>> There was a comment that we shouldn't add new keyword for this[4].
>>
> Thanks for pointing it out.
Thanks for reviewing!
Updated the patch merging your suggestions except below points:
> + cstate->num_errors = 0;
Since cstate is already initialized in below lines, this may be
redundant.
| /* Allocate workspace and zero all fields */
| cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
> + Assert(!cstate->escontext->details_wanted);
I'm not sure this is necessary, considering we're going to add other
options like 'table' and 'log', which need details_wanted soon.
--
Regards,
--
Atsushi Torikoshi
NTT DATA Group Corporation
Attachment
pgsql-hackers by date: