Re: Design of pg_stat_subscription_workers vs pgstats - Mailing list pgsql-hackers
From | David G. Johnston |
---|---|
Subject | Re: Design of pg_stat_subscription_workers vs pgstats |
Date | |
Msg-id | CAKFQuwZzHcMQKDH_i9LPsk89Omc4ew69X0_khKxPkNeJrx_3oA@mail.gmail.com Whole thread Raw |
In response to | Re: Design of pg_stat_subscription_workers vs pgstats (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Design of pg_stat_subscription_workers vs pgstats
|
List | pgsql-hackers |
On Saturday, February 19, 2022, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Feb 19, 2022 at 1:17 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Fri, Feb 18, 2022 at 1:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>>
>> Here is the summary of the discussion, changes, and plan.
>>
>> 1. Move some error information such as the error message to a new
>> system catalog, pg_subscription_error. The pg_subscription_error table
>> would have the following columns:
>>
>> * sesubid : subscription Oid.
>> * serelid : relation Oid (NULL for apply worker).
>> * seerrlsn : commit-LSN or the error transaction.
>> * seerrcmd : command (INSERT, UPDATE, etc.) of the error transaction.
>> * seerrmsg : error message
>
>
> Not a fan of the "se" prefix but overall yes. We should include a timestamp.
>
How about naming it pg_subscription_worker_error as Peter E. has
suggested in one of his emails? I find pg_subscription_error slightly
odd as one could imagine that even the errors related to subscription
commands like Alter Subscription.
Adding worker makes sense.
>>
>> The tuple is inserted or updated when an apply worker or a tablesync
>> worker raises an error. If the same error occurs in a row, the update
>> is skipped.
>
Are you going to query table to check if it is same error?
I don’t get the question, the quoted text is your which I disagree with. But the error message is being captured in any case.
>
> I disagree with this - I would treat every new instance of an error to be important and insert on conflict (sesubid, serelid) the new entry, updating lsn/cmd/msg with the new values.
>
I don't think that will be a problem but what advantage are you
envisioning with updating the same info except for timestamp?
Omission of timestamp (or any other non-key field we have) from the update is an oversight.
>> The tuple is removed in the following cases:
>>
>> * the subscription is dropped.
>> * the table is dropped.
>>
>> * the table is removed from the subscription.
>> * the worker successfully committed a non-empty transaction.
>
>
> Correct. This handles the "end of sync worker" just fine since its final action should be a successful commit of a non-empty transaction.
>>
I think for tablesync workers, we may need slightly different handling
as there could probably be no transactions to apply apart from the
initial copy. Now, I think for tablesync worker, we can't postpone it
till after we update the rel state as SUBREL_STATE_SYNCDONE because if
we do it after that and there is some error updating/deleting the
tuple, the tablesync worker won't be launched again and that entry
will remain in the system for a longer duration.
Not sure…but I don’t see how you can not have a non-empty transaction while still having an error.
Are subscriptions “dropped” upon a reboot? If not, that needs its own case for row removal.
David J.
pgsql-hackers by date: