Thread: [HACKERS] DROP SUBSCRIPTION, query cancellations and slot handling
Hi, I have noticed the following behavior with DROP SUBSCRIPTION followed by a cancel request. If the remote replication slot is dropped, the subscription may still be present locally: =# CREATE SUBSCRIPTION mysub CONNECTION 'port=5432 user=mpaquier dbname=mpaquier' PUBLICATION mypub, insert_only; NOTICE: 00000: created replication slot "mysub" on publisher LOCATION: CreateSubscription, subscriptioncmds.c:408 NOTICE: 00000: synchronized table states LOCATION: CreateSubscription, subscriptioncmds.c:434 CREATE SUBSCRIPTION =# DROP SUBSCRIPTION mysub; ^CCancel request sent NOTICE: 00000: dropped replication slot "mysub" on publisher LOCATION: DropSubscription, subscriptioncmds.c:873 ERROR: 57014: canceling statement due to user request LOCATION: ProcessInterrupts, postgres.c:2984 In this case the subscription is not dropped: =# select subname from pg_subscription;subname ---------mysub (1 row) But trying to issue once again a drop results in an error: =# DROP SUBSCRIPTION mysub; ERROR: XX000: could not drop the replication slot "mysub" on publisher DETAIL: The error was: ERROR: replication slot "mysub" does not exist LOCATION: DropSubscription, subscriptioncmds.c:869 A subscription with the same name cannot be created either, so there is nothing that the user can do except drop manually the slot on the publisher. It seems to me that the moment where the slot is created should be a point of no-return: the subcription has to be dropped on the replication slot is dropped on the remote. I am adding an open item. Thanks, -- Michael
On Thu, Apr 20, 2017 at 4:22 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > I am adding an open item. Just adding something... When a subscription is created, if the step synchronizing tables fails then CREATE SUBSCRIPTION fails but the slot remains present on the publisher side, so trying to re-create the same subscription results in an error: =# CREATE SUBSCRIPTION mysub CONNECTION 'port=5432' PUBLICATION mypub, insert_only; NOTICE: 00000: Sleeping now... NOTICE: 00000: created replication slot "mysub" on publisher LOCATION: CreateSubscription, subscriptioncmds.c:411 ERROR: 42P01: relation "public.aa" does not exist LOCATION: RangeVarGetRelidExtended, namespace.c:400 Time: 1033.739 ms (00:01.034) =# CREATE SUBSCRIPTION mysub CONNECTION 'port=5432 user=mpaquier dbname=mpaquier' PUBLICATION mypub, insert_only; NOTICE: 00000: Sleeping now... LOCATION: CreateSubscription, subscriptioncmds.c:376 ERROR: XX000: could not create replication slot "mysub": ERROR: replication slot "mysub" already exists LOCATION: libpqrcv_create_slot, libpqwalreceiver.c:776 I have created a simple table aa (a int) on the publisher first, where a publication with ALL TABLES has been created: CREATE PUBLICATION mypub FOR ALL TABLES; -- Michael
On 20/04/17 09:22, Michael Paquier wrote: > Hi, > > I have noticed the following behavior with DROP SUBSCRIPTION followed > by a cancel request. If the remote replication slot is dropped, the > subscription may still be present locally: > =# CREATE SUBSCRIPTION mysub CONNECTION 'port=5432 user=mpaquier > dbname=mpaquier' PUBLICATION mypub, insert_only; > NOTICE: 00000: created replication slot "mysub" on publisher > LOCATION: CreateSubscription, subscriptioncmds.c:408 > NOTICE: 00000: synchronized table states > LOCATION: CreateSubscription, subscriptioncmds.c:434 > CREATE SUBSCRIPTION > =# DROP SUBSCRIPTION mysub; > ^CCancel request sent > NOTICE: 00000: dropped replication slot "mysub" on publisher > LOCATION: DropSubscription, subscriptioncmds.c:873 > ERROR: 57014: canceling statement due to user request > LOCATION: ProcessInterrupts, postgres.c:2984 > > In this case the subscription is not dropped: > =# select subname from pg_subscription; > subname > --------- > mysub > (1 row) > But trying to issue once again a drop results in an error: > =# DROP SUBSCRIPTION mysub; > ERROR: XX000: could not drop the replication slot "mysub" on publisher > DETAIL: The error was: ERROR: replication slot "mysub" does not exist > LOCATION: DropSubscription, subscriptioncmds.c:869 > > A subscription with the same name cannot be created either, so there > is nothing that the user can do except drop manually the slot on the > publisher. It seems to me that the moment where the slot is created > should be a point of no-return: the subcription has to be dropped on > the replication slot is dropped on the remote. > DROP SUBSCRIPTION mysub NODROP SLOT; -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 20/04/17 09:35, Michael Paquier wrote: > On Thu, Apr 20, 2017 at 4:22 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> I am adding an open item. > > Just adding something... When a subscription is created, if the step > synchronizing tables fails then CREATE SUBSCRIPTION fails but the slot > remains present on the publisher side, so trying to re-create the same > subscription results in an error: > > =# CREATE SUBSCRIPTION mysub CONNECTION 'port=5432' PUBLICATION mypub, > insert_only; > NOTICE: 00000: Sleeping now... > NOTICE: 00000: created replication slot "mysub" on publisher > LOCATION: CreateSubscription, subscriptioncmds.c:411 > ERROR: 42P01: relation "public.aa" does not exist > LOCATION: RangeVarGetRelidExtended, namespace.c:400 > Time: 1033.739 ms (00:01.034) > =# CREATE SUBSCRIPTION mysub CONNECTION 'port=5432 user=mpaquier > dbname=mpaquier' PUBLICATION mypub, insert_only; > NOTICE: 00000: Sleeping now... > LOCATION: CreateSubscription, subscriptioncmds.c:376 > ERROR: XX000: could not create replication slot "mysub": ERROR: > replication slot "mysub" already exists > LOCATION: libpqrcv_create_slot, libpqwalreceiver.c:776 > CREATE SUBSCRIPTION mysub CONNECTION 'port=5432' PUBLICATION mypub, insert_only WITH(NOCREATE SLOT); Or you can drop the slot manually on upstream. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Apr 20, 2017 at 8:47 PM, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: > Or you can drop the slot manually on upstream. Sure, but the point here is that if for example users have client_min_messages set at least at warning, they may have no idea that an underlying slot has been created. This is a confusing experience for users. As subscription is a self-contained concept, it seems to me that any errors happening should at least try to do some cleanup action before just giving up processing, that would be a less frustrating experience. -- Michael
On 20/04/17 14:41, Michael Paquier wrote: > On Thu, Apr 20, 2017 at 8:47 PM, Petr Jelinek > <petr.jelinek@2ndquadrant.com> wrote: >> Or you can drop the slot manually on upstream. > > Sure, but the point here is that if for example users have > client_min_messages set at least at warning, they may have no idea > that an underlying slot has been created. This is a confusing > experience for users. > > As subscription is a self-contained concept, it seems to me that any > errors happening should at least try to do some cleanup action before > just giving up processing, that would be a less frustrating > experience. > Hmm well since this only affects the synchronization of table states/names, I guess we could just simply do that before we create the slot as there is no expectancy of consistency between slot and the table list snapshot. Any other potential errors will be out of control of CreateSubscription anyway. Thoughts? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 4/20/17 08:41, Michael Paquier wrote: > As subscription is a self-contained concept, it seems to me that any > errors happening should at least try to do some cleanup action before > just giving up processing, that would be a less frustrating > experience. This is the way it's designed. The alternative is to do what we currently do for physical replication, namely requiring the user to set up all the replication slots manually beforehand. I don't think that's a better experience. There was a thread about having pg_basebackup automatically create replication slots. That will have to deal with the same issues. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 4/20/17 10:19, Petr Jelinek wrote: > Hmm well since this only affects the synchronization of table > states/names, I guess we could just simply do that before we create the > slot as there is no expectancy of consistency between slot and the table > list snapshot. I suppose that wouldn't hurt. Prior to the table sync patch, a missing target relation would just show up as an error later on in the logs. So having the error sooner actually seems like a good change. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 20/04/17 23:30, Peter Eisentraut wrote: > On 4/20/17 10:19, Petr Jelinek wrote: >> Hmm well since this only affects the synchronization of table >> states/names, I guess we could just simply do that before we create the >> slot as there is no expectancy of consistency between slot and the table >> list snapshot. > > I suppose that wouldn't hurt. > > Prior to the table sync patch, a missing target relation would just show > up as an error later on in the logs. So having the error sooner > actually seems like a good change. > Very simple patch to make. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Fri, Apr 21, 2017 at 11:34 AM, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: > On 20/04/17 23:30, Peter Eisentraut wrote: >> On 4/20/17 10:19, Petr Jelinek wrote: >>> Hmm well since this only affects the synchronization of table >>> states/names, I guess we could just simply do that before we create the >>> slot as there is no expectancy of consistency between slot and the table >>> list snapshot. >> >> I suppose that wouldn't hurt. >> >> Prior to the table sync patch, a missing target relation would just show >> up as an error later on in the logs. So having the error sooner >> actually seems like a good change. >> > > Very simple patch to make. +1 for that. -- Michael
On 4/20/17 22:57, Michael Paquier wrote: > On Fri, Apr 21, 2017 at 11:34 AM, Petr Jelinek > <petr.jelinek@2ndquadrant.com> wrote: >> On 20/04/17 23:30, Peter Eisentraut wrote: >>> On 4/20/17 10:19, Petr Jelinek wrote: >>>> Hmm well since this only affects the synchronization of table >>>> states/names, I guess we could just simply do that before we create the >>>> slot as there is no expectancy of consistency between slot and the table >>>> list snapshot. >>> >>> I suppose that wouldn't hurt. >>> >>> Prior to the table sync patch, a missing target relation would just show >>> up as an error later on in the logs. So having the error sooner >>> actually seems like a good change. >>> >> >> Very simple patch to make. > > +1 for that. Committed that. I don't think there is anything else open here. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services