logical replication walsender loop preventing a clean shutdown - Mailing list pgsql-bugs

From	Greg Sabino Mullane
Subject	logical replication walsender loop preventing a clean shutdown
Date	September 16, 2024 21:27:42
Msg-id	CAKAnmm+STYvW_5aRx2C0QWgbNpd_zEjruc6MytePnRuK8oKtTA@mail.gmail.com
Responses	Re: logical replication walsender loop preventing a clean shutdown
List	pgsql-bugs

When doing logical replication, a large transaction can prevent the server from shutting down until all of the WAL has been processed and the client has reported back. This is obviously less than ideal, as it means a pg_ctl stop -m fast can take minutes or hours to complete. I would expect all backends to be signalled so that they can exit cleanly.

I found this thread that reports something very similar (but without the infinite looping):

Subject: walsender bug: stuck during shutdown

https://www.postgresql.org/message-id/flat/20201123205253.GA10075%40alvherre.pgsql

I have cc'd Alvaro in case he has made any progress on this, or has ideas. I tried applying the patch from that thread, but the behavior remained unchanged. I wanted to raise this in -bugs for added visibility, and also to see if anyone had thoughts before I dig deeper.

My test case (tested against the latest source, as of commit b8ea0f675f35c3f0c2cf62175517ba0dacad4abd):

* Spin up a cluster, port 5555, using wal_level logical

* pg_recvlogical --create-slot -d postgres -p 5555 --slot=foo

* pg_recvlogical --start -d postgres -p 5555 --slot=foo --file /tmp/tmp

* If all is well, ctrl-z, bg 1, watch -n 3 tail /tmp/tmp

Other session:

* psql -p 5555 postgres

* create table t (id int generated always as identity, foo text);

* insert into t(foo) select 'abcdefghijklmnopqrstuvwxyz' from generate_series(1,10_000_000);

Once the commit finishes, and as soon as pg_recvlogical starts processing it:

* time pg_ctl stop -m fast -w -t 10000

I found 10 million rows to be a good test size on my system: shutdown takes an additional 50 seconds or so while it waits for pg_recvlogical to respond.
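
For convenience, the steps above could be scripted roughly as follows. This is only a sketch: it assumes the cluster is already running on port 5555 with wal_level = logical and that PGDATA points at its data directory. The DRY_RUN and ROWS variables and the run() helper are illustrative names, not part of the test case above. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Reproduction sketch. Assumes a cluster is already running on port 5555 with
# wal_level=logical and that PGDATA points at its data directory.
# DRY_RUN, ROWS, and run() are illustrative names, not from the original steps.
set -eu

PORT=5555
SLOT=foo
OUT=/tmp/tmp
ROWS=${ROWS:-10000000}
DRY_RUN=${DRY_RUN:-1}    # 1 = print the commands only; 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

run pg_recvlogical --create-slot -d postgres -p "$PORT" --slot="$SLOT"

# Stream the slot to a file in the background while the load runs.
if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "pg_recvlogical --start -d postgres -p $PORT --slot=$SLOT --file $OUT &"
else
    pg_recvlogical --start -d postgres -p "$PORT" --slot="$SLOT" --file "$OUT" &
fi

run psql -p "$PORT" -d postgres \
    -c "create table t (id int generated always as identity, foo text)"
run psql -p "$PORT" -d postgres \
    -c "insert into t(foo) select 'abcdefghijklmnopqrstuvwxyz' from generate_series(1, $ROWS)"

# Once the insert has committed, time the fast shutdown.
if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "time pg_ctl stop -m fast -w -t 10000"
else
    time pg_ctl stop -m fast -w -t 10000
fi
```

A smaller ROWS value (for example ROWS=1000000) should shorten the run, though the extra shutdown delay will presumably shrink with it.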

Cheers,

Greg
