Re: SIGQUIT handling, redux - Mailing list pgsql-hackers

From Andres Freund
Subject Re: SIGQUIT handling, redux
Msg-id 20230802164840.2fh2b26bbakhqga7@awork3.anarazel.de
In response to Re: SIGQUIT handling, redux  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2023-08-02 12:35:19 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2020-09-11 11:52:55 -0400, Tom Lane wrote:
> >> It's simple enough that maybe we could back-patch it, once it's
> >> aged awhile in HEAD.  OTOH, given the lack of field reports of
> >> trouble here, I'm not sure back-patching is worth the risk.
> 
> > FWIW, looking at collected stack traces in azure, there's a slow but steady
> > stream of crashes below StartupPacketTimeoutHandler. ...
> > Unsurprisingly just in versions before 14, where this change went in.
> > I think that might be enough evidence for backpatching the commit? I've not
> > heard of issues due to the checks in check_on_shmem_exit_lists_are_empty().
> 
> I'd be willing to take a look at this in a few weeks when $real_life
> is a bit less demanding.

Cool.


> Right before minor releases would likely be a bad idea anyway.

Agreed. I had just waded through the stacks, so I thought it'd be worth
bringing up, didn't intend to imply we should backpatch immediately.


Aside: Analyzing this kind of thing at scale is made considerably more painful
by "expected" ereport(PANIC)s (say, running out of disk space during WAL
writes) calling abort() and dumping core, while there are other PANICs you
really do want cores of.

Greetings,

Andres Freund


