Thread: more backtraces

more backtraces

From

Peter Eisentraut

Date:

04 December 2019, 19:45:25

In the previous discussions on backtrace support, some people asked for 
backtraces in more situations.  Here is a patch that prints backtraces 
on SIGABRT, SIGBUS, and SIGSEGV signals.  SIGABRT includes assertions 
and elog(PANIC).

Do signals work like this on Windows?  Do we need special EXEC_BACKEND 
support?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

0001-Print-backtrace-on-SIGABRT-SIGBUS-SIGSEGV.patch

Re: more backtraces

From

Andres Freund

Date:

04 December 2019, 19:59:15

Hi,

On 2019-12-04 20:45:25 +0100, Peter Eisentraut wrote:
> In the previous discussions on backtrace support, some people asked for
> backtraces in more situations.  Here is a patch that prints backtraces on
> SIGABRT, SIGBUS, and SIGSEGV signals.  SIGABRT includes assertions and
> elog(PANIC).

Hm. Can we really do that somewhat reliably like this? I'd suspect that
there'll be some oddities e.g. for stack overflows if done this way. To
my knowledge it's not a good idea to intercept SIGBUS/SIGSEGV without
using a separate signal stack (cf. sigaltstack) - but using a separate
stack could also make it harder to determine a correct backtrace?

It'd be bad if the addition of backtraces for SEGV/BUS suddenly made it
harder to attach a debugger and get useful results. Even
disregarding the previous concerns, we'll get less useful debugger
interactions due to this, e.g. for things like null pointer derefs,
right?

Doing this for SIGABRT seems like a more clearly good case - by that
point we're already removed a few frames from the triggering code
anyway. So debugging experience won't suffer much. And I don't think
there's a corresponding issue with the stack potentially being
corrupted / not large enough.

- Andres

Re: more backtraces

From

Peter Eisentraut

Date:

04 December 2019, 20:31:01

On 2019-12-04 20:59, Andres Freund wrote:
> On 2019-12-04 20:45:25 +0100, Peter Eisentraut wrote:
>> In the previous discussions on backtrace support, some people asked for
>> backtraces in more situations.  Here is a patch that prints backtraces on
>> SIGABRT, SIGBUS, and SIGSEGV signals.  SIGABRT includes assertions and
>> elog(PANIC).
> 
> Hm. Can we really do that somewhat reliably like this?

I've seen reputable programs that do all kinds of things in SIGSEGV 
handlers, including running user-defined programs, without taking any 
special precautions.  So it seems possible in general.

> I'd suspect that
> there'll be some oddities e.g. for stack overflows if done this way. To
> my knowledge it's not a good idea to intercept SIGBUS/SIGSEGV without
> using a separate signal stack (cf. sigaltstack) - but using a separate
> stack could also make it harder to determine a correct backtrace?

Didn't know about that, but seems useful.  I'll look into it.

> It'd be bad if the addition of backtraces for SEGV/BUS suddenly made it
> harder to attach a debugger and get useful results. Even
> disregarding the previous concerns, we'll get less useful debugger
> interactions due to this, e.g. for things like null pointer derefs,
> right?

The backtrace I get in lldb, and the level of detail when jumping around 
between frames, look the same as without this.  But it might depend.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: more backtraces

From

Tom Lane

Date:

04 December 2019, 21:34:42

Andres Freund <andres@anarazel.de> writes:
> It'd be bad if the addition of backtraces for SEGV/BUS suddenly made it
> harder to attach a debugger and get useful results.

Yeah.  TBH, I'm not sure I want this, at least not in debug builds.

            regards, tom lane

Re: more backtraces

From

Peter Eisentraut

Date:

13 December 2019, 12:26:43

On 2019-12-04 22:34, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> It'd be bad if the addition of backtraces for SEGV/BUS suddenly made it
>> harder to attach a debugger and get useful results.
> 
> Yeah.  TBH, I'm not sure I want this, at least not in debug builds.

I understand that the SEGV/BUS thing can be a bit scary.  We can skip it.

Are people interested in backtraces on abort()?  That was asked for in 
an earlier thread.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: more backtraces

From

Robert Haas

Date:

15 December 2019, 03:38:49

On Fri, Dec 13, 2019 at 7:26 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 2019-12-04 22:34, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> It'd be bad if the addition of backtraces for SEGV/BUS suddenly made it
> >> harder to attach a debugger and get useful results.
> >
> > Yeah.  TBH, I'm not sure I want this, at least not in debug builds.
>
> I understand that the SEGV/BUS thing can be a bit scary.  We can skip it.
>
> Are people interested in backtraces on abort()?  That was asked for in
> an earlier thread.

I mean, I think backtraces are great, and we should have more of them.
It's possible that trying to do it in certain cases will cause
problems, but we could back off those cases as we find them, or maybe
try to work around them using sigaltstack(), or maybe back it off in
debug builds.

It would make life a lot easier for me if I never had to explain to a
customer (1) how to install gdb or (2) that they needed to get $BOSS
to approve installation of development tools on production systems. I
would hate to see us shy away from improvements that might reduce the
need for such conversations on the theory that bad stuff *might*
happen.

In my experience, the importance of having a stack trace in the log is
greatest for a segmentation fault, because otherwise you have no
indication whatsoever of where the problem happened. Having the query
text has been a boon, but it's still not a lot to go on unless the
same query crashes every time. In other situations, like a PANIC,
Assertion failure, or (and this is a big one) non-descriptive error
message (cache lookup failed for thingy %u) a backtrace is sometimes
really helpful as well. You don't *always* need it, but you *often*
need it.

It is absolutely important that we don't break debuggability in the
service of getting more stack traces. At the same time, there are a
lot more PostgreSQL users out there than there are PostgreSQL
developers, and a lot of those people are running non-cassert,
non-debug builds. Being able to get debugging information from
failures that happen on those installations that enables us to fix
things without having to go through a time-consuming process of
guesswork and attempted reproduction is really valuable. A stack trace
can turn a lengthy nightmare into a quick fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: more backtraces

From

Tom Lane

Date:

15 December 2019, 16:06:39

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Dec 13, 2019 at 7:26 AM Peter Eisentraut
>> Are people interested in backtraces on abort()?  That was asked for in
>> an earlier thread.

FWIW, I don't have too much of an opinion about abort() yet.
Aren't we covering most of the possible cases for that already?
I don't think that direct abort() calls are considered good style
in the backend; it'd mostly get reached via Assert or PANIC.

> It would make life a lot easier for me if I never had to explain to a
> customer (1) how to install gdb or (2) that they needed to get $BOSS
> to approve installation of development tools on production systems.

Sure, but this facility is not going to have that end result, because
the output just isn't detailed enough.  If it were, I'd be more interested
in taking risks to get the output.  But as it stands, we're going to
need more information in a large fraction of cases, so I'm dubious
about doing anything that might actually interfere with collecting
such information.

> Being able to get debugging information from
> failures that happen on those installations that enables us to fix
> things without having to go through a time-consuming process of
> guesswork and attempted reproduction is really valuable. A stack trace
> can turn a lengthy nightmare into a quick fix.

I think you are supposing that these traces will be as useful as gdb
traces.  They won't.  In particular, where a gdb trace will almost
always localize the problem to a line of C code, with these you're
quite lucky if you can even localize to a specific function.  That
issue is mitigated for the existing use-cases by the fact that there's
also a reported error message or assertion condition, so you can use
that to narrow down the trap site.  But that won't help for SIGSEGV.

I think that the most useful next steps would involve trying to get
better printouts from the cases this code already traps, rather than
extending it to more cases.  Maybe eventually we'll feel that this
code is useful and reliable enough to justify trying to insert it
into SIGSEGV cases; but we're not there today.

            regards, tom lane
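
The difference in detail can be seen with a small sketch, assuming a platform with <execinfo.h> such as glibc. The helper name print_symbolic_backtrace is hypothetical. Each line that backtrace_symbols() returns typically names the binary or library plus, when the symbol is exported, a function name and offset with the raw address; it carries no source file or line number, and static functions often show up only as an address.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32

/*
 * Hypothetical helper: print the current call stack to "out".
 * Returns the number of frames printed, or -1 on failure.
 */
int
print_symbolic_backtrace(FILE *out)
{
    void  *frames[MAX_FRAMES];
    int    nframes = backtrace(frames, MAX_FRAMES);
    char **symbols = backtrace_symbols(frames, nframes);

    if (symbols == NULL)
        return -1;

    /* One string per frame; no file names or line numbers are included. */
    for (int i = 0; i < nframes; i++)
        fprintf(out, "%s\n", symbols[i]);

    /* The array is a single malloc'd block; the strings are not freed separately. */
    free(symbols);
    return nframes;
}
```

Mapping such an address back to a source line needs the matching binary with debug symbols and a separate tool, which is the gap relative to a gdb trace described above.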

Re: more backtraces

From

Alvaro Herrera

Date:

15 December 2019, 21:28:50

On 2019-Dec-15, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:

> > Being able to get debugging information from
> > failures that happen on those installations that enables us to fix
> > things without having to go through a time-consuming process of
> > guesswork and attempted reproduction is really valuable. A stack trace
> > can turn a lengthy nightmare into a quick fix.
> 
> I think you are supposing that these traces will be as useful as gdb
> traces.  They won't.  In particular, where a gdb trace will almost
> always localize the problem to a line of C code, with these you're
> quite lucky if you can even localize to a specific function.

That's already been my experience :-(

> I think that the most useful next steps would involve trying to get
> better printouts from the cases this code already traps,

+1

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services