Thread: Quite strange crash
Hi,

Has anyone seen this on PostgreSQL 7.0.3?

FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
Server process (pid 1008) exited with status 6 at Sun Jan 7 04:29:07 2001
Terminating any active server processes...
Server processes were terminated at Sun Jan 7 04:29:07 2001
Reinitializing shared memory and semaphores

-- 
Sincerely Yours, Denis Perchine
----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
Denis Perchine <dyp@perchine.com> writes:
> Has anyone seen this on PostgreSQL 7.0.3?
> FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.

Were there any errors before that? I've been suspicious for a while that the
system might neglect to release buffer cntx_lock spinlocks if an elog()
occurs while one is held. This looks like it might be such a case, but
you're only showing us the end symptom, not what led up to it ...

regards, tom lane
On Monday 08 January 2001 00:08, Tom Lane wrote:
> Denis Perchine <dyp@perchine.com> writes:
> > Has anyone seen this on PostgreSQL 7.0.3?
> > FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
>
> Were there any errors before that?

No... Just a clean log (I redirect the log from stderr/stdout to a file, and
everything else to syslog). Here it is, right from the beginning:

----
DEBUG: Data Base System is starting up at Sun Jan 7 04:22:00 2001
DEBUG: Data Base System was interrupted being in production at Thu Jan 4 23:30:22 2001
DEBUG: Data Base System is in production state at Sun Jan 7 04:22:00 2001
FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
Server process (pid 1008) exited with status 6 at Sun Jan 7 04:29:07 2001
Terminating any active server processes...
Server processes were terminated at Sun Jan 7 04:29:07 2001
Reinitializing shared memory and semaphores
----

As you can see, it happens almost immediately after startup. I can give you
the full list of queries made by process 1008, but basically there were only
queries like this:

select message_id from pop3 where server_id = 6214

insert into pop3 (server_id, mailfrom, mailto, subject, message_id, sent_date, sent_date_text, recieved_date, state) values (25641, 'virtualo.com', 'jrdias@mail.telepac.pt', 'Joao roque Dias I have tried them all....this one is for real........!', '20010107041334.CVEA17335.fep02-svc.mail.telepac.pt@anydomain.com', '2001-01-07 04:06:23 -00', 'Sat, 06 Jan 2001 23:06:23 -0500', 'now', 1)

And the last query was:

Jan 7 04:27:53 mx postgres[1008]: query: select message_id from pop3 where server_id = 22615

> I've been suspicious for a while that the system might neglect to release
> buffer cntx_lock spinlocks if an elog() occurs while one is held. This
> looks like it might be such a case, but you're only showing us the end
> symptom, not what led up to it ...

Just tell me what I can do. Unfortunately I cannot reproduce the situation...

-- 
Sincerely Yours, Denis Perchine
----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
Denis Perchine <dyp@perchine.com> writes:
> On Monday 08 January 2001 00:08, Tom Lane wrote:
>>>> FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
>>
>> Were there any errors before that?

> No... Just a clean log (I redirect the log from stderr/stdout to a file,
> and everything else to syslog).

The error messages would be in the syslog then, not in stderr.

> And the last query was:
> Jan 7 04:27:53 mx postgres[1008]: query: select message_id from pop3 where
> server_id = 22615

How about the prior queries of other processes? Keep in mind that the
spinlock could have been left locked by any backend, not only the one that
complained about it.

regards, tom lane
> >>>> FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
> >>
> >> Were there any errors before that?
> >
> > No... Just a clean log (I redirect the log from stderr/stdout to a file,
> > and everything else to syslog).
>
> The error messages would be in the syslog then, not in stderr.

Hmmm... The only strange errors I see are:

Jan 7 04:22:14 mx postgres[679]: query: insert into statistic (date, visit_count, variant_id) values (now(), 1, 2)
Jan 7 04:22:14 mx postgres[631]: query: insert into statistic (date, visit_count, variant_id) values (now(), 1, 2)
Jan 7 04:22:14 mx postgres[700]: query: insert into statistic (date, visit_count, variant_id) values (now(), 1, 2)
Jan 7 04:22:14 mx postgres[665]: query: insert into statistic (date, visit_count, variant_id) values (now(), 1, 2)
Jan 7 04:22:14 mx postgres[633]: query: insert into statistic (date, visit_count, variant_id) values (now(), 1, 2)
Jan 7 04:22:14 mx postgres[629]: query: insert into statistic (date, visit_count, variant_id) values (now(), 1, 2)
Jan 7 04:22:14 mx postgres[736]: query: commit
Jan 7 04:22:14 mx postgres[736]: ProcessUtility: commit
Jan 7 04:22:14 mx postgres[700]: ERROR: Cannot insert a duplicate key into unique index statistic_date_vid_key
Jan 7 04:22:14 mx postgres[700]: query: update users set rcpt_ip='213.75.35.129',rcptdate=now() where id=1428067
Jan 7 04:22:14 mx postgres[700]: NOTICE: current transaction is aborted, queries ignored until end of transaction block
Jan 7 04:22:14 mx postgres[679]: query: commit
Jan 7 04:22:14 mx postgres[679]: ProcessUtility: commit
Jan 7 04:22:14 mx postgres[679]: query: update users set rcpt_ip='213.75.55.185',rcptdate=now() where id=1430836
Jan 7 04:22:14 mx postgres[665]: ERROR: Cannot insert a duplicate key into unique index statistic_date_vid_key
Jan 7 04:22:14 mx postgres[665]: query: update users set rcpt_ip='202.156.121.139',rcptdate=now() where id=1271397
Jan 7 04:22:14 mx postgres[665]: NOTICE: current transaction is aborted, queries ignored until end of transaction block
Jan 7 04:22:14 mx postgres[631]: ERROR: Cannot insert a duplicate key into unique index statistic_date_vid_key
Jan 7 04:22:14 mx postgres[631]: query: update users set rcpt_ip='24.20.53.63',rcptdate=now() where id=1451254
Jan 7 04:22:14 mx postgres[631]: NOTICE: current transaction is aborted, queries ignored until end of transaction block
Jan 7 04:22:14 mx postgres[633]: ERROR: Cannot insert a duplicate key into unique index statistic_date_vid_key
Jan 7 04:22:14 mx postgres[633]: query: update users set rcpt_ip='213.116.168.173',rcptdate=now() where id=1378049
Jan 7 04:22:14 mx postgres[633]: NOTICE: current transaction is aborted, queries ignored until end of transaction block
Jan 7 04:22:14 mx postgres[630]: query: select id,msg,next from alert
Jan 7 04:22:14 mx postgres[630]: query: select email,type from email where variant_id=2
Jan 7 04:22:14 mx postgres[630]: query: select * from users where senderdate > now()-'10days'::interval AND variant_id=2AND crypt='21AN6KRffJdFRFc511'
Jan 7 04:22:14 mx postgres[629]: ERROR: Cannot insert a duplicate key into unique index statistic_date_vid_key
Jan 7 04:22:14 mx postgres[629]: query: update users set rcpt_ip='213.42.45.81',rcptdate=now() where id=1441046
Jan 7 04:22:14 mx postgres[629]: NOTICE: current transaction is aborted, queries ignored until end of transaction block
Jan 7 04:22:15 mx postgres[711]: query: select message_id from pop3 where server_id = 17746
Jan 7 04:22:15 mx postgres[711]: ERROR: Relation 'pop3' does not exist

They popped up 4 minutes before. And the most interesting thing is that the
relation pop3 does exist!

> > And the last query was:
> > Jan 7 04:27:53 mx postgres[1008]: query: select message_id from pop3
> > where server_id = 22615
>
> How about the prior queries of other processes?

I do not want to flood the mailing list (it would be too much info). I can
send you the complete log file from Jan 7. It is 128MB uncompressed; with gz
it is 8MB. Maybe it will be smaller with bz2.

> Keep in mind that the
> spinlock could have been left locked by any backend, not only the one
> that complained about it.

Actually, you can have a look at the logs yourself. Remember I gave you the
password for the postgres user. This is the same postgres. Logs are in
/var/log/postgres. You will need postgres.log.1.gz.

-- 
Sincerely Yours, Denis Perchine
----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
Denis Perchine <dyp@perchine.com> writes:
>>>>>>> FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
>>>>>
>>>>> Were there any errors before that?

> Actually, you can have a look at the logs yourself.

Well, I found a smoking gun:

Jan 7 04:27:51 mx postgres[2501]: FATAL 1: The system is shutting down

PID 2501 had been running:

Jan 7 04:25:44 mx postgres[2501]: query: vacuum verbose lazy;

What seems to have happened is that 2501 curled up and died, leaving one or
more buffer spinlocks locked. Roughly one spinlock timeout later, at
04:29:07, we have 1008 complaining of a stuck spinlock. So that fits.

The real question is what happened to 2501? None of the other backends
reported a SIGTERM signal, so the signal did not come from the postmaster.

Another interesting datapoint: there is a second place in this logfile where
one single backend reports SIGTERM while its brethren keep running:

Jan 7 04:30:47 mx postgres[4269]: query: vacuum verbose;
...
Jan 7 04:38:16 mx postgres[4269]: FATAL 1: The system is shutting down

There is something pretty fishy about this. You aren't by any chance running
the postmaster under a ulimit setting that might cut off individual backends
after a certain amount of CPU time, are you? What signal does a ulimit
violation deliver on your machine, anyway?

regards, tom lane
On Mon, Jan 08, 2001 at 12:21:38PM -0500, Tom Lane wrote:
> Denis Perchine <dyp@perchine.com> writes:
> >>>>>>> FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
> >>>>>
> >>>>> Were there any errors before that?
>
> > Actually, you can have a look at the logs yourself.
>
> Well, I found a smoking gun: ...
> What seems to have happened is that 2501 curled up and died, leaving
> one or more buffer spinlocks locked. ...
> There is something pretty fishy about this. You aren't by any chance
> running the postmaster under a ulimit setting that might cut off
> individual backends after a certain amount of CPU time, are you?
> What signal does a ulimit violation deliver on your machine, anyway?

It's worth noting here that modern Unixes run around killing user-level
processes more or less at random when free swap space (and sometimes just
RAM) runs low. AIX was the first such, but would send SIGDANGER to processes
first to try to reclaim some RAM; critical daemons were expected to
explicitly ignore SIGDANGER. Other Unixes picked up the idea without picking
up the SIGDANGER behavior.

This common pathological behavior is usually traced to sloppy resource
accounting: the bad policy of having malloc() (and sbrk() or mmap()
underneath) return a valid pointer rather than NULL, on the assumption that
most of the memory asked for won't be used just yet. As a result, the system
doesn't know how much memory is really available at any given moment. The
problem is usually explained with the example of a very large process that
forks, suddenly demanding twice as much memory. (Apache is particularly
egregious this way, allocating lots of memory and then forking several
times.) Instead of failing the fork, the kernel waits for a process to touch
memory it was granted, then checks whether any RAM/swap has turned up to
satisfy it, and kills the process (or some random other process!) if not.
Now that programs have come to depend on this behavior, it has become very
hard to fix. The implication for the rest of us is that we should expect our
processes to be killed at random, just for touching memory they were
granted, or for no reason at all. (Kernel people say, "They're just
user-level programs, restart them," or, "Maybe we can designate some
critical processes that don't get killed.") In Linux they try to invent
heuristics to avoid killing the X server, because so many programs depend on
it. It's a disgraceful mess, really.

The relevance to the issue at hand is that processes dying during heavy
memory load is a documented feature of our supported platforms.

Nathan Myers ncm@zembu.com
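To make the overcommit behavior Nathan describes concrete, here is a
minimal, stand-alone C sketch. The 64 GB request size and the page-touching
loop are arbitrary illustrative choices (and assume a 64-bit machine); the
exact outcome depends on the kernel's overcommit settings, but the point is
that the failure shows up at page-touch time, not at malloc() time:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Ask for far more memory than the machine is likely to have.
     * With optimistic overcommit, malloc() typically still returns a
     * valid pointer instead of NULL. */
    size_t request = (size_t) 64 * 1024 * 1024 * 1024;   /* 64 GB */
    char  *p = malloc(request);

    if (p == NULL)
    {
        printf("malloc failed up front -- no overcommit surprise here\n");
        return 1;
    }
    printf("malloc succeeded; now touching the pages...\n");

    /* Only when pages are actually touched must the kernel find real
     * RAM/swap; if it cannot, the OOM killer may SIGKILL this process
     * (or an unrelated one) rather than fail any library call. */
    for (size_t i = 0; i < request; i += 4096)
        p[i] = 1;

    printf("survived touching %zu bytes\n", request);
    free(p);
    return 0;
}

The process gets no error return it could handle gracefully; it is simply
killed, which is why the postmaster's crash-recovery path matters here.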
> > Well, I found a smoking gun: ...
> > What seems to have happened is that 2501 curled up and died, leaving
> > one or more buffer spinlocks locked. ...
> > There is something pretty fishy about this. You aren't by any chance
> > running the postmaster under a ulimit setting that might cut off
> > individual backends after a certain amount of CPU time, are you?
> > What signal does a ulimit violation deliver on your machine, anyway?
>
> It's worth noting here that modern Unixes run around killing user-level
> processes more or less at random when free swap space (and sometimes
> just RAM) runs low. AIX was the first such, but would send SIGDANGER
> to processes first to try to reclaim some RAM; critical daemons were
> expected to explicitly ignore SIGDANGER. Other Unixes picked up the
> idea without picking up the SIGDANGER behavior.

That's not the case for sure. There are 512MB on the machine, and when I had
this problem it was completely unloaded (>300MB in caches).

-- 
Sincerely Yours, Denis Perchine
----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
Denis Perchine <dyp@perchine.com> writes:
>> It's worth noting here that modern Unixes run around killing user-level
>> processes more or less at random when free swap space (and sometimes
>> just RAM) runs low.

> That's not the case for sure. There are 512MB on the machine, and when I
> had this problem it was completely unloaded (>300MB in caches).

The fact that VACUUM processes seemed to be preferential victims suggests a
resource limit of some sort. I had suggested a CPU-time limit, but perhaps
it could also be disk-pages-written.

regards, tom lane
On Monday 08 January 2001 23:21, Tom Lane wrote:
> Denis Perchine <dyp@perchine.com> writes:
> >>>>>>> FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
> >>>>>
> >>>>> Were there any errors before that?
> >
> > Actually, you can have a look at the logs yourself.
>
> Well, I found a smoking gun:
>
> Jan 7 04:27:51 mx postgres[2501]: FATAL 1: The system is shutting down
>
> PID 2501 had been running:
>
> Jan 7 04:25:44 mx postgres[2501]: query: vacuum verbose lazy;

Hmmm... actually this is a real problem with vacuum lazy. Sometimes it just
does something for an enormous amount of time (I have mailed a sample
database to Vadim, but did not get any response yet). It is possible that it
was me who killed the backend.

> What seems to have happened is that 2501 curled up and died, leaving
> one or more buffer spinlocks locked. Roughly one spinlock timeout
> later, at 04:29:07, we have 1008 complaining of a stuck spinlock.
> So that fits.
>
> The real question is what happened to 2501? None of the other backends
> reported a SIGTERM signal, so the signal did not come from the
> postmaster.
>
> Another interesting datapoint: there is a second place in this logfile
> where one single backend reports SIGTERM while its brethren keep running:
>
> Jan 7 04:30:47 mx postgres[4269]: query: vacuum verbose;
> ...
> Jan 7 04:38:16 mx postgres[4269]: FATAL 1: The system is shutting down

Hmmm... Maybe this also was me... But I am not sure here.

> There is something pretty fishy about this. You aren't by any chance
> running the postmaster under a ulimit setting that might cut off
> individual backends after a certain amount of CPU time, are you?

[postgres@mx postgres]$ ulimit -a
core file size (blocks)     1000000
data seg size (kbytes)      unlimited
file size (blocks)          unlimited
max memory size (kbytes)    unlimited
stack size (kbytes)         8192
cpu time (seconds)          unlimited
max user processes          2048
pipe size (512 bytes)       8
open files                  1024
virtual memory (kbytes)     2105343

No, there are no ulimits.

> What signal does a ulimit violation deliver on your machine, anyway?

    if (psecs / HZ > p->rlim[RLIMIT_CPU].rlim_cur) {
            /* Send SIGXCPU every second.. */
            if (!(psecs % HZ))
                    send_sig(SIGXCPU, p, 1);
            /* and SIGKILL when we go over max.. */
            if (psecs / HZ > p->rlim[RLIMIT_CPU].rlim_max)
                    send_sig(SIGKILL, p, 1);
    }

This part of the kernel shows the logic. It means a process will get SIGXCPU
every second while it is over the soft limit, and SIGKILL once it goes over
the hard limit.

-- 
Sincerely Yours, Denis Perchine
----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
Denis Perchine <dyp@perchine.com> writes:
> Hmmm... actually this is a real problem with vacuum lazy. Sometimes it
> just does something for an enormous amount of time (I have mailed a
> sample database to Vadim, but did not get any response yet). It is
> possible that it was me who killed the backend.

Killing an individual backend with SIGTERM is bad luck. The backend will
assume that it's being killed by the postmaster, and will exit without a
whole lot of concern for cleaning up shared memory --- the expectation is
that as soon as all the backends are dead, the postmaster will reinitialize
shared memory. You can get away with sending SIGINT (QueryCancel) to an
individual backend. Anything else voids the warranty ;=)

But, having said that --- this VACUUM process had only been running for two
minutes of real time. Seems unlikely that you'd have chosen to kill it so
quickly.

regards, tom lane
> Killing an individual backend with SIGTERM is bad luck. The backend
> will assume that it's being killed by the postmaster, and will exit
> without a whole lot of concern for cleaning up shared memory --- the

What code will be returned to the postmaster in this case?

Vadim
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes: >> Killing an individual backend with SIGTERM is bad luck. The backend >> will assume that it's being killed by the postmaster, and will exit >> without a whole lot of concern for cleaning up shared memory --- the > What code will be returned to postmaster in this case? Right at the moment, the backend will exit with status 0. I think you are thinking the same thing I am: maybe a backend that receives SIGTERM ought to exit with nonzero status. That would mean that killing an individual backend would instantly translate into an installation-wide restart. I am not sure whether that's a good idea. Perhaps this cure is worse than the disease. Comments anyone? regards, tom lane
> >> Killing an individual backend with SIGTERM is bad luck.
> >> The backend will assume that it's being killed by the postmaster,
> >> and will exit without a whole lot of concern for cleaning up shared
> >> memory --- the

SIGTERM --> die() --> elog(FATAL)

Is it true that elog(FATAL) doesn't clean up shmem etc.? This would be very
bad...

> > What code will be returned to the postmaster in this case?
>
> Right at the moment, the backend will exit with status 0. I think you
> are thinking the same thing I am: maybe a backend that
> receives SIGTERM ought to exit with nonzero status.
>
> That would mean that killing an individual backend would instantly
> translate into an installation-wide restart. I am not sure whether
> that's a good idea. Perhaps this cure is worse than the disease.

Well, it's not a good idea, because SIGTERM is used for ABORT + EXIT
(pg_ctl -m fast stop) -- but shouldn't ABORT clean up everything?

Vadim
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes: >>>>> Killing an individual backend with SIGTERM is bad luck. > SIGTERM --> die() --> elog(FATAL) > Is it true that elog(FATAL) doesn't clean up shmem etc? > This would be very bad... It tries, but I don't think it's possible to make a complete guarantee without an unreasonable amount of overhead. The case at hand was a stuck spinlock because die() --> elog(FATAL) had neglected to release that particular spinlock before exiting. To guarantee that all spinlocks will be released by die(), we'd need something like START_CRIT_SECTION;S_LOCK(spinlock);record that we own spinlock;END_CRIT_SECTION; around every existing S_LOCK() call, and the reverse around every S_UNLOCK. Are you willing to pay that kind of overhead? I'm not sure this'd be enough anyway. Guaranteeing that you have consistent state at every instant that an ISR could interrupt you is not easy. regards, tom lane
* Mikheev, Vadim <vmikheev@SECTORBASE.COM> [010108 23:08] wrote:
> > >> Killing an individual backend with SIGTERM is bad luck.
> > >> The backend will assume that it's being killed by the postmaster,
> > >> and will exit without a whole lot of concern for cleaning up shared
> > >> memory --- the
>
> SIGTERM --> die() --> elog(FATAL)
>
> Is it true that elog(FATAL) doesn't clean up shmem etc.? This would be
> very bad...
>
> > > What code will be returned to the postmaster in this case?
> >
> > Right at the moment, the backend will exit with status 0. I think you
> > are thinking the same thing I am: maybe a backend that
> > receives SIGTERM ought to exit with nonzero status.
> >
> > That would mean that killing an individual backend would instantly
> > translate into an installation-wide restart. I am not sure whether
> > that's a good idea. Perhaps this cure is worse than the disease.
>
> Well, it's not a good idea, because SIGTERM is used for ABORT + EXIT
> (pg_ctl -m fast stop) -- but shouldn't ABORT clean up everything?

Er, shouldn't ABORT leave the system in the exact state that it's in, so
that one can get a crashdump/traceback on a wedged process without it trying
to clean up after itself?

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
> > Well, it's not a good idea, because SIGTERM is used for ABORT + EXIT
> > (pg_ctl -m fast stop) -- but shouldn't ABORT clean up everything?
>
> Er, shouldn't ABORT leave the system in the exact state that it's
> in so that one can get a crashdump/traceback on a wedged process
> without it trying to clean up after itself?

Sorry, I meant "transaction abort"...

Vadim
ncm@zembu.com (Nathan Myers) writes:
> The relevance to the issue at hand is that processes dying during
> heavy memory load is a documented feature of our supported platforms.

Ugh. Do you know anything about *how* they get killed --- ie, with what
signal?

regards, tom lane
> > Is it true that elog(FATAL) doesn't clean up shmem etc.?
> > This would be very bad...
>
> It tries, but I don't think it's possible to make a complete guarantee
> without an unreasonable amount of overhead. The case at hand was a
> stuck spinlock because die() --> elog(FATAL) had neglected to release
> that particular spinlock before exiting. To guarantee that all
> spinlocks will be released by die(), we'd need something like
>
>     START_CRIT_SECTION;
>     S_LOCK(spinlock);
>     record that we own spinlock;
>     END_CRIT_SECTION;
>
> around every existing S_LOCK() call, and the reverse around every
> S_UNLOCK. Are you willing to pay that kind of overhead? I'm not

START_/END_CRIT_SECTION is mostly CritSectionCount++/--. The recording could
be done as LockedSpinLocks[LockedSpinCounter++] = &spinlock in a
pre-allocated array.

Another way of implementing Transaction Abort + Exit could be some flag in
shmem set by the postmaster, plus QueryCancel..?

> sure this'd be enough anyway. Guaranteeing that you have consistent
> state at every instant that an ISR could interrupt you is not easy.

Agreed, but we have to forget the old happy days when it was so easy to shut
down the DB. If we aren't able to release spins (e.g. an exclusive buffer
lock) on Abort+Exit, then instead of a fast shutdown by pg_ctl -m fast stop,
people can get the checkpointer stuck trying to share-lock that buffer.
(BTW, it's bad that pg_ctl doesn't wait on shutdown by default.)

Vadim
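For illustration, here is a minimal, self-contained sketch of the
bookkeeping being discussed: a counter plus a pre-allocated array of held
spinlocks that an elog(FATAL) path could walk to release anything still
locked. All of the names (slock_t, CritSectionCount, LockedSpinLocks,
release_held_spinlocks) and the trivial S_LOCK/S_UNLOCK stand-ins are
assumptions made for the sketch, not the actual PostgreSQL source:

#include <stdio.h>

/* Simplified stand-ins; the real definitions live elsewhere in the backend. */
typedef volatile int slock_t;

#define S_LOCK(lock)    (*(lock) = 1)   /* pretend we spin and win immediately */
#define S_UNLOCK(lock)  (*(lock) = 0)

#define MAX_HELD_SPINS 8

static int      CritSectionCount = 0;
static slock_t *LockedSpinLocks[MAX_HELD_SPINS];
static int      LockedSpinCounter = 0;

#define START_CRIT_SECTION  (CritSectionCount++)
#define END_CRIT_SECTION    (CritSectionCount--)

static void
spin_acquire(slock_t *lock)
{
    START_CRIT_SECTION;                          /* hold off die() while we ... */
    S_LOCK(lock);                                /* ... take the lock and ... */
    LockedSpinLocks[LockedSpinCounter++] = lock; /* ... remember that we own it */
    END_CRIT_SECTION;
}

static void
spin_release(slock_t *lock)
{
    START_CRIT_SECTION;
    LockedSpinCounter--;                         /* assumes LIFO acquire/release */
    S_UNLOCK(lock);
    END_CRIT_SECTION;
}

/* What an elog(FATAL)/die() path could do with the bookkeeping: release
 * whatever is still held, so other backends don't hit a "stuck spinlock"
 * one timeout later. */
static void
release_held_spinlocks(void)
{
    while (LockedSpinCounter > 0)
        S_UNLOCK(LockedSpinLocks[--LockedSpinCounter]);
}

int
main(void)
{
    slock_t buf_cntx_lock = 0;

    /* normal path: matched acquire/release */
    spin_acquire(&buf_cntx_lock);
    spin_release(&buf_cntx_lock);

    /* failure path: acquire, then pretend elog(FATAL) fires before release */
    spin_acquire(&buf_cntx_lock);
    release_held_spinlocks();
    printf("spinlock state after cleanup: %d\n", buf_cntx_lock);
    return 0;
}

The per-lock cost really is just an increment, a decrement, and one array
store, which is the point Vadim is making about the overhead.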
> > The relevance to the issue at hand is that processes dying during
> > heavy memory load is a documented feature of our supported platforms.
>
> Ugh. Do you know anything about *how* they get killed --- ie, with
> what signal?

Didn't you get my mail with a piece of Linux kernel code? I think all is
clear there.

-- 
Sincerely Yours, Denis Perchine
----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes: > START_/END_CRIT_SECTION is mostly CritSectionCount++/--. > Recording could be made as LockedSpinLocks[LockedSpinCounter++] = &spinlock > in pre-allocated array. Yeah, I suppose. We already do record locking of all the fixed spinlocks (BufMgrLock etc), it's just the per-buffer spinlocks that are missing from that (and CRIT_SECTION calls). Would it be reasonable to assume that only one buffer spinlock could be held at a time? > (BTW, it's bad that pg_ctl doesn't wait on shutdown by default). I agree. Anyone object to changing pg_ctl to do -w by default? What should we call the switch to tell it to not wait? -n? regards, tom lane
Denis Perchine <dyp@perchine.com> writes:
> Didn't you get my mail with a piece of Linux kernel code? I think all is
> clear there.

That was implementing CPU-time-exceeded kill, which is a different issue.

regards, tom lane
> > Didn't you get my mail with a piece of Linux kernel code? I think all
> > is clear there.
>
> That was implementing CPU-time-exceeded kill, which is a different
> issue.

Oops... You are talking about the OOM killer.

    /* This process has hardware access, be more careful. */
    if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO)) {
            force_sig(SIGTERM, p);
    } else {
            force_sig(SIGKILL, p);
    }

You will get SIGKILL in most cases.

-- 
Sincerely Yours, Denis Perchine
----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
> > START_/END_CRIT_SECTION is mostly CritSectionCount++/--.
> > The recording could be done as
> > LockedSpinLocks[LockedSpinCounter++] = &spinlock
> > in a pre-allocated array.
>
> Yeah, I suppose. We already do record locking of all the fixed
> spinlocks (BufMgrLock etc), it's just the per-buffer spinlocks that
> are missing from that (and CRIT_SECTION calls). Would it be
> reasonable to assume that only one buffer spinlock could be held
> at a time?

No. UPDATE holds two spins, and a btree split even more. But wait -- AFAIR
the bufmgr remembers locked buffers; probably we could just add
XXX_CRIT_SECTION to LockBuffer..?

Vadim
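A compact sketch of what Vadim is suggesting for LockBuffer: the per-buffer
cntx_lock spinlock is only taken and released inside LockBuffer itself, so
bracketing that short stretch with a critical section means die() cannot
fire while it is held, and there is nothing to record afterwards. The
BufferDesc layout, field names, and the trivial S_LOCK/S_UNLOCK macros here
are simplified assumptions, not the real bufmgr code:

#include <stdio.h>

/* Simplified stand-ins for the real bufmgr structures and spinlock macros. */
typedef volatile int slock_t;

typedef struct BufferDesc
{
    slock_t cntx_lock;      /* protects the fields below */
    int     r_locks;        /* number of shared holders */
    int     w_lock;         /* nonzero if exclusively locked */
} BufferDesc;

static int CritSectionCount = 0;

#define START_CRIT_SECTION  (CritSectionCount++)
#define END_CRIT_SECTION    (CritSectionCount--)
#define S_LOCK(l)   (*(l) = 1)      /* stand-in for the real spin loop */
#define S_UNLOCK(l) (*(l) = 0)

static void
LockBufferShared(BufferDesc *buf)
{
    START_CRIT_SECTION;             /* signals held off from here ... */
    S_LOCK(&buf->cntx_lock);
    buf->r_locks++;
    S_UNLOCK(&buf->cntx_lock);
    END_CRIT_SECTION;               /* ... to here, so the spinlock cannot leak */
}

int
main(void)
{
    BufferDesc buf = {0, 0, 0};

    LockBufferShared(&buf);
    printf("r_locks = %d, cntx_lock = %d\n", buf.r_locks, buf.cntx_lock);
    return 0;
}

The longer-lived buffer lock itself (r_locks/w_lock) is ordinary shared
state that the bufmgr already tracks per backend, which is why only the
brief spinlock hold needs this treatment.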
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes: >> Yeah, I suppose. We already do record locking of all the fixed >> spinlocks (BufMgrLock etc), it's just the per-buffer spinlocks that >> are missing from that (and CRIT_SECTION calls). Would it be >> reasonable to assume that only one buffer spinlock could be held >> at a time? > No. UPDATE holds two spins, btree split even more. > But stop - afair bufmgr remembers locked buffers, probably > we could just add XXX_CRIT_SECTION to LockBuffer..? Right. A buffer lock isn't a spinlock, ie, we don't hold the spinlock except within LockBuffer. So a quick CRIT_SECTION should deal with that. Actually, with careful placement of CRIT_SECTION calls in LockBuffer, there's no need to record holding the buffer's cntxt spinlock at all, I think. Will work on it. regards, tom lane
Denis Perchine <dyp@perchine.com> writes:
> You will get SIGKILL in most cases.

Well, a SIGKILL will cause the postmaster to shut down and restart the other
backends, so we should be safe if that happens. (Annoyed as heck, maybe, but
safe.)

Anyway, this is looking more and more like the SIGTERM that caused your
vacuum to die must have been done manually. The CRIT_SECTION code that I'm
about to go off and add to spinlocking should prevent similar problems from
happening in 7.1, but I don't think it's reasonable to try to retrofit that
into 7.0.*.

regards, tom lane
On Wed, Jan 10, 2001 at 12:46:50AM +0600, Denis Perchine wrote:
> > > Didn't you get my mail with a piece of Linux kernel code? I think all
> > > is clear there.
> >
> > That was implementing CPU-time-exceeded kill, which is a different
> > issue.
>
> Oops... You are talking about the OOM killer.
>
>     /* This process has hardware access, be more careful. */
>     if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO)) {
>             force_sig(SIGTERM, p);
>     } else {
>             force_sig(SIGKILL, p);
>     }
>
> You will get SIGKILL in most cases.

... on Linux, anyhow. There's no standard for this behavior. Probably others
try a SIGTERM first (on several processes) and then a SIGKILL if none die.

If a backend dies while holding a lock, doesn't that imply that the shared
memory may be in an inconsistent state? Surely a death while holding a lock
should shut down the whole database, without writing anything to disk.

Nathan Myers ncm@zembu.com
ncm@zembu.com (Nathan Myers) writes:
> If a backend dies while holding a lock, doesn't that imply that
> the shared memory may be in an inconsistent state?

Yup. I had just come to the realization that we'd be best off to treat the
*entire* period from SpinAcquire to SpinRelease as a critical section for
the purposes of die(). That is, response to SIGTERM will be held off until
we have released the spinlock.

Most of the places where we grab spinlocks would have to make such a
critical section anyway, at least for large parts of the time that they are
holding the spinlock, because they are manipulating shared data structures
and the instantaneous intermediate states aren't always self-consistent. So
we might as well follow the KISS principle and just do START_CRIT_SECTION in
SpinAcquire and END_CRIT_SECTION in SpinRelease.

Vadim, any objection?

regards, tom lane
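A small, self-contained sketch of how that could behave, under stated
assumptions: SpinAcquire/SpinRelease bracket the whole hold period with a
critical section, and a die() handler that fires in between only records the
request, with the exit happening once the count drops back to zero. The
names and the gcc __sync builtins used for the spin itself are illustrative
choices, not the actual 7.1 implementation:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

typedef volatile int slock_t;

static volatile sig_atomic_t CritSectionCount = 0;
static volatile sig_atomic_t ExitPending = 0;

#define START_CRIT_SECTION  (CritSectionCount++)
#define END_CRIT_SECTION \
    do { \
        if (--CritSectionCount == 0 && ExitPending) \
            exit(1);                /* now it is safe to die */ \
    } while (0)

static void
SpinAcquire(slock_t *lock)
{
    START_CRIT_SECTION;             /* hold off die() for the whole hold period */
    while (__sync_lock_test_and_set(lock, 1))
        ;                           /* spin until the lock is ours */
}

static void
SpinRelease(slock_t *lock)
{
    __sync_lock_release(lock);
    END_CRIT_SECTION;               /* a deferred SIGTERM takes effect here */
}

static void
die(int sig)
{
    (void) sig;
    if (CritSectionCount > 0)
    {
        ExitPending = 1;            /* remember it; don't exit with a lock held */
        return;
    }
    exit(1);
}

int
main(void)
{
    static slock_t lock = 0;

    signal(SIGTERM, die);

    SpinAcquire(&lock);
    raise(SIGTERM);                 /* simulated kill while the spinlock is held */
    printf("still alive inside the critical section\n");
    SpinRelease(&lock);             /* exit happens here, lock already released */

    printf("never reached\n");
    return 0;
}

Run as-is, the simulated SIGTERM inside the critical section is deferred,
and the process only exits at SpinRelease, after the spinlock is already
free -- which is exactly the property that prevents the "stuck spinlock"
abort seen at the start of this thread.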
> Yup. I had just come to the realization that we'd be best
> off to treat the *entire* period from SpinAcquire to SpinRelease
> as a critical section for the purposes of die(). That is, response
> to SIGTERM will be held off until we have released the spinlock.
> Most of the places where we grab spinlocks would have to make such
> a critical section anyway, at least for large parts of the time that
> they are holding the spinlock, because they are manipulating shared
> data structures and the instantaneous intermediate states aren't always
> self-consistent. So we might as well follow the KISS principle and
> just do START_CRIT_SECTION in SpinAcquire and END_CRIT_SECTION in
> SpinRelease.
>
> Vadim, any objection?

None for the moment. If we just add XXX_CRIT_SECTION to the SpinXXX funcs
without changing anything else, then it will be easy to remove them later
(in the event we find any problems with this), so - do it.

Vadim