Thread: Postgres processes getting stuck (bug?)
Hi guys,
I had a question. We're running Postgresql 9.0.7 , and all of a sudden we started seeing unexpected behavior. One table got locked after we ran one simple INSERT on one test item.
Nothing else was running against that table as far as we know, and the query is frozen there now. After some time we tried to terminate it forcefully (pg_terminate_backend), ran that and the return of the command is "true" but the query still stays on, shows up on pg_stat_activity etc.
Reading from the table is fine, but we believe any other inserts / updates are not possible.
Restarting the Postgresql server is not an option at this moment, since it's a production box and another high-demand database is running from that.
Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where simple statements don't finish and get locked there and can't be killed by the pg_terminate_backed ? What is the cause of this ?
Any reasonable way to find out more details on what caused this, how to prevent it in the future, and how it can be fixed sensitively now ? Thanks !
just fyi, checking the data in the "pg_locks" (for that process that is frozen), shows a bunch of rows all with the same virtualtransactionid. Only one of them has the mode of "ExclusiveLock" (the only record there with a locktype of "virtualxid"), a few have the "RowExclusiveLock" mode and the vast majority have the "AccessShareLock" mode.
Thanks,
Ciprian
Check the wait state for the backend process. I’ve seen this happen when a process ran a per-row trigger that tried to connect to something else, and it filled the ip_conntrack table. So, kernel level wait along with a whole bunch of locks on the table in question.
Running pg_terminate_backend() didn’t work, as the signal queued behind the kernel wait. We had to bounce the database to get rid of the problem. Immediately afterwards we disabled the trigger.
HTH,
Bob Lunney
On Apr 29, 2016, at 1:30 PM, Ciprian Grigoras <ciprian.grigoras@vitals.com> wrote:Hi guys,I had a question. We're running Postgresql 9.0.7 , and all of a sudden we started seeing unexpected behavior. One table got locked after we ran one simple INSERT on one test item.Nothing else was running against that table as far as we know, and the query is frozen there now. After some time we tried to terminate it forcefully (pg_terminate_backend), ran that and the return of the command is "true" but the query still stays on, shows up on pg_stat_activity etc.Reading from the table is fine, but we believe any other inserts / updates are not possible.Restarting the Postgresql server is not an option at this moment, since it's a production box and another high-demand database is running from that.Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where simple statements don't finish and get locked there and can't be killed by the pg_terminate_backed ? What is the cause of this ?Any reasonable way to find out more details on what caused this, how to prevent it in the future, and how it can be fixed sensitively now ? Thanks !just fyi, checking the data in the "pg_locks" (for that process that is frozen), shows a bunch of rows all with the same virtualtransactionid. Only one of them has the mode of "ExclusiveLock" (the only record there with a locktype of "virtualxid"), a few have the "RowExclusiveLock" mode and the vast majority have the "AccessShareLock" mode.Thanks,Ciprian
On 4/29/2016 10:30 AM, Ciprian Grigoras wrote:
We're running Postgresql 9.0.7...
do note, the 9.0 series got up to 9.0.23 before it was discontinued last year. 9.0.7 was released in February 2012, 4+ years ago.
9.0.18 fixed some index corruption bugs in GIST indexes, 9.0.15 fixed a bunch of other data corruption problems, 9.0.13 fixed yet more GiST index problems. of course, every one of these incremental updates fixed dozens of relatively obscure bugs, you'd need to read the release notes for each version between 9.0.8 and 9.0.23 for the complete list.
-- john r pierce, recycling bits in santa cruz
Ciprian Grigoras <ciprian.grigoras@vitals.com> writes: > Hi guys, > > I had a question. We're running Postgresql 9.0.7 , and all of a sudden we started seeing unexpected behavior. One tablegot locked after we ran one simple INSERT on one > test item. > Nothing else was running against that table as far as we know, and the query is frozen there now. After some time we triedto terminate it forcefully > (pg_terminate_backend), ran that and the return of the command is "true" but the query still stays on, shows up on pg_stat_activityetc. > Reading from the table is fine, but we believe any other inserts / updates are not possible. What do you get from strace -p $pid? > Restarting the Postgresql server is not an option at this moment, since it's a production box and another high-demand databaseis running from that. > > Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where simple statements don't finish and get lockedthere and can't be killed by the > pg_terminate_backed ? What is the cause of this ? > Any reasonable way to find out more details on what caused this, how to prevent it in the future, and how it can be fixedsensitively now ? Thanks ! > > just fyi, checking the data in the "pg_locks" (for that process that is frozen), shows a bunch of rows all with the samevirtualtransactionid. Only one of them has the > mode of "ExclusiveLock" (the only record there with a locktype of "virtualxid"), a few have the "RowExclusiveLock" modeand the vast majority have the "AccessShareLock" > mode. > > Thanks, > Ciprian > -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consulting@comcast.net p: 312.241.7800
Thanks everyone, it could be that the Linux server is problematic (it's got a long uptime).
I ran the strace thing,
and it kept waiting there with minimal information, nothing else was shown...
|
On Fri, Apr 29, 2016 at 2:33 PM, Jerry Sievers <gsievers19@comcast.net> wrote:
Ciprian Grigoras <ciprian.grigoras@vitals.com> writes:
> Hi guys,
>
> I had a question. We're running Postgresql 9.0.7 , and all of a sudden we started seeing unexpected behavior. One table got locked after we ran one simple INSERT on one
> test item.
> Nothing else was running against that table as far as we know, and the query is frozen there now. After some time we tried to terminate it forcefully
> (pg_terminate_backend), ran that and the return of the command is "true" but the query still stays on, shows up on pg_stat_activity etc.
> Reading from the table is fine, but we believe any other inserts / updates are not possible.
What do you get from strace -p $pid?--
> Restarting the Postgresql server is not an option at this moment, since it's a production box and another high-demand database is running from that.
>
> Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where simple statements don't finish and get locked there and can't be killed by the
> pg_terminate_backed ? What is the cause of this ?
> Any reasonable way to find out more details on what caused this, how to prevent it in the future, and how it can be fixed sensitively now ? Thanks !
>
> just fyi, checking the data in the "pg_locks" (for that process that is frozen), shows a bunch of rows all with the same virtualtransactionid. Only one of them has the
> mode of "ExclusiveLock" (the only record there with a locktype of "virtualxid"), a few have the "RowExclusiveLock" mode and the vast majority have the "AccessShareLock"
> mode.
>
> Thanks,
> Ciprian
>
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800
Ciprian Grigoras Software Engineer P : 201.459.6275 | M : 845.517.8710 |
Attachment
On Fri, Apr 29, 2016 at 1:09 PM, John R Pierce <pierce@hogranch.com> wrote: > On 4/29/2016 10:30 AM, Ciprian Grigoras wrote: > > We're running Postgresql 9.0.7... > > > do note, the 9.0 series got up to 9.0.23 before it was discontinued last > year. 9.0.7 was released in February 2012, 4+ years ago. > > 9.0.18 fixed some index corruption bugs in GIST indexes, 9.0.15 fixed a > bunch of other data corruption problems, 9.0.13 fixed yet more GiST index > problems. of course, every one of these incremental updates fixed > dozens of relatively obscure bugs, you'd need to read the release notes for > each version between 9.0.8 and 9.0.23 for the complete list. +1 this -- OP is running EOL version of postgres minus 16 or so bugfix releases. merlin
Ciprian Grigoras wrote: > Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where > simple statements don't finish and get locked there and can't be killed by > the pg_terminate_backed ? What is the cause of this ? Interesting. I wonder if you just wrapped around pg_multixact and the backend is stuck in the loop there. I have never heard of anyone with this problem, and wraparound is supposed to work fine in 9.0. (As I recall that code is there since 8.1 so it'd be very surprising that it would have bugs there and not notice all this time). Since you've been evidently running with this for years and never seen this problem, I would be surprised if you ever see it again. Still, if it does happen, please do grab a stack trace with GDB on the blocked processes. Make sure to have debug symbols. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services