Re: BUG #12785: server process (PID 2872) was terminated by exception 0xC0000005 - Mailing list pgsql-bugs
From | Francisco Olarte |
---|---|
Subject | Re: BUG #12785: server process (PID 2872) was terminated by exception 0xC0000005 |
Date | |
Msg-id | CA+bJJbyF9Pz6cTv5=qik5VL5yvCaEz+-6pEuAFrB-DQhkMu3yA@mail.gmail.com |
In response to | BUG #12785: server process (PID 2872) was terminated by exception 0xC0000005 (daniele.posenato@smartec.ch) |
Responses | Re: BUG #12785: server process (PID 2872) was terminated by exception 0xC0000005 |
List | pgsql-bugs |
Hi Daniele:

On Mon, Feb 23, 2015 at 6:09 PM, Daniele Posenato <daniele.posenato@smartec.ch> wrote:

> Thank you a lot for the answer, I really appreciate it. I will try to do
> what you have suggested and then I will let you know.

That's ok, but I doubt I can help you more (I abandoned Windows more than a dozen years ago and haven't looked back, although I still remember how that code appeared when I did something wrong in my programs).

> Just for information the problem has occurred again since the last email
> and always on the same query. I could understand a crash of the service on
> performing an update or a delete, but I have some difficulties to
> understand this on a select. If it was an hardware problem I would expect
> the service to crash also on other actions and not randomly (about once per
> week) only on a specific select (that is executed every 10 seconds).

Is that query consuming a lot of your resources? (It may be because it is lengthy or just because it is frequent.) In that case it makes sense. In many of my applications 99.9% of the work and RAM usage is selects, so a random crash is normally going to hit me in one of those.

On the crashing-on-select stuff: suppose you have a faulty sector or RAM location. When you write to it (update or delete) nothing happens, it just stores the bad value. When you read it (select, part 1: reading from disk/RAM) nothing happens either, you just get bad data, say a null pointer. Then when you use it (select, part 2) you get the fault. In fact, if a RAM location loses the data written to it, you do not notice on writing it, or on reading it (unless you get a parity error), but on using what you read from it.

This is a normal pattern with programming bugs too. You have an error in some code and it stores something in a random (or not so random) RAM location. That code seems to work ok.
But then an unrelated piece of code reads the corrupted data and crashes (this is one of the ways buffer overflows work: the guilty code overflows a buffer but keeps working, and another chunk of code gets its data overwritten and crashes).

> Is there a way to write a select that is able to crash the service?

With a good database, on good hardware, with adequate memory and disk (infinite, since you can crash any service by just joining enough copies of a table to exhaust what is available) there shouldn't be, but if you read corrupted data or get hit by a bit flip in the middle of processing, it may.

Are you able to do a full database dump (pg_dump, not base backup) of your database? If you are, then you are able to read all the tables, and I would suggest trying to reindex every table if you have quiescent periods (pg_dump does not touch indexes, so if you have good data but corrupted indexes that should fix it).

> I will let you know the results of the hardware check after the planned
> restart.

I do not know (or remember) what your DB sizes and uptime requirements are. But I've had that kind of problem caused by corrupted disk structures, and have been able to recover by rewriting the database, that is, dump, drop, restore. This depends on the system, so I cannot recommend doing it, but as I said before, if I had the same application on 4 machines and it crashed randomly on only one of them, I would triple-test that machine and dump / restore it.

Best regards.
Francisco Olarte.
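
As a rough illustration of the dump / reindex / restore sequence described above, the following Python sketch only builds the command lines in order and does not run anything. The function name `recovery_plan` and the database and file names are hypothetical. It also restores into a separate, newly created database rather than dropping the original, which is a more cautious variation and not what the message literally describes.

```python
from typing import List


def recovery_plan(dbname: str, dump_path: str, restored_db: str) -> List[List[str]]:
    """Return the commands, in order, as argv lists. Nothing is executed here."""
    return [
        # 1. Logical dump: pg_dump reads the table data (not the indexes),
        #    so unreadable table rows tend to surface at this step.
        ["pg_dump", "--format=custom", f"--file={dump_path}", dbname],
        # 2. Rebuild every index in the database from the table data.
        ["reindexdb", dbname],
        # 3. Restore the dump into a fresh database (assumption: the
        #    original is kept until the restored copy has been checked).
        ["createdb", restored_db],
        ["pg_restore", f"--dbname={restored_db}", dump_path],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for argv in recovery_plan("mydb", "mydb.dump", "mydb_restored"):
        print(" ".join(argv))
```

Run the steps one at a time and during a quiet period, since reindexing takes locks on the tables involved; whether the restore step is worth doing depends on database size and uptime requirements.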