Re: beta testing version - Mailing list pgsql-hackers
From | Ian Lance Taylor |
---|---|
Subject | Re: beta testing version |
Date | |
Msg-id | 20001201083057.19944.qmail@daffy.airs.com |
In response to | Re: beta testing version (Alex Pilosov <alex@pilosoft.com>) |
Responses | Re: beta testing version |
List | pgsql-hackers |
   Date: Fri, 1 Dec 2000 01:54:23 -0500 (EST)
   From: Alex Pilosov <alex@pilosoft.com>

   On Thu, 30 Nov 2000, Nathan Myers wrote:

   > After a power outage on an active database, you may have corruption
   > at low levels of the system, and unless you have enormous redundancy
   > (and actually use it to verify everything) the corruption may go
   > undetected and result in (subtly) wrong answers at any future time.

   Nathan, why are you so hostile against postgres? Is there an ax to grind?

I don't think he is being hostile (I work with him, so I know that he is generally pro-postgres).

   The conditions under which WAL will completely recover your database:

   1) OS guarantees complete ordering of fsync()'d writes. (i.e. having two blocks A and B, A is fsync'd before B, it could NOT happen that B is on disk but A is not).
   2) on boot recovery, OS must not corrupt anything that was fsync'd.

   Rule 1) is met by all unixish OSes in existence. Rule 2 is met by some filesystems, such as reiserfs, tux2, and softupdates.

I think you are missing his main point, which he stated before, which is that modern disk hardware is both smarter and stupider than most people realize. Some disks cleverly accept writes into a RAM cache, and return a completion signal as soon as they have done that. They then feel free to reorder the writes to magnetic media as they see fit. This significantly helps performance. However, it means that all bets are off on a sudden power loss.

Your rule 1 is met at the OS level, but it is not met at the physical drive level. The fact that the OS guarantees ordering of fsync()'d writes means little since the drive is capable of reordering writes behind the back of the OS.

At least with IDE, it is possible to tell the drive to disable this sort of caching and reordering. However, GNU/Linux, at least, does not do this. After all, doing it would hurt performance, and would move us back to the old days when operating systems had to care a great deal about disk geometry.

I expect that careful attention to the physical disks you purchase can help you avoid these problems. For example, I would hope that EMC disk systems handle power loss gracefully. But if you buy ordinary off-the-shelf PC hardware, you really do need to arrange for a UPS, and some sort of automatic shutdown if the UPS is running low. Otherwise, although the odds are certainly with you, there is no 100% guarantee that a busy database will survive a sudden power outage.

Ian
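The ordering argument in the message above can be made concrete with a small toy model. This is only a sketch under stated assumptions, not a description of any real drive or kernel: `ToyDrive`, `crash_after`, `b_without_a`, and the block numbers are all hypothetical names invented for the illustration, and the "flush lowest block number first" policy is just one stand-in for a drive reordering writes "as it sees fit". The OS issues write A (say, a log record) and, only once that is acknowledged, write B (a data page). The model then checks whether a power loss can leave B on the platter without A.

```python
class ToyDrive:
    """Toy model of a disk with an optional volatile write cache.

    Hypothetical illustration only; not a real driver or OS API.
    """

    def __init__(self, write_cache):
        self.write_cache = write_cache
        self.cache = []    # acknowledged writes still held in drive RAM
        self.platter = {}  # block number -> data that survives power loss

    def write(self, block, data):
        """Accept a write and acknowledge it to the OS immediately."""
        if self.write_cache:
            self.cache.append((block, data))
        else:
            self.platter[block] = data

    def background_flush(self, count):
        """Commit up to `count` cached writes in ascending block order.

        Assumption: the drive prefers low block numbers to save seeks.
        """
        self.cache.sort(key=lambda w: w[0])
        for block, data in self.cache[:count]:
            self.platter[block] = data
        del self.cache[:count]

    def power_loss(self):
        """Drop everything still in the cache; return what is on the platter."""
        self.cache.clear()
        return dict(self.platter)


LOG_BLOCK, DATA_BLOCK = 900, 10  # arbitrary block numbers for A and B


def crash_after(write_cache, flushes):
    """Issue A, then B, let the drive do `flushes` physical writes, then cut power."""
    drive = ToyDrive(write_cache)
    drive.write(LOG_BLOCK, "A")   # the OS sees A acknowledged first
    drive.write(DATA_BLOCK, "B")  # and only then issues B
    drive.background_flush(flushes)
    return drive.power_loss()


def b_without_a(platter):
    """True if B reached the platter but A did not (rule 1 violated)."""
    return DATA_BLOCK in platter and LOG_BLOCK not in platter


print(b_without_a(crash_after(write_cache=True, flushes=1)))
print(b_without_a(crash_after(write_cache=False, flushes=1)))
```

In this model, with the write cache enabled, the drive commits B before A because B has the lower block number, so a power loss after one physical write leaves B without A even though the OS issued them in the right order. With the cache disabled, every acknowledged write is already on the platter, so that state cannot arise.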