Thread: Why O_SYNC is faster than fsync on ext3
I sent this to Bruce but forgot to cc pgsql-hackers. The patches are likely to go into 2.6.6. People interested in extremely safe fsync writes should also follow the IDE barrier thread and the "true fsync() in Linux on IDE" thread.

----- Forwarded message from Yusuf Goolamabbas <yusufg@outblaze.com> -----

Date: Sat, 20 Mar 2004 20:52:34 +0800
From: Yusuf Goolamabbas <yusufg@outblaze.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
Subject: Your fsync thread on hackers
Message-ID: <20040320125234.GA11221@outblaze.com>

Bruce, I haven't followed the thread completely. Accessing the web archive is slow from Hong Kong, but I just wanted to point you to this lkml post, which shows why O_SYNC is much faster than fsync (at least on ext3):

http://marc.theaimsgroup.com/?l=linux-kernel&m=107959907410443&w=2

There are some pending fsync speedups on XFS also. You might want to point Tom to this so he can get the Red Hat/Fedora guys to look at the patches.

Hope this helps,
Regards, Yusuf

----- End forwarded message -----
Yusuf Goolamabbas wrote:

> I sent this to Bruce but forgot to cc pgsql-hackers, The patches are
> likely to go into 2.6.6. People interested in extremely safe fsync
> writes should also follow the IDE barrier thread and the true fsync() in
> Linux on IDE thread

Actually, the most interesting part of the thread was the initial post from Peter Zaitsev about fcntl(fd, F_FULLSYNC, NULL): he wrote that this is necessary on Mac OS X to force a flush of the write caches in the disks. Unfortunately, I can't find anything about this flag with Google.

Another interesting point is that right now, IDE write caches must be disabled for reliable fsync operations with Linux. Recent SUSE kernels contain partial support. If the existing patches are completed and merged, it will be safe to enable write caching.

Perhaps Bruce's cache flush test could be modified slightly to check that the OS isn't lying about fsync: if fsync is faster than the rotational delay of the disks, then the setup is not suitable for Postgres. This could be recommended as a setup test in the install document.

--
Manfred
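The check Manfred proposes could be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not Bruce's actual test program: the function names, the 7200 RPM default, and the sample count are all hypothetical. The reasoning is that rewriting the same block and syncing after every write should, on a rotating disk that really writes through, cost about one platter rotation per iteration, so a much smaller mean hints that a write cache is absorbing the syncs. On Mac OS X the constant appears in the system headers as F_FULLFSYNC; the sketch uses it only when Python's fcntl module exposes it.

```python
import fcntl
import os
import tempfile
import time


def rotation_time(rpm):
    """Seconds for one full platter rotation at the given RPM."""
    return 60.0 / rpm


def flush(fd):
    """Ask the OS to push the file to stable storage.

    Uses F_FULLFSYNC where the platform exposes it (Mac OS X),
    otherwise falls back to plain fsync().
    """
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return
        except OSError:
            pass  # not supported on this filesystem; fall back to fsync
    os.fsync(fd)


def mean_sync_time(path, samples=20, block=b"x" * 8192):
    """Mean seconds per (rewrite the same block, then flush) iteration."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(samples):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, block)
            flush(fd)
        return (time.perf_counter() - start) / samples
    finally:
        os.close(fd)


def fsync_looks_suspicious(mean_seconds, rpm=7200):
    """True if a sync completes faster than one rotation (assumed RPM)."""
    return mean_seconds < rotation_time(rpm)


if __name__ == "__main__":
    # Assumption: the probe file must live on the disk under test.
    # The system temp directory is only a placeholder and may be RAM-backed.
    fd, probe = tempfile.mkstemp(dir=tempfile.gettempdir())
    os.close(fd)
    try:
        mean = mean_sync_time(probe)
    finally:
        os.remove(probe)
    print(f"mean write+sync: {mean * 1000:.3f} ms")
    if fsync_looks_suspicious(mean):
        print("faster than one rotation at the assumed RPM; "
              "a write cache may be absorbing the syncs")
```

The threshold only makes sense for rotating disks with a known spindle speed; on battery-backed controllers or solid-state storage a fast result does not by itself mean fsync is lying.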
Yusuf Goolamabbas <yusufg@outblaze.com> writes:

> Bruce, haven't followed the thread completely. Accessing the web archive
> is slow from Hong Kong but I just wanted to point you to this lkml post
> which shows why O_SYNC is much faster than fsync (at least on ext3)
> http://marc.theaimsgroup.com/?l=linux-kernel&m=107959907410443&w=2

That patch is broken on its face. If O_SYNC doesn't take longer than O_DSYNC, and likewise fsync longer than fdatasync, then the Unix filesystem semantics are not being honored, because the file mod time isn't being updated.

regards, tom lane
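One way to look at Tom's argument empirically is to time the four sync methods side by side on the same filesystem. The sketch below is only an illustration in Python: the function names and sample count are hypothetical, and it simply skips any method the platform does not expose. It rewrites the same block each time, so the file size never changes and the only metadata left to flush is the modification time, which is the part fsync and O_SYNC must write and fdatasync and O_DSYNC may skip.

```python
import os
import tempfile
import time


def sync_modes():
    """Map a method name to (extra open flags, sync function or None)."""
    modes = {"fsync": (0, os.fsync)}
    if hasattr(os, "fdatasync"):
        modes["fdatasync"] = (0, os.fdatasync)
    if hasattr(os, "O_SYNC"):
        modes["O_SYNC"] = (os.O_SYNC, None)
    if hasattr(os, "O_DSYNC"):
        modes["O_DSYNC"] = (os.O_DSYNC, None)
    return modes


def mean_write_time(path, open_flags, sync, samples=20, block=b"x" * 8192):
    """Mean seconds per rewrite of the same block using one sync method."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | open_flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(samples):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, block)
            if sync is not None:
                sync(fd)  # explicit sync; the O_* modes sync inside write()
        return (time.perf_counter() - start) / samples
    finally:
        os.close(fd)


def compare(directory, samples=20):
    """Time every available method on a fresh file in `directory`."""
    results = {}
    for name, (flags, sync) in sync_modes().items():
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        try:
            results[name] = mean_write_time(path, flags, sync, samples)
        finally:
            os.remove(path)
    return results


if __name__ == "__main__":
    # Assumption: replace the temp directory with one on the disk under test.
    for name, seconds in sorted(compare(tempfile.gettempdir()).items()):
        print(f"{name:10s} {seconds * 1000:8.3f} ms per write")
```

If the data-only variants come out no faster than the full ones, either the filesystem treats them identically or some other cost dominates; the numbers alone cannot tell which, so they are a starting point for reading the patch rather than a verdict on it.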