Re: Background fsck - Mailing list pgsql-performance

From Ireneusz Pluta
Subject Re: Background fsck
Date
Msg-id 4D9ED6BD.7090804@wp.pl
In response to Re: Background fsck  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-performance
Greg Smith wrote:
> The soft update code used in FreeBSD makes sure that there's no damage to the filesystem that
> PostgreSQL can't recover from.  Once the WAL is replayed after a crash, the database is
> consistent.  The main purpose of the background fsck is to find "orphaned" space, things that the
> filesystem incorrectly remembers the state of in regards to whether it was allocated and used.  In
> theory, there's no reason that can't happen in the background, concurrent with normal database
> activity.
>
> In practice, background fsck is such an infrequently used piece of code that it's developed a bit
> of a reputation for being buggier than average.  It's really hard to test it, filesystem code is
> complicated, and the sort of inconsistent data you get after a hard crash is often really
> surprising.  I wouldn't be too concerned about the database integrity, but there is a small risk
> that background fsck will run into something unexpected and panic.  And that's a problem you're
> much less likely to hit using the more stable regular fsck code; thus the recommendations by some
> to avoid it.
>

Thank you all for your responses.

Greg, given your opinion and the few reported issues I found on the net, I think I had better keep
background fsck disabled.

What I was primarily concerned about was the long wait in front of the console, watching the slow
fsck messages and nervously confirming that the disk LEDs are still blinking. It's even harder with
a remote KVM, where the LEDs are not visible. But my personal comfort is not a priority anyway, so
I will let foreground fsck do its job for as long as it needs.

As I said in my other response, the problem originally comes from the machine hanging and having to
be manually power cycled. There is already significant downtime before the power cycle has a chance
to happen, so another forty minutes of fsck does not matter much from the point of view of
service availability.

fsck runtime could be shortened if I used a lower inode density for the filesystem. I think
that makes sense for a filesystem fully dedicated to a postgres data cluster, especially if it
holds relatively few but large tables, which is mostly my case.

The system in question has:

df -hi | grep -E 'base|ifree'
Filesystem     Size    Used   Avail Capacity iused     ifree %iused  Mounted on
/dev/da1p3     3.0T    1.7T    1.0T    63%    485k      392M    0%   /pg/base
(will I ever have even tens of millions of tables?)

I reserved fewer inodes on a newer, bigger system:
Filesystem            Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/mfid0p8           12T    4.8T    6.0T    45%    217k   49M    0%   /pg/base

or even fewer on a yet newer one:
Filesystem            Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/mfid0p1           12T    3.6T    7.4T    33%    202k  3.4M    6%   /pg/base
(oops, maybe too aggressive here?)
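
For a rough comparison, the three df listings above can be turned into an approximate
bytes-per-inode figure, which is the kind of value newfs takes with its -i option. This is only a
sketch: it assumes df -h prints sizes in binary units and inode counts in decimal units, and it
uses the rounded numbers shown above, so the results are estimates. The function names are made up
for this example.

```python
# Approximate bytes-per-inode from the rounded `df -hi` figures quoted above.
# Assumptions (not confirmed by the thread): sizes use binary suffixes,
# inode counts use decimal suffixes.

SIZE_UNITS = {"k": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}
COUNT_UNITS = {"k": 10**3, "M": 10**6, "G": 10**9}


def parse_human(value, units):
    """Convert a string such as '3.0T' or '485k' to a number."""
    suffix = value[-1]
    if suffix in units:
        return float(value[:-1]) * units[suffix]
    return float(value)


def bytes_per_inode(size, iused, ifree):
    """Filesystem size divided by the total number of inodes (used + free)."""
    total_inodes = parse_human(iused, COUNT_UNITS) + parse_human(ifree, COUNT_UNITS)
    return parse_human(size, SIZE_UNITS) / total_inodes


# (Size, iused, ifree) exactly as printed in the three listings.
systems = {
    "/dev/da1p3": ("3.0T", "485k", "392M"),
    "/dev/mfid0p8": ("12T", "217k", "49M"),
    "/dev/mfid0p1": ("12T", "202k", "3.4M"),
}

for device, (size, iused, ifree) in systems.items():
    density = bytes_per_inode(size, iused, ifree)
    print(f"{device}: about {density / 1024:.0f} KiB per inode")
```

Under those assumptions the three filesystems come out at roughly 8 KiB, 260 KiB and 3.5 MiB per
inode respectively.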

When I forced a power drop on these two other systems to check how they survive, the fsck duration
on them was substantially shorter.

On the subject of inode density, let me ask one more question. Does tuning it this way have
any other significant impact, good or bad, on system performance?


Irek.


