Re: [PING] fallocate() causes btrfs to never compress postgresql files - Mailing list pgsql-hackers

From Dimitrios Apostolou
Subject Re: [PING] fallocate() causes btrfs to never compress postgresql files
Msg-id aeba99d6-24a1-92af-380d-926d41b1acc0@gmx.net
In response to [PING] fallocate() causes btrfs to never compress postgresql files  (Dimitrios Apostolou <jimis@gmx.net>)
Responses Re: [PING] fallocate() causes btrfs to never compress postgresql files
List pgsql-hackers
On Sun, 1 Jun 2025, Thomas Munro wrote:

> Or for a completely different approach: I wonder if ftruncate() would
> be more efficient on COW systems anyway.  The minimum thing we need is
> for the file system to remember the new size, 'cause, erm, we don't.
> All the rest is probably a waste of cycles, since they reserve real
> space (or fail to) later in the checkpointer or whatever process
> eventually writes the data out.

FWIW I asked the btrfs devs. From
https://github.com/kdave/btrfs-progs/pull/976
I quote Qu Wenruo:

> Only for falloc(), not ftruncate().
>
> The PREALLOC inode flag is added for any preallocated file extent,
> meanwhile truncate only creates holes.
>
> truncate is fast but it's really different from fallocate by there is
> nothing really allocated.
>
> This means the later writes will need to allocate their own data
> extents. This is fine and even preferred for btrfs, but may lead to
> performance drop for more traditional fses.
>
> We're in an era that fs features are not longer that generic, fallocate
> is just one example, in fact fallocate will cause more problems more
> than no compression.
>
> It's really a deep rabbit hole, and is not something simple true or
> false questions.


In other words, btrfs does not allocate anything on ftruncate(); it just
marks the new range as a "hole". As a result, the file is never flagged as
"PREALLOC", which is what disables compression. Of course, since no space
is reserved, there is no guarantee that later writes will succeed, and as
quoted above, other (non-COW) filesystems might be slower when writing
into the range that ftruncate() added.
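
To make the difference concrete, here is a minimal C sketch of the two ways
of extending a file. It is only an illustration under stated assumptions:
the helper names (open_scratch_file, extend_with_ftruncate,
extend_with_fallocate, file_usage) and the /tmp scratch path are
hypothetical and are not taken from the PostgreSQL source. Comparing
st_blocks after each call shows whether the filesystem actually reserved
space.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo helper: create an anonymous scratch file in /tmp.
 * Returns a file descriptor, or -1 on failure. */
static int
open_scratch_file(void)
{
    char path[] = "/tmp/extend_demo_XXXXXX";
    int  fd = mkstemp(path);

    if (fd >= 0)
        unlink(path);           /* file disappears once fd is closed */
    return fd;
}

/* Extend the file by only recording the new length. No space is
 * reserved; the added range is a hole. Returns 0 or an errno value. */
static int
extend_with_ftruncate(int fd, off_t new_size)
{
    if (ftruncate(fd, new_size) != 0)
        return errno;
    return 0;
}

/* Extend the file by asking the filesystem to reserve space up front.
 * posix_fallocate() returns the error number directly and does not
 * set errno. */
static int
extend_with_fallocate(int fd, off_t new_size)
{
    return posix_fallocate(fd, 0, new_size);
}

/* Report the logical size and the number of 512-byte blocks allocated.
 * Returns 0 or an errno value. */
static int
file_usage(int fd, off_t *size, long long *blocks)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return errno;
    *size = st.st_size;
    *blocks = (long long) st.st_blocks;
    return 0;
}
```

Calling file_usage() after extend_with_ftruncate() should report the new
size with few or no blocks in use on filesystems that support sparse
files, while after extend_with_fallocate() the block count normally
reflects the reserved range. The exact numbers depend on the filesystem.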


Regards,
Dimitris



