Re: Compression of full-page-writes - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Compression of full-page-writes |
Date | |
Msg-id | CA+TgmoZWTg7LY7B34SMMqNszR69nQCy3_uktyh2_tnwf7FmG-g@mail.gmail.com |
In response to | Re: Compression of full-page-writes ("ktm@rice.edu" <ktm@rice.edu>) |
Responses |
Re: Compression of full-page-writes
|
List | pgsql-hackers |
On Thu, Oct 24, 2013 at 11:40 AM, ktm@rice.edu <ktm@rice.edu> wrote:
> On Thu, Oct 24, 2013 at 11:07:38AM -0400, Robert Haas wrote:
>> On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> > So, our consensus is to introduce the hooks for FPW compression so that
>> > users can freely select their own best compression algorithm?
>> > Also, probably we need to implement at least one compression contrib module
>> > using that hook, maybe it's based on pglz or snappy.
>>
>> I don't favor making this pluggable. I think we should pick snappy or
>> lz4 (or something else), put it in the tree, and use it.
>>
> Hi,
>
> My vote would be for lz4 since it has faster single thread compression
> and decompression speeds with the decompression speed being almost 2X
> snappy's decompression speed. Both are BSD licensed so that is not
> an issue. The base code for lz4 is c and it is c++ for snappy. There
> is also a HC (high-compression) variant for lz4 that pushes its compression
> rate to about the same as zlib (-1) which uses the same decompressor which
> can provide data even faster due to better compression. Some more real
> world tests would be useful, which is really where being pluggable would
> help.

Well, it's probably a good idea for us to test, during the development cycle, which algorithm works better for WAL compression, and then use that one. Once we make that decision, I don't see that there are many circumstances in which a user would care to override it. Now if we find that there ARE reasons for users to prefer different algorithms in different situations, that would be a good reason to make it configurable (or even pluggable). But if we find that no such reasons exist, then we're better off not burdening users with the need to configure a setting that has only one sensible value.
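The kind of development-cycle comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions: snappy and lz4 are not in Python's standard library, so `zlib`, `bz2`, and `lzma` stand in for the real candidates, and `make_page` builds a hypothetical synthetic 8 kB page rather than a real full-page image taken from WAL.

```python
import bz2
import lzma
import time
import zlib

PAGE_SIZE = 8192  # PostgreSQL's default block size


def make_page() -> bytes:
    """Build a synthetic page: repetitive rows followed by zero padding.

    Assumption: this is a stand-in, not a real full-page image.
    """
    rows = b"".join(b"id=%06d|status=active|" % i for i in range(200))
    return rows.ljust(PAGE_SIZE, b"\x00")[:PAGE_SIZE]


# Stand-in codecs; each entry is (compress, decompress).
CODECS = {
    "zlib-1": (lambda d: zlib.compress(d, 1), zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compare(page: bytes, rounds: int = 50) -> dict:
    """Return {codec name: (compressed size, mean seconds per compression)}."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        for _ in range(rounds):
            packed = compress(page)
        elapsed = time.perf_counter() - start
        # A codec is only a candidate if it round-trips the page exactly.
        assert decompress(packed) == page
        results[name] = (len(packed), elapsed / rounds)
    return results


if __name__ == "__main__":
    for name, (size, secs) in compare(make_page()).items():
        print(f"{name:8s} {size:5d} bytes  {secs * 1e6:8.1f} us/page")
```

A meaningful test would replace the synthetic page with page images from representative workloads and would also time decompression, since that cost is paid during recovery.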
It seems fairly clear from previous discussions on this mailing list that snappy and lz4 are the top contenders for the position of "compression algorithm favored by PostgreSQL". I am wondering, though, whether it wouldn't be better to add support for both - say we added both to libpgcommon, and perhaps we could consider moving pglz there as well. That would allow easy access to all of those algorithms from both frontend and backend code. If we can make the APIs parallel, it should be very simple to modify any code we add now to use a different algorithm than the one initially chosen if in the future we add algorithms to or remove algorithms from the list, or if one algorithm is shown to outperform another in some particular context. I think we'll do well to isolate the question of adding support for these algorithms from the current patch or any other particular patch that may be on the table, and FWIW, I think having two leading contenders and adding support for both may have a variety of advantages over crowning a single victor.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
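To illustrate what "parallel APIs" could buy, here is a minimal sketch in Python rather than the C that libpgcommon would actually use. Everything in it is hypothetical: the `Codec` record, the `REGISTRY` table, and the `pack`/`unpack` helpers are invented for illustration, `zlib` and `bz2` stand in for snappy and lz4, and the one-byte algorithm tag is an assumption of this sketch, not something proposed in the thread.

```python
import bz2
import zlib
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Codec:
    """One algorithm exposed through the same two-function interface."""
    codec_id: int
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]


# Hypothetical registry; adding or removing an algorithm is one entry here.
REGISTRY: Dict[str, Codec] = {
    "zlib": Codec(1, zlib.compress, zlib.decompress),
    "bz2": Codec(2, bz2.compress, bz2.decompress),
}
BY_ID = {codec.codec_id: codec for codec in REGISTRY.values()}

# The single place to change if a different algorithm is chosen later.
DEFAULT_CODEC = "zlib"


def pack(data: bytes, codec_name: str = DEFAULT_CODEC) -> bytes:
    """Compress data and prefix it with a one-byte algorithm tag."""
    codec = REGISTRY[codec_name]
    return bytes([codec.codec_id]) + codec.compress(data)


def unpack(blob: bytes) -> bytes:
    """Read the tag byte and dispatch to the matching decompressor."""
    codec = BY_ID[blob[0]]
    return codec.decompress(blob[1:])


if __name__ == "__main__":
    image = b"full page image " * 512
    for name in REGISTRY:
        assert unpack(pack(image, name)) == image
        print(name, len(pack(image, name)), "bytes")
```

Because callers only ever touch `pack` and `unpack`, switching the initially chosen algorithm means changing `DEFAULT_CODEC`, and data written under the old choice stays readable through its tag.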