Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers
From: Heikki Linnakangas
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date:
Msg-id: 5416C617.3030405@vmware.com
In response to: Re: [REVIEW] Re: Compression of full-page-writes (Arthur Silva <arthurprs@gmail.com>)
List: pgsql-hackers
On 09/15/2014 02:42 AM, Arthur Silva wrote:
> On 14/09/2014 12:21, "Andres Freund" <andres@2ndquadrant.com> wrote:
>>
>> On 2014-09-13 20:27:51 -0500, ktm@rice.edu wrote:
>>>>
>>>> What we are looking for here is uniqueness, and thus better error
>>>> detection. Not the avalanche effect, nor cryptographic security, nor
>>>> bit distribution. As far as I'm aware CRC32C is unbeaten collision-wise
>>>> and time proven.
>>>>
>>>> I couldn't find tests with xxhash and crc32 on the same hardware, so I
>>>> spent some time putting together a benchmark (see attachment; to run it
>>>> just start run.sh).
>>>>
>>>> I included a crc32 implementation using sse4.2 instructions (which works
>>>> on pretty much any Intel processor built after 2008 and AMD built after
>>>> 2012), a portable Slice-By-8 software implementation, and xxhash, since
>>>> it's the fastest software 32-bit hash I know of.
>>>>
>>>> Here are the results running the test program on my i5-4200M:
>>>>
>>>> crc sb8: 90444623
>>>> elapsed: 0.513688s
>>>> speed: 1.485220 GB/s
>>>>
>>>> crc hw: 90444623
>>>> elapsed: 0.048327s
>>>> speed: 15.786877 GB/s
>>>>
>>>> xxhash: 7f4a8d5
>>>> elapsed: 0.182100s
>>>> speed: 4.189663 GB/s
>>>>
>>>> The hardware version is insanely fast and works on the majority of
>>>> Postgres setups, and the fallback software implementation is 2.8x slower
>>>> than the fastest software 32-bit hash around.
>>>>
>>>> Hopefully it'll be useful in the discussion.
>>
>> Note that all these numbers aren't fully relevant to the use case here.
>> For the WAL - which is what we're talking about and the only place where
>> CRC32 is used with high throughput - the individual parts of a record are
>> pretty darn small on average. So the performance of checksumming small
>> amounts of data is more relevant. Mind, that's not likely to go in
>> CRC32's favor, especially not slice-by-8: the cache footprint of the
>> large tables is likely going to be noticeable in non-micro benchmarks.
>
> Indeed, the small input sizes are something I was missing. Something more
> cache friendly would be better; it's just a matter of finding a better
> candidate.

It's worth noting that the extra tables that slicing-by-4 requires are *in
addition to* the lookup table we already have, and slicing-by-8 builds on
the slicing-by-4 lookup tables. Our current algorithm uses a 1kB lookup
table, slicing-by-4 a 4kB one, and slicing-by-8 an 8kB one. But the first
1kB of the slicing-by-4 lookup table is identical to the current 1kB lookup
table, and the first 4kB of the slicing-by-8 tables are identical to the
slicing-by-4 tables.

It would be pretty straightforward to use the current algorithm when the
WAL record is very small, and slicing-by-4 or slicing-by-8 for larger
records (like FPWs), where the larger tables are more likely to pay off. I
have no idea where the break-even point is between the current algorithm
and slicing-by-4 with a cold cache, but maybe we can get a handle on that
with some micro-benchmarking. Although this is complicated by the fact that
slicing-by-4 or -8 might well be a win even with very small records, if you
generate a lot of them.

- Heikki
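To make the length-based dispatch concrete, below is a minimal C sketch. It
is not the actual pg_crc code: the function names, the 64-byte threshold,
the crc_table layout, and the init helper are invented for illustration,
and the slicing-by-8 loop assumes a little-endian machine and a reflected
polynomial (e.g. 0xEDB88320 for the traditional CRC-32, or 0x82F63B78 for
CRC-32C). The point it tries to show is that the bytewise path touches only
the existing 1kB table (crc_table[0]), while the slicing-by-8 path reuses
that same table plus the extra 7kB.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* crc_table[0] is the existing 1kB table; [1..7] are the extra 7kB. */
static uint32_t crc_table[8][256];

#define CRC_SLICING_THRESHOLD 64	/* hypothetical break-even point */

/* Build the shared tables: table[k][i] = CRC of byte i followed by k zero bytes. */
static void
crc_init_tables(uint32_t poly)
{
	for (int i = 0; i < 256; i++)
	{
		uint32_t	c = (uint32_t) i;

		for (int j = 0; j < 8; j++)
			c = (c & 1) ? (c >> 1) ^ poly : c >> 1;
		crc_table[0][i] = c;
	}
	for (int i = 0; i < 256; i++)
		for (int k = 1; k < 8; k++)
			crc_table[k][i] = (crc_table[k - 1][i] >> 8) ^
				crc_table[0][crc_table[k - 1][i] & 0xFF];
}

/* Current algorithm: one byte per iteration, only crc_table[0] is used. */
static uint32_t
crc32_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len-- > 0)
		crc = crc_table[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc;
}

/* Slicing-by-8: eight bytes per iteration, all eight tables are used. */
static uint32_t
crc32_slice8(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len >= 8)
	{
		uint32_t	lo;
		uint32_t	hi;

		memcpy(&lo, p, 4);		/* little-endian load assumed */
		memcpy(&hi, p + 4, 4);
		lo ^= crc;
		crc = crc_table[7][lo & 0xFF] ^
			crc_table[6][(lo >> 8) & 0xFF] ^
			crc_table[5][(lo >> 16) & 0xFF] ^
			crc_table[4][lo >> 24] ^
			crc_table[3][hi & 0xFF] ^
			crc_table[2][(hi >> 8) & 0xFF] ^
			crc_table[1][(hi >> 16) & 0xFF] ^
			crc_table[0][hi >> 24];
		p += 8;
		len -= 8;
	}
	return crc32_bytewise(crc, p, len);		/* remaining tail bytes */
}

/* Pick the variant based on how much data this piece of the record has. */
static uint32_t
crc32_step(uint32_t crc, const void *data, size_t len)
{
	if (len < CRC_SLICING_THRESHOLD)
		return crc32_bytewise(crc, data, len);
	return crc32_slice8(crc, data, len);
}

Where exactly to put the threshold is the micro-benchmarking question above,
and since the first 1kB is shared, the small-record path also keeps the start
of the slicing tables warm for the next large record.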