Re: Zedstore - compressed in-core columnar storage - Mailing list pgsql-hackers
From:           Tomas Vondra
Subject:        Re: Zedstore - compressed in-core columnar storage
Msg-id:         9e1077bc-c5c1-425b-07ad-424d3522363f@enterprisedb.com
In response to: Re: Zedstore - compressed in-core columnar storage
                (Soumyadeep Chakraborty <soumyadeep2007@gmail.com>)
Responses:      Re: Zedstore - compressed in-core columnar storage
                Re: Zedstore - compressed in-core columnar storage
List:           pgsql-hackers
Hi,

Thanks for the updated patch. It's quite a massive amount of code - I don't think we've had many 2MB patches in the past, so this is by no means a full review.

1) The psql_1.out is missing a bit of expected output (due to 098fb0079).

2) I'm getting crashes in the intarray contrib module, due to hitting this error in lwlock.c (backtrace attached):

    /* Ensure we will have room to remember the lock */
    if (num_held_lwlocks >= MAX_SIMUL_LWLOCKS)
        elog(ERROR, "too many LWLocks taken");

I haven't investigated this too much, but it's a regular build with asserts and TAP tests, so it should be simple to reproduce using "make check-world", I guess.

3) I did a very simple benchmark, loading TPC-H data (75GB) followed by pg_dump, and the durations (in seconds) look like this:

              master    zedstore/pglz    zedstore/lz4
    -------------------------------------------------
    copy        1855            68092            2131
    dump         751              905             811

And the size of the lineitem table (as shown by \d+) is:

    master:         64GB
    zedstore/pglz:  51GB
    zedstore/lz4:   20GB

It's mostly expected that lz4 beats pglz in both performance and compression ratio, but this seems a bit too extreme, I guess. Per past benchmarks (e.g. [1] and [2]) the difference in compression/decompression time should be maybe 1-2x or something like that, not 35x like here.

[1] https://www.postgresql.org/message-id/20130621000900.GA12425%40alap2.anarazel.de
[2] https://www.postgresql.org/message-id/20130605150144.GD28067%40alap2.anarazel.de

Furthermore, the pglz compression is not what consumes most of the CPU, at least that's what perf says:

    24.82%  postgres      [.] encode_chunk_varlen
    20.49%  postgres      [.] decode_chunk
    13.01%  postgres      [.] merge_attstream_guts.isra.0
    12.68%  libc-2.32.so  [.] __memmove_avx_unaligned_erms
     8.72%  postgres      [.] encode_chunk_fixed
     6.16%  postgres      [.] pglz_compress
     4.36%  postgres      [.] decode_attstream_cont
     2.27%  postgres      [.] 0x00000000000baff0
     1.84%  postgres      [.] AllocSetAlloc
     0.79%  postgres      [.] append_attstream
     0.70%  postgres      [.] palloc

So I wonder if this is a sign of a deeper issue - maybe the lower compression ratio (for pglz) triggers some sort of feedback loop in zedstore, or something like that? Not sure, but this seems strange.

4) I looked at some of the code, like merge_attstream etc., and I wonder if this might be related to some of the FIXME comments. For example this bit in merge_attstream seems interesting:

    * FIXME: we don't actually pay attention to the compression anymore.
    * We never repack.
    * FIXME: this is backwords, the normal fast path is if (firsttid1 > lasttid2)

But I suppose that should affect both pglz and lz4, and I'm not sure how up to date those comments actually are. BTW the comments in general need updating and tidying up, to make reviews easier. For example the merge_attstream comment references attstream1 and attstream2, but those are not the current parameters of the function.

5) IMHO there should be a #define specifying the maximum number of items per chunk (60). Currently there are literal constants used in various places, sometimes 60, sometimes 59 etc., which makes the code harder to understand. FWIW 60 seems a bit low, but maybe it's OK.
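Something like the following is what I have in mind - just a sketch, mind you, the constant name and the helper functions are made up for illustration, not taken from the patch:

    #include <stdbool.h>

    /* Hypothetical name - one constant instead of scattered 60/59 literals. */
    #define ZS_MAX_ITEMS_PER_CHUNK  60

    /* replaces the literal "nitems >= 60" checks */
    static bool
    chunk_is_full(int nitems)
    {
        return nitems >= ZS_MAX_ITEMS_PER_CHUNK;
    }

    /* replaces the "nitems <= 59" variants */
    static bool
    chunk_has_room(int nitems)
    {
        return nitems <= ZS_MAX_ITEMS_PER_CHUNK - 1;
    }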
6) I do think ZSAttStream should track which compression is used by the stream, for two main reasons.

Firstly, there's another patch to support "custom compression" methods, which (also) allows multiple compression methods per column. It'd be a bit strange to support that for varlena columns in heap tables, and not here, I guess.

Secondly, I think one of the interesting columnstore features down the road will be execution on compressed data, which however requires a compression method designed for that purpose, and that's often datatype-specific (delta encoding, ...). I don't think we need to go as far as supporting "custom" compression methods here, but I think we should allow different built-in compression methods for different attstreams.
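To sketch what I mean (the enum and the struct layout below are invented for illustration - this is not the actual ZSAttStream definition from the patch):

    #include "postgres.h"   /* uint16, uint32, FLEXIBLE_ARRAY_MEMBER */

    typedef enum ZSCompressionMethod
    {
        ZS_COMPRESSION_NONE = 0,
        ZS_COMPRESSION_PGLZ,
        ZS_COMPRESSION_LZ4
    } ZSCompressionMethod;

    typedef struct ZSAttStream
    {
        uint16      t_compression;       /* ZSCompressionMethod of the stream */
        uint32      t_size;              /* physical (compressed) size */
        uint32      t_decompressed_size; /* size after decompression */
        char        t_payload[FLEXIBLE_ARRAY_MEMBER];
    } ZSAttStream;

That way the compression method becomes a per-stream property, which is what allowing different built-in methods for different attstreams requires.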
regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment