Re: ZStandard (with dictionaries) compression support for TOAST compression - Mailing list pgsql-hackers
From: Nikhil Kumar Veldanda
Subject: Re: ZStandard (with dictionaries) compression support for TOAST compression
Date:
Msg-id: CAFAfj_FE-wHQtLBkOVab4MR=1GKErsLL7=HZbWBVnAxa=xncSA@mail.gmail.com
In response to: Re: ZStandard (with dictionaries) compression support for TOAST compression (Michael Paquier <michael@paquier.xyz>)
List: pgsql-hackers
Thanks, Michael, for the feedback.

On Fri, May 30, 2025 at 12:21 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> > During compression, compression methods (zstd_compress_datum) will
> > determine whether to use metadata (dictionary) or not based on
> > CompressionInfo.meta.
>
> Not sure about this one.

I removed the meta field and the CompressionInfo struct. I originally
used CompressionInfo to carry the compression method, zstd_level, and
zstd dict ID downstream, but since we are now using a default
compression level for zstd, it is no longer needed.

> > ALTER TABLE tblname
> >   ALTER COLUMN colname
> >   SET (zstd_level = 5);
>
> Specifying that as an attribute option makes sense here, but I don't
> think that this has to be linked to the initial patch set that should
> extend the toast data for the new compression method. It's a bit hard
> to say how relevant that is, and IMV it's kind of hard for users to
> know which level makes more sense. Setting up the wrong level can be
> equally very costly in CPU. For now, my suggestion would be to focus
> on the basics, and discard this part until we figure out the rest.

Ack. I have removed that option and will stick with ZSTD_CLEVEL_DEFAULT
as the compression level.

> +CompressionInfo
> +setup_cmp_info(char cmethod, Form_pg_attribute att)

Removed setup_cmp_info and its references.

> This routine declares a Form_pg_attribute as argument, does not use
> it. Due to that, it looks that attoptcache.h is pulled into
> toast_compression.c.

Removed it.

> Patch 0001 has the concept of metadata with various facilities, like
> VARATT_4BCE_HAS_META(), CompressionInfo, etc. However at the current
> stage we don't need that at all. Wouldn't it be better to delay this
> kind of abstraction layer to happen after we discuss how (and if) the
> dictionary part should be introduced rather than pay the cost of the
> facility in the first step of the implementation? This is not
> required as a first step. The toast zstd routines introduced in patch
> 0002 use !meta, discard meta=true as an error case.

Removed all metadata-related abstractions from patch 0001.

> +/* Helper: pack <flag, cmid> into a single byte: flag (b0), cmid-2
> (b1..7) */
>
> Having a one-liner here is far from enough? This is the kind of thing
> where we should spend time describing how things are done and why they
> are done this way. This is not sufficient, there's just too much to
> guess. The fact that we have VARATT_4BCE_EXTFLAG is only, but there's
> no real information about va_ecinfo and that it relates to the three
> bits sets, for example.

I have added a detailed comment explaining the one-byte layout.

> +#define VARTAG_SIZE(PTR) \
> [...]
> UNALIGNED_U32()
>
> This stuff feels magic. It's hard for someone to understand what's
> going on here, and there is no explanation about why it's done this
> way.

To clarify: we need to read a 32-bit value from an unaligned address
(specifically va_extinfo inside varatt_external) to determine the
toast_pointer size, by checking whether the top two bits equal 0b11,
which indicates an optional trailer. I wrote two versions of
READ_U32_UNALIGNED(ptr) that load the four bytes individually and
reassemble them according to little- or big-endian order:

    /*
     * Safely read a 32-bit unsigned integer from *any* address, even
     * when that address is *not* naturally aligned to 4 bytes. We do
     * the load one byte at a time and re-assemble the word in *host*
     * byte order. For LITTLE ENDIAN systems.
     */
    #define READ_U32_UNALIGNED(ptr) \
        (  (uint32) (((const uint8 *) (ptr))[0])         \
         | ((uint32) (((const uint8 *) (ptr))[1]) << 8)  \
         | ((uint32) (((const uint8 *) (ptr))[2]) << 16) \
         | ((uint32) (((const uint8 *) (ptr))[3]) << 24))

    /*
     * For BIG ENDIAN systems.
     */
    #define READ_U32_UNALIGNED(ptr) \
        (  (uint32) (((const uint8 *) (ptr))[3])         \
         | ((uint32) (((const uint8 *) (ptr))[2]) << 8)  \
         | ((uint32) (((const uint8 *) (ptr))[1]) << 16) \
         | ((uint32) (((const uint8 *) (ptr))[0]) << 24))

Alternatively, one could use:

    #define READ_U32_UNALIGNED(src) \
        ({ \
            uint32 _tmp; \
            memcpy(&_tmp, (src), sizeof(uint32)); \
            _tmp; \
        })

I chose the byte-by-byte version to avoid extra instructions in a hot
path.

> -toast_compress_datum(Datum value, char cmethod)
> +toast_compress_datum(Datum value, CompressionInfo cmp)
> [...]
> -    /* If the compression method is not valid, use the current default */
> -    if (!CompressionMethodIsValid(cmethod))
> -        cmethod = default_toast_compression;
>
> Removing the fallback to the default toast compression GUC if nothing
> is valid does not look right. There could be extensions that depend
> on that, and it's unclear what the benefits of setup_cmp_info() are,
> because it is not documented, so it's hard for one to understand how
> to use these changes.

I removed setup_cmp_info; all related code has been deleted.

> - result = (struct varlena *) palloc(TOAST_POINTER_SIZE);
> + result = (struct varlena *) palloc(TOAST_CMPID_EXTENDED(cm) ?
> TOAST_POINTER_EXT_SIZE : TOAST_POINTER_NOEXT_SIZE);
> [...]
> - memcpy(VARDATA_EXTERNAL(result), &toast_pointer,
> sizeof(toast_pointer));
> + memcpy(VARDATA_EXTERNAL(result), &toast_pointer,
> TOAST_CMPID_EXTENDED(cm) ? TOAST_POINTER_EXT_SIZE - VARHDRSZ_EXTERNAL
> : TOAST_POINTER_NOEXT_SIZE - VARHDRSZ_EXTERNAL);
>
> That looks, err... Hard to maintain to me. Okay, that's a
> calculation for the extended compression part, but perhaps this is a
> sign that we need to think harder about the surroundings of the
> toast_pointer to ease such calculations.

I simplified it by introducing a helper macro; now both the palloc call
and the memcpy length calculation simply use TOAST_POINTER_SIZE(cm) and
TOAST_POINTER_SIZE(cm) - VARHDRSZ_EXTERNAL, respectively:
    #define TOAST_POINTER_SIZE(cm) \
        (TOAST_CMPID_EXTENDED(cm) ? TOAST_POINTER_EXT_SIZE : TOAST_POINTER_NOEXT_SIZE)

> +    {
> +        {
> +            "zstd_level",
> +            "Set column's ZSTD compression level",
> +            RELOPT_KIND_ATTRIBUTE,
> +            ShareUpdateExclusiveLock
> +        },
> +        DEFAULT_ZSTD_LEVEL, MIN_ZSTD_LEVEL, MAX_ZSTD_LEVEL
> +    },
>
> This could be worth a patch on its own, once we get the basics sorted
> out. I'm not even sure that we absolutely need that, TBH. The last
> time I've had a discussion on the matter for WAL compression we
> discarded the argument about the level because it's hard to understand
> how to tune, and the default is enough to work magics. For WAL, we've
> been using ZSTD_CLEVEL_DEFAULT in xloginsert.c, and I've not actually
> heard much about people wanting to tune the compression level. That
> was a few years ago, perhaps there are some more different opinions on
> the matter.

Removed it.

> Your patch introduces a new compression_zstd, touching very lightly
> compression.sql. I think that we should and can do much better than
> that in the long term. The coverage of compression.sql is quite good,
> and what the zstd code is adding does not cover all of it. Let's
> rework the tests of HEAD and split compression.sql for the LZ4 and
> pglz parts. If one takes a diff between compression.out and
> compression_1.out, he/she would notice that the only differences are
> caused by the existence of the lz4 table. This is not the smartest
> move we can do if we add more compression methods, so I'd suggest the
> following:
> - Add a new SQL function called pg_toast_compression_available(text)
> or similar, able to return if a toast compression method is supported
> or not. This would need two arguments once the initial support for
> zstd is done: lz4 and zstd. For head, we only require one: lz4.
> - Now, the actual reason why a function returning a boolean result is
> useful is for the SQL tests. It is possible with \if to make the
> tests conditional if LZ4 is supported or not, limiting the noise if
> LZ4 is not supported. See for example the tricks we use for the UTF-8
> encoding or NUMA.
> - Move the tests related to lz4 into a separate file, outside
> compression.sql, in a new file called compression_lz4.sql. With the
> addition of zstd toast support, we would add a new file:
> compression_zstd.sql. The new zstd suite would then just need to
> copy-paste the original one, with few tweaks. It may be better to
> parameterize that but we don't do that anymore these days with
> input/output regression files.

Agreed. I introduced pg_compression_available(text) and refactored the
SQL tests accordingly: the LZ4 tests are split out into
compression_lz4.sql, and compression_zstd.sql is added with the
appropriate differences.

v25-0001-Add-pg_compression_available-and-split-sql-compr.patch
  - Introduce the pg_compression_available() function and split the
    compression SQL tests
v25-0002-Design-to-extend-the-varattrib_4b-varatt_externa.patch
  - Design proposal for varattrib_4b and varatt_external
v25-0003-Implement-Zstd-compression-no-dictionary-support.patch
  - zstd compression implementation (no dictionary support)

--
Nikhil Veldanda