Re: Variable length varlena headers redux - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Variable length varlena headers redux |
Date | |
Msg-id | 45D1CA5C.3010506@enterprisedb.com |
In response to | Re: Variable length varlena headers redux (Gregory Stark <stark@enterprisedb.com>) |
Responses |
Re: Variable length varlena headers redux
Re: Variable length varlena headers redux |
List | pgsql-hackers |
Gregory Stark wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> For example it'd be easy to implement the previously-discussed design
>> involving storing uncompressed length words in network byte order:
>> SET_VARLENA_LEN does htonl() and VARSIZE does ntohl() and nothing else in
>> the per-datatype functions needs to change. Another idea that we were
>> kicking around is to make an explicit distinction between little-endian and
>> big-endian hardware: on big-endian hardware, store the two TOAST flag bits
>> in the MSBs as now, but on little-endian, store them in the LSBs, shifting
>> the length value up two bits. This would probably be marginally faster than
>> htonl/ntohl depending on hardware and compiler intelligence, but either way
>> you get to guarantee that the flag bits are in the physically first byte,
>> which is the critical thing needed to be able to tell the difference between
>> compressed and uncompressed length values.
>
> Actually I think neither htonl nor bitshifting the entire 4-byte word is going
> to really work here. Both will require 4-byte alignment. Instead I think we
> have to access the length byte by byte as a (char*) and do arithmetic. Since
> it's the pointer being passed to VARSIZE that isn't too hard, but it might
> perform poorly.

We would still require all datums with a 4-byte header to be 4-byte
aligned, right? When reading, you would first check if it's a compressed
or uncompressed header. If compressed, read the 1-byte header; if
uncompressed, read the 4-byte header and do htonl or bitshifting. No
need to do htonl or bitshifting on unaligned datums.

>> The important point here is that VARSIZE() still works, so only code that
>> creates a new varlena value need be affected, not code that examines one.
>
> So what would VARSIZE() return, the size of the payload plus VARHDRSZ
> regardless of what actual size the header was? That seems like it would break
> the least existing code though removing all the VARHDRSZ offsets seems like it
> would be cleaner.

My vote would be to change every caller. Though there's a lot of
callers, it's a very simple change.

To make it possible to compile an external module against 8.2 and 8.3,
you could have a simple ifdef block to map the new macro to old
behavior. Or we could backport the macro definitions as Magnus suggested.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
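
To make the read path described above concrete, here is a rough sketch in C. It is not the PostgreSQL implementation, and the bit layout is an assumption chosen purely for illustration: the top bit of the first byte marks a 1-byte header whose low 7 bits hold the total size, and anything else is a 4-byte header stored in network byte order with the top two bits reserved for flags. All demo_* and DEMO_* names are hypothetical, and <arpa/inet.h> assumes a POSIX platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl, ntohl (POSIX) */

/*
 * Hypothetical layout, for illustration only (not the PostgreSQL format):
 *   first byte has 0x80 set -> 1-byte header, low 7 bits = total size
 *   otherwise               -> 4-byte header in network byte order,
 *                              top 2 bits are flags, low 30 bits = total size
 */
#define DEMO_SHORT_FLAG 0x80u
#define DEMO_SHORT_MASK 0x7Fu
#define DEMO_LONG_MASK  0x3FFFFFFFu

/* Header width is decided by looking only at the physically first byte. */
static size_t demo_header_size(const void *datum)
{
    const unsigned char *p = datum;
    return (p[0] & DEMO_SHORT_FLAG) ? 1 : 4;
}

/* Total size (header plus payload) of a datum in the assumed layout. */
static size_t demo_varsize(const void *datum)
{
    const unsigned char *p = datum;
    uint32_t word;

    /* Short form: a single byte, so alignment and byte order do not matter. */
    if (p[0] & DEMO_SHORT_FLAG)
        return p[0] & DEMO_SHORT_MASK;

    /*
     * Long form: only here do we touch a 4-byte word. If such datums are
     * always 4-byte aligned, a compiler can turn this memcpy into one load.
     */
    memcpy(&word, p, sizeof word);
    return ntohl(word) & DEMO_LONG_MASK;
}

/* Write a 1-byte header; total must fit in 7 bits. */
static void demo_set_short(void *datum, size_t total)
{
    assert(total >= 1 && total <= DEMO_SHORT_MASK);
    *(unsigned char *)datum = (unsigned char)(DEMO_SHORT_FLAG | total);
}

/* Write a 4-byte header in network byte order; flag bits are left clear. */
static void demo_set_long(void *datum, uint32_t total)
{
    uint32_t word;

    assert(total >= 4 && total <= DEMO_LONG_MASK);
    word = htonl(total);
    memcpy(datum, &word, sizeof word);
}
```

The point of the sketch is that the byte-order conversion happens only on the 4-byte path, so a short-header datum at an odd address never needs a word-sized access.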