Re: Variable length varlena headers redux - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Variable length varlena headers redux
Date	February 13, 2007 10:26:09
Msg-id	45D1CA5C.3010506@enterprisedb.com
In response to	Re: Variable length varlena headers redux (Gregory Stark <stark@enterprisedb.com>)
Responses	Re: Variable length varlena headers redux Re: Variable length varlena headers redux
List	pgsql-hackers

Gregory Stark wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> For example it'd be easy to implement the previously-discussed design
>> involving storing uncompressed length words in network byte order:
>> SET_VARLENA_LEN does htonl() and VARSIZE does ntohl() and nothing else in
>> the per-datatype functions needs to change. Another idea that we were
>> kicking around is to make an explicit distinction between little-endian and
>> big-endian hardware: on big-endian hardware, store the two TOAST flag bits
>> in the MSBs as now, but on little-endian, store them in the LSBs, shifting
>> the length value up two bits. This would probably be marginally faster than
>> htonl/ntohl depending on hardware and compiler intelligence, but either way
>> you get to guarantee that the flag bits are in the physically first byte,
>> which is the critical thing needed to be able to tell the difference between
>> compressed and uncompressed length values.
> 
> Actually I think neither htonl nor bitshifting the entire 4-byte word is going
> to really work here. Both will require 4-byte alignment. Instead I think we
> have to access the length byte by byte as a (char*) and do arithmetic. Since
> it's the pointer being passed to VARSIZE that isn't too hard, but it might
> perform poorly.

We would still require all datums with a 4-byte header to be 4-byte
aligned, right? When reading, you would first check whether it's a
compressed or uncompressed header. If compressed, read the 1-byte header;
if uncompressed, read the 4-byte header and do ntohl or bitshifting.
There's no need to do either on unaligned datums.
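
A minimal sketch of that read path, under assumed details: the bit layout
(high bit of the first byte marks a 1-byte header, low 30 bits of the
4-byte word hold the length) and every sketch_/SKETCH_ name are
hypothetical and chosen only for illustration, not the format proposed in
this thread. Only the first byte is inspected before deciding how to read
the rest, so the 4-byte load happens only for datums that are required to
be aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>              /* htonl, ntohl */

/* Hypothetical layout, for illustration only. */
#define SKETCH_SHORT_FLAG 0x80u       /* set in first byte => 1-byte header */
#define SKETCH_SHORT_MASK 0x7Fu       /* length bits of a 1-byte header */
#define SKETCH_LEN_MASK   0x3FFFFFFFu /* length bits of a 4-byte header */

/* Store a 4-byte header in network byte order (flag bits left clear). */
static void
sketch_set_long_len(unsigned char *ptr, uint32_t len)
{
    uint32_t word;

    assert(len <= SKETCH_LEN_MASK);
    word = htonl(len);
    memcpy(ptr, &word, sizeof(word));
}

/* Return the stored length, whichever header form is present. */
static uint32_t
sketch_varsize(const unsigned char *ptr)
{
    uint32_t word;

    /* The first byte alone tells us which header form we have. */
    if (ptr[0] & SKETCH_SHORT_FLAG)
        return ptr[0] & SKETCH_SHORT_MASK;

    /*
     * 4-byte header: such datums are assumed to be 4-byte aligned.
     * memcpy is used here only to keep the sketch portable.
     */
    memcpy(&word, ptr, sizeof(word));
    return ntohl(word) & SKETCH_LEN_MASK;
}
```

Because the length word is stored in network byte order, the flag bits sit
in the physically first byte on both big-endian and little-endian hardware,
which is what lets the one-byte check come first.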

>> The important point here is that VARSIZE() still works, so only code that
>> creates a new varlena value need be affected, not code that examines one.
> 
> So what would VARSIZE() return, the size of the payload plus VARHDRSZ
> regardless of what actual size the header was? That seems like it would break
> the least existing code though removing all the VARHDRSZ offsets seems like it
> would be cleaner.

My vote would be to change every caller. Though there are a lot of
callers, it's a very simple change.

To make it possible to compile an external module against both 8.2 and
8.3, you could have a simple ifdef block to map the new macro to the old
behavior. Or we could backport the macro definitions as Magnus suggested.
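
Such an ifdef block might look roughly like the sketch below. It is
self-contained for illustration: sketch_varlena and SKETCH_OLD_SIZEP are
hypothetical stand-ins for the old-style header struct and its
direct-assignment accessor, and SET_VARLENA_LEN is simply the macro name
used earlier in this thread, not necessarily the final one. A real module
would rely on the server headers instead of these stand-ins.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an old-style varlena with a plain length word. */
typedef struct
{
    int32_t vl_len;     /* total size, header included */
    char    vl_dat[1];  /* payload starts here */
} sketch_varlena;

#define SKETCH_VARHDRSZ ((int32_t) sizeof(int32_t))

/* Old behavior: the length word is assigned directly. */
#define SKETCH_OLD_SIZEP(ptr) (((sketch_varlena *) (ptr))->vl_len)

/*
 * Compatibility shim: if the headers being compiled against do not
 * provide the new macro, map it onto the old direct assignment.
 */
#ifndef SET_VARLENA_LEN
#define SET_VARLENA_LEN(ptr, len) (SKETCH_OLD_SIZEP(ptr) = (len))
#endif

/* Module code then uses only the new macro, whichever version it targets. */
static void
sketch_init(sketch_varlena *v, int32_t payload_len)
{
    assert(payload_len >= 0);
    SET_VARLENA_LEN(v, payload_len + SKETCH_VARHDRSZ);
}
```

With the shim in place, the module source is written once against the new
macro, and only the ifdef block knows about the older release.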

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com
