Re: Variable length varlena headers redux - Mailing list pgsql-hackers
| From | Gregory Stark |
|---|---|
| Subject | Re: Variable length varlena headers redux |
| Date | |
| Msg-id | 871wku7znq.fsf@stark.xeocode.com |
| In response to | Re: Variable length varlena headers redux (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Variable length varlena headers redux |
| List | pgsql-hackers |
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> The point I'm trying to get across here is to do things one small step
> at a time; if you insist on a "big bang" patch then it'll never get
> done. You might want to go back and review the CVS history for some
> other big changes like TOAST and the version-1 function-call protocol
> to see our previous uses of this approach.
Believe me, I'm not looking for a "big bang" approach. It's just that I've
found problems with each of the incremental approaches. I suppose it's
inevitable that "big bang" approaches always seem the most appealing, since
they automatically involve fixing anything that could be a problem.
You keep suggesting things that I've previously considered and rejected --
perhaps prematurely. Off the top of my head I recall the following four
options from our discussions. It looks like we're circling around option 4.
As long as we're willing to live with the palloc/memcpy overhead, at least for
now, given that we can reduce it by whittling away at the sites that use the
old macros, that seems like a good compromise and the shortest development
path, except perhaps for option 1.
And I take it you're not worried about sites that might not detoast a datum, or
might detoast one in the wrong memory context, where previously they were
guaranteed no copy would be generated? In particular I'm worried about the
btree code and plpgsql row/record handling.
I'm not sure what to do about the alignment issue. We could simply never align
1-byte headers. That would work just fine as long as the data types that need
alignment aren't ported to the new macros. It seems silly to waste space on
disk in order to save a CPU memcpy that we aren't even going to be saving for
now anyway.
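For concreteness, here's roughly the kind of header test being discussed. This is only a sketch, not any existing macro: the flag bit, the 7-bit short length, and the network-order long header are all assumptions pulled from the options below. Note that reading byte-at-a-time like this imposes no alignment requirement, which is what would let us get away with never aligning 1-byte headers:

```c
#include <stddef.h>

/* Hypothetical 1-byte/4-byte varlena header scheme, for illustration
 * only.  If the high bit of the first byte is set, the low 7 bits hold
 * the total length (header included); otherwise all four bytes hold the
 * length, most significant byte first, so the flag bit stays
 * unambiguous (the network-byte-order idea from option 4). */
#define SHORT_FLAG 0x80

static size_t var_size(const unsigned char *p)
{
    if (p[0] & SHORT_FLAG)
        return p[0] & 0x7F;                 /* 1-byte header */
    /* byte-at-a-time read: works at any alignment */
    return ((size_t) p[0] << 24) | ((size_t) p[1] << 16)
         | ((size_t) p[2] << 8)  |  (size_t) p[3];
}

static const unsigned char *var_data(const unsigned char *p)
{
    return (p[0] & SHORT_FLAG) ? p + 1 : p + 4;
}
```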
Option 1)
We detect cases where the typmod guarantees either a fixed size or a maximum
size < 256 bytes. In that case, instead of copying the typlen from pg_type
into the tupledesc and the table's attlen, we use the implied attlen, or store
-3 to indicate single-byte headers everywhere.
Cons: This implies pallocing and memcpying the datum in heap_deform_tuple.
It doesn't help text or varchar(xxxx) at all even if we mostly store small data in them.
Pros: It buys us 0-byte headers for things like numeric(1,0) and even char(n) in single-byte encodings. It buys us
1-byte headers for most numerics and char(n) or varchar(n).
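The typmod test option 1 describes could look something like this. The VARHDRSZ offset inside the typmod follows PostgreSQL's existing convention for varchar; the -3 attlen code is the proposal's, not anything that exists today, and treating n characters as n bytes is a simplification that only holds for single-byte encodings:

```c
#include <stdint.h>

#define VARHDRSZ 4

/* Sketch: pick an attlen for varchar(n) from its typmod.  varchar
 * stores n + VARHDRSZ in the typmod; -1 means unconstrained.  If the
 * maximum size is known to fit in a single-byte header, the tupledesc
 * could store the proposed -3 code instead of the usual -1. */
static int attlen_for_varchar(int32_t typmod)
{
    if (typmod >= 0) {
        int32_t maxchars = typmod - VARHDRSZ;
        if (maxchars >= 0 && maxchars < 256)
            return -3;   /* proposed: single-byte headers */
    }
    return -1;           /* ordinary 4-byte-header varlena */
}
```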
Option 2)
We have heap_form_tuple/heap_deform_tuple compress and uncompress the headers.
The rule is that headers are always compressed in a tuple and never compressed
when at the end of a datum.
Cons: This implies pallocing and memcpying the datum in heap_deform_tuple.
It requires restricting the range of 1-byte headers to 127 or 63 bytes, and it always uses 1 byte even for fixed-size
data. (We could get 0-byte headers for a small range (ascii characters and numeric integers up to 127) but then
1-byte headers would be down to 31 bytes of data.)
It implies a different varlena format on disk and in memory.
Pros: It works for text/varchar as long as we store small data.
It lets us use network byte order or some other format for the on-disk headers without changing the macros that
access in-memory headers.
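The compress step heap_form_tuple would apply under option 2 might look roughly like this. Both formats here are assumptions for the sketch: a plain native-order uint32 total length as the in-memory header, and a flag-bit single byte as the on-disk header for small data; neither is an existing PostgreSQL format:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of option 2's header compression.  Assumed in-memory format:
 * 4-byte native-order total length (header included) followed by the
 * data.  Assumed on-disk format: total lengths up to 127 collapse into
 * one byte with the high bit set; anything larger keeps the 4-byte
 * header unchanged.  Returns the number of bytes written to out. */
static size_t compress_header(const unsigned char *in, unsigned char *out)
{
    uint32_t total;
    memcpy(&total, in, 4);           /* read the in-memory header */
    uint32_t datalen = total - 4;

    if (datalen + 1 <= 127) {        /* fits a 1-byte header */
        out[0] = 0x80 | (unsigned char) (datalen + 1);
        memcpy(out + 1, in + 4, datalen);
        return 1 + datalen;
    }
    memcpy(out, in, total);          /* too big: stored uncompressed */
    return total;
}
```

heap_deform_tuple would run the inverse, which is exactly where the palloc/memcpy cost listed under Cons comes from.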
Option 3)
We have the toaster (or heap_form_tuple as a shortcut) compress the headers
but delay decompression until pg_detoast_datum. tuples only contain compressed
headers but Datums sometimes point to compressed headers and sometimes
uncompressed headers just as they sometimes point to toasted data and
sometimes detoasted data.
Cons: This still implies pallocing and memcpying the datum, at least for now.
There could be cases where code expects to deform_tuple and be guaranteed a non-toasted pointer into the
tuple.
It requires replacing VARATT_SIZEP with SET_VARLENA_LEN().
Pros: It allows future macros to examine the compressed datum without decompressing it.
Option 4)
We have compressed data constructed on the fly everywhere possible.
Cons: It requires replacing VARATT_SIZEP and also requires hacking VARDATA to find the data in the appropriate place.
It might need an additional pair of macros for backwards compatibility in code that really needs to construct a
4-byte-headered varlena.
There is fragility, with the risk of VARSIZE / VARDATA being filled in out of order.
It requires changing the header to be in network byte order all the time.
Pros: One consistent representation for varlena everywhere.
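The macro replacement mentioned under options 3 and 4 could be as small as this. The name and the always-network-order encoding are assumptions taken from the discussion above, not existing code:

```c
#include <stdint.h>

/* Sketch of a SET_VARLENA_LEN()-style setter replacing the writable
 * VARATT_SIZEP lvalue.  Because callers go through one function rather
 * than assigning to a struct field, the header encoding (here: always
 * most-significant byte first, high bit reserved as a short-header
 * flag) can change later without touching call sites, which is what
 * makes options 3 and 4 incremental. */
static void set_varlena_len(unsigned char *p, uint32_t total_len)
{
    p[0] = (total_len >> 24) & 0x7F;     /* flag bit kept clear */
    p[1] = (total_len >> 16) & 0xFF;
    p[2] = (total_len >> 8)  & 0xFF;
    p[3] =  total_len        & 0xFF;
}

static uint32_t get_varlena_len(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
}
```

Funneling every length write through one call is also what makes the "filled in out of order" fragility detectable: the setter is the single place where an encoding check could live.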
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com