Re: Variable length varlena headers redux - Mailing list pgsql-hackers
| From | Gregory Stark |
|---|---|
| Subject | Re: Variable length varlena headers redux |
| Date | |
| Msg-id | 871wku7znq.fsf@stark.xeocode.com |
| In response to | Re: Variable length varlena headers redux (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Variable length varlena headers redux |
| List | pgsql-hackers |
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> The point I'm trying to get across here is to do things one small step
> at a time; if you insist on a "big bang" patch then it'll never get
> done. You might want to go back and review the CVS history for some
> other big changes like TOAST and the version-1 function-call protocol
> to see our previous uses of this approach.
Believe me, I'm not looking for a "big bang" approach. It's just that I've
found problems with each of the incremental approaches. I suppose it's
inevitable that "big bang" approaches always seem the most appealing, since
they automatically involve fixing anything that could be a problem.
You keep suggesting things that I've previously considered and rejected --
perhaps prematurely. Off the top of my head I recall the following four
options from our discussions. It looks like we're circling around option 4.
As long as we're willing to live with the palloc/memcpy overhead, at least for
now, given that we can reduce it by whittling away at the sites that use the
old macros, that seems like a good compromise and the shortest development
path, except perhaps for option 1.
And I take it you're not worried about sites that might not detoast a datum, or
might detoast one in the wrong memory context, where previously they were
guaranteed no copy would be generated? In particular I'm worried about the
btree code and plpgsql row/record handling.
I'm not sure what to do about the alignment issue. We could simply never align
1-byte headers. That would work just fine as long as the data types that need
alignment aren't ported to the new macros. It seems silly to waste space on
disk in order to save a CPU memcpy that we aren't even going to be saving for
now anyway.
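For concreteness, here's roughly the kind of header test being discussed. This is only a sketch, not any existing macro: the flag bit, the 7-bit short length, and the network-order long header are all assumptions pulled from the options below. Note that reading byte-at-a-time like this imposes no alignment requirement, which is what would let us get away with never aligning 1-byte headers:

```c
#include <stddef.h>

/* Hypothetical 1-byte/4-byte varlena header scheme, for illustration
 * only.  If the high bit of the first byte is set, the low 7 bits hold
 * the total length (header included); otherwise all four bytes hold the
 * length, most significant byte first, so the flag bit stays
 * unambiguous (the network-byte-order idea from option 4). */
#define SHORT_FLAG 0x80

static size_t var_size(const unsigned char *p)
{
    if (p[0] & SHORT_FLAG)
        return p[0] & 0x7F;                 /* 1-byte header */
    /* byte-at-a-time read: works at any alignment */
    return ((size_t) p[0] << 24) | ((size_t) p[1] << 16)
         | ((size_t) p[2] << 8)  |  (size_t) p[3];
}

static const unsigned char *var_data(const unsigned char *p)
{
    return (p[0] & SHORT_FLAG) ? p + 1 : p + 4;
}
```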
Option 1)
We detect cases where the typmod guarantees either a fixed size or a maximum
size < 256 bytes. In that case, instead of copying the typlen from pg_type
into the tupledesc and the table's attlen, we use the implied attlen, or store
-3 to indicate single-byte headers everywhere.
Cons: This implies pallocing and memcpying the datum in heap_deform_tuple.
It doesn't help text or varchar(xxxx) at all even if we mostly store small data in them.
Pros: It buys us 0-byte headers for things like numeric(1,0) and even char(n) in single-byte encodings. It buys us
1-byte headers for most numerics and char(n) or varchar(n).
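The typmod test option 1 describes could look something like this. The VARHDRSZ offset inside the typmod follows PostgreSQL's existing convention for varchar; the -3 attlen code is the proposal's, not anything that exists today, and treating n characters as n bytes is a simplification that only holds for single-byte encodings:

```c
#include <stdint.h>

#define VARHDRSZ 4

/* Sketch: pick an attlen for varchar(n) from its typmod.  varchar
 * stores n + VARHDRSZ in the typmod; -1 means unconstrained.  If the
 * maximum size is known to fit in a single-byte header, the tupledesc
 * could store the proposed -3 code instead of the usual -1. */
static int attlen_for_varchar(int32_t typmod)
{
    if (typmod >= 0) {
        int32_t maxchars = typmod - VARHDRSZ;
        if (maxchars >= 0 && maxchars < 256)
            return -3;   /* proposed: single-byte headers */
    }
    return -1;           /* ordinary 4-byte-header varlena */
}
```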
Option 2)
We have heap_form_tuple/heap_deform_tuple compress and uncompress the headers.
The rule is that headers are always compressed in a tuple and never compressed
when at the end of a datum.
Cons: This implies pallocing and memcpying the datum in heap_deform_tuple.
It requires restricting the range of 1-byte headers to 127 or 63 bytes, and it always uses 1 byte even for fixed-size
data. (We could get 0-byte headers for a small range (ascii characters and numeric integers up to 127) but then
1-byte headers would be down to 31 bytes of data.)
It implies a different varlena format on disk and in memory.
Pros: It works for text/varchar as long as we store small data.
It lets us use network byte order or some other format for the on-disk headers without changing the macros that
access in-memory headers.
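The compress step heap_form_tuple would apply under option 2 might look roughly like this. Both formats here are assumptions for the sketch: a plain native-order uint32 total length as the in-memory header, and a flag-bit single byte as the on-disk header for small data; neither is an existing PostgreSQL format:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of option 2's header compression.  Assumed in-memory format:
 * 4-byte native-order total length (header included) followed by the
 * data.  Assumed on-disk format: total lengths up to 127 collapse into
 * one byte with the high bit set; anything larger keeps the 4-byte
 * header unchanged.  Returns the number of bytes written to out. */
static size_t compress_header(const unsigned char *in, unsigned char *out)
{
    uint32_t total;
    memcpy(&total, in, 4);           /* read the in-memory header */
    uint32_t datalen = total - 4;

    if (datalen + 1 <= 127) {        /* fits a 1-byte header */
        out[0] = 0x80 | (unsigned char) (datalen + 1);
        memcpy(out + 1, in + 4, datalen);
        return 1 + datalen;
    }
    memcpy(out, in, total);          /* too big: stored uncompressed */
    return total;
}
```

heap_deform_tuple would run the inverse, which is exactly where the palloc/memcpy cost listed under Cons comes from.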
Option 3)
We have the toaster (or heap_form_tuple as a shortcut) compress the headers
but delay decompression until pg_detoast_datum. tuples only contain compressed
headers but Datums sometimes point to compressed headers and sometimes
uncompressed headers just as they sometimes point to toasted data and
sometimes detoasted data.
Cons: This still implies pallocing and memcpying the datum, at least for now.
There could be cases where code expects to deform_tuple and be guaranteed a non-toasted pointer into the
tuple.
It requires replacing VARATT_SIZEP with SET_VARLENA_LEN().
Pros: It allows future macros to examine the compressed datum without decompressing it.
Option 4)
We have compressed data constructed on the fly everywhere possible.
Cons: It requires replacing VARATT_SIZEP and also requires hacking VARDATA to find the data in the appropriate place.
It might need an additional pair of macros for backwards compatibility in code that really needs to construct a
4-byte-headered varlena.
There is fragility, with the risk of VARSIZE / VARDATA being filled in out of order.
It requires changing the header to be in network byte order all the time.
Pros: One consistent representation for varlena everywhere.
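The macro replacement mentioned under options 3 and 4 could be as small as this. The name and the always-network-order encoding are assumptions taken from the discussion above, not existing code:

```c
#include <stdint.h>

/* Sketch of a SET_VARLENA_LEN()-style setter replacing the writable
 * VARATT_SIZEP lvalue.  Because callers go through one function rather
 * than assigning to a struct field, the header encoding (here: always
 * most-significant byte first, high bit reserved as a short-header
 * flag) can change later without touching call sites, which is what
 * makes options 3 and 4 incremental. */
static void set_varlena_len(unsigned char *p, uint32_t total_len)
{
    p[0] = (total_len >> 24) & 0x7F;     /* flag bit kept clear */
    p[1] = (total_len >> 16) & 0xFF;
    p[2] = (total_len >> 8)  & 0xFF;
    p[3] =  total_len        & 0xFF;
}

static uint32_t get_varlena_len(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
}
```

Funneling every length write through one call is also what makes the "filled in out of order" fragility detectable: the setter is the single place where an encoding check could live.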
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com