Re: Variable length varlena headers redux - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: Variable length varlena headers redux |
Date | |
Msg-id | 200702090535.l195Z8C07463@momjian.us |
In response to | Re: Variable length varlena headers redux (Bruce Momjian <bruce@momjian.us>) |
Responses | Re: Variable length varlena headers redux |
List | pgsql-hackers |
Bruce Momjian wrote:
>
> Uh, I thought the approach was to create type-specific in/out functions,
> and add casting so every time they were referenced, they would expand
> to a varlena structure in memory.

Oh, one more thing. You are going to need to teach the code that walks
through a tuple's attributes about the short header types. I think you
should set pg_type.typlen = -3 (vs -1 for varlena) and put your macro
code there too. (As an example, see the macro att_addlength().)

I know it is kind of odd to have a data type that is only used on disk,
and not in memory, but I see this as a baby varlena type, used only to
store and get varlena values using less disk space.

---------------------------------------------------------------------------

>
> Gregory Stark wrote:
> >
> > I've been looking at this again and had a few conversations about it. This may
> > be easier than I had originally thought but there's one major issue that's
> > bugging me. Do you see any way to avoid having every user function everywhere
> > use a new macro api instead of VARDATA/VARATT_DATA and VARSIZE/VARATT_SIZEP?
> >
> > The two approaches I see are either
> >
> > a) To have two sets of macros, one of which, VARATT_DATA and VARATT_SIZEP are
> > for constructing new tuples and behaves exactly as it does now. So you always
> > construct a four-byte header datum. Then in heap_form*tuple we check if you
> > can use a shorter header and convert. VARDATA/VARSIZE would be for looking at
> > existing datums and would interpret the header bits.
> >
> > This seems very fragile since one stray call site using VARATT_DATA to find
> > the data in an existing datum would cause random bugs that only occur rarely
> > in certain circumstances. It would even work as long as the size is filled in
> > with VARATT_SIZEP first which it usually is, but fail if someone changes the
> > order of the statements.
> >
> > or
> >
> > b) throw away VARATT_DATA and VARATT_SIZEP and make all user functions
> > everywhere change over to a new macro api. That seems like a pretty big
> > burden. It's safer but means every contrib module would have to be updated and
> > so on.
> >
> > I'm hoping I'm missing something and there's a way to do this without breaking
> > the api for every user function.
> >
> >
> > -- Start of included mail From: Tom Lane <tgl@sss.pgh.pa.us>
> > > To: Gregory Stark <stark@enterprisedb.com>
> > cc: Gregory Stark <gsstark@mit.edu>, Bruce Momjian <bruce@momjian.us>,
> >     Peter Eisentraut <peter_e@gmx.net>, pgsql-hackers@postgresql.org,
> >     Martijn van Oosterhout <kleptog@svana.org>
> > Subject: Re: [HACKERS] Fixed length data types issue
> > Date: Mon, 11 Sep 2006 13:15:43 -0400
> > Lines: 64
> > Xref: stark.xeocode.com work.enterprisedb:683
> > > Gregory Stark <stark@enterprisedb.com> writes:
> > > In any case it seems a bit backwards to me. Wouldn't it be better to
> > > preserve bits in the case of short length words where they're precious
> > > rather than long ones? If we make 0xxxxxxx the 1-byte case it means ...
> >
> > Well, I don't find that real persuasive: you're saying that it's
> > important to have a 1-byte not 2-byte header for datums between 64 and
> > 127 bytes long. Which is by definition less than a 2% savings for those
> > values. I think it's more important to pick bitpatterns that reduce
> > the number of cases heap_deform_tuple has to think about while decoding
> > the length of a field --- every "if" in that inner loop is expensive.
> >
> > I realized this morning that if we are going to preserve the rule that
> > 4-byte-header and compressed-header cases can be distinguished from the
> > data alone, there is no reason to be very worried about whether the
> > 2-byte cases can represent the maximal length of an in-line datum.
> > If you want to do 16K inline (and your page is big enough for that)
> > you can just fall back to the 4-byte-header case. So there's no real
> > disadvantage if the 2-byte headers can only go up to 4K or so. This
> > gives us some more flexibility in the bitpattern choices.
> >
> > Another thought that occurred to me is that if we preserve the
> > convention that a length word's value includes itself, then for a
> > 1-byte header the bit pattern 10000000 is meaningless --- the count
> > has to be at least 1. So one trick we could play is to take over
> > this value as the signal for "toast pointer follows", with the
> > assumption that the tuple-decoder code knows a-priori how big a
> > toast pointer is. I am not real enamored of this, because it certainly
> > adds one case to the inner heap_deform_tuple loop and it'll give us
> > problems if we ever want more than one kind of toast pointer. But
> > it's a possibility.
> >
> > Anyway, a couple of encodings that I'm thinking about now involve
> > limiting uncompressed data to 1G (same as now), so that we can play
> > with the first 2 bits instead of just 1:
> >
> > 00xxxxxx  4-byte length word, aligned, uncompressed data (up to 1G)
> > 01xxxxxx  4-byte length word, aligned, compressed data (up to 1G)
> > 100xxxxx  1-byte length word, unaligned, TOAST pointer
> > 1010xxxx  2-byte length word, unaligned, uncompressed data (up to 4K)
> > 1011xxxx  2-byte length word, unaligned, compressed data (up to 4K)
> > 11xxxxxx  1-byte length word, unaligned, uncompressed data (up to 63b)
> >
> > or
> >
> > 00xxxxxx  4-byte length word, aligned, uncompressed data (up to 1G)
> > 010xxxxx  2-byte length word, unaligned, uncompressed data (up to 8K)
> > 011xxxxx  2-byte length word, unaligned, compressed data (up to 8K)
> > 10000000  1-byte length word, unaligned, TOAST pointer
> > 1xxxxxxx  1-byte length word, unaligned, uncompressed data (up to 127b)
> >           (xxxxxxx not all zero)
> >
> > This second choice allows longer datums in both the 1-byte and 2-byte
> > header formats, but it hardwires the length of a TOAST pointer and
> > requires four cases to be distinguished in the inner loop; the first
> > choice only requires three cases, because TOAST pointer and 1-byte
> > header can be handled by the same rule "length is low 6 bits of byte".
> > The second choice also loses the ability to store in-line compressed
> > data above 8K, but that's probably an insignificant loss.
> >
> > There's more than one way to do it ...
> >
> > regards, tom lane
> > > -- End of included mail.
> >
> > >
> > --
> > Gregory Stark
> > EnterpriseDB http://www.enterprisedb.com
>
> --
> Bruce Momjian <bruce@momjian.us> http://momjian.us
> EnterpriseDB http://www.enterprisedb.com
>
> + If your life is a hard drive, Christ can be your backup. +
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

--
  Bruce Momjian <bruce@momjian.us> http://momjian.us
  EnterpriseDB http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
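
For readers following the bit patterns, here is a rough sketch of how the
first encoding quoted above might be decoded with only three branches. It
is not PostgreSQL code: the names decode_header and DecodedHeader are
made up for illustration, and it assumes the header is stored
most-significant byte first so that the tag bits land in the first byte.
The length returned is the raw stored value (which, under the convention
discussed in the quoted mail, would include the header itself).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
typedef struct
{
    unsigned header_bytes;   /* 1, 2 or 4 */
    int      compressed;     /* nonzero if the tag marks compressed data */
    int      toast_pointer;  /* nonzero if the tag marks a TOAST pointer */
    uint32_t length;         /* raw length field taken from the header */
} DecodedHeader;

/*
 * Decode a header laid out per the first proposed encoding:
 *
 *   00xxxxxx  4-byte header, uncompressed
 *   01xxxxxx  4-byte header, compressed
 *   100xxxxx  1-byte header, TOAST pointer
 *   1010xxxx  2-byte header, uncompressed
 *   1011xxxx  2-byte header, compressed
 *   11xxxxxx  1-byte header, uncompressed
 *
 * Assumption: the header bytes are stored most-significant byte first.
 */
DecodedHeader
decode_header(const uint8_t *p)
{
    DecodedHeader h = {0, 0, 0, 0};
    uint8_t       b;

    assert(p != NULL);
    b = p[0];

    if ((b & 0x80) == 0)
    {
        /* Case 1: 0?xxxxxx, 4-byte header with a 30-bit length. */
        h.header_bytes = 4;
        h.compressed = (b & 0x40) != 0;
        h.length = ((uint32_t) (b & 0x3F) << 24) |
                   ((uint32_t) p[1] << 16) |
                   ((uint32_t) p[2] << 8) |
                   (uint32_t) p[3];
    }
    else if ((b & 0xE0) == 0xA0)
    {
        /* Case 2: 101?xxxx, 2-byte header with a 12-bit length. */
        h.header_bytes = 2;
        h.compressed = (b & 0x10) != 0;
        h.length = ((uint32_t) (b & 0x0F) << 8) | (uint32_t) p[1];
    }
    else
    {
        /*
         * Case 3: 100xxxxx or 11xxxxxx, 1-byte header. Both share the
         * rule "length is low 6 bits of byte"; the second-highest bit
         * tells a TOAST pointer (0) from inline data (1).
         */
        h.header_bytes = 1;
        h.toast_pointer = (b & 0x40) == 0;
        h.length = b & 0x3F;
    }
    return h;
}
```

The TOAST-pointer and short-inline cases fall into the same final branch,
which is the property the quoted mail cites as the reason the first
encoding needs only three cases in the inner loop.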