Thread: Combo xids
Currently, we hold xmin and xmax for each tuple. For xmax, we have the multixact mechanism that allows us to represent an array of xids with just a single pseudo xid. So why not hold xmin and xmax as part of a multixact? Instead of caching two xids on each tuple, why not hold just one and allow access to the array of xids via multixact? An idea very similar to combo cids, hence the name combo xids. When would this make sense? Frequently. Most of the time a tuple needs only one xid set. In most cases, we set xmin and xmax a long time apart. Very few cases end with both of them set inside the *same* xmin horizon. In a heavy transactional enviroment, the horizon moves forwards quickly, on the order of a few seconds. Very few rows get inserted and then updated/deleted that quickly. With long reporting queries, data tends to be updated less, so again the rows aren't touched within the same horizon. As a result, we hardly ever need both xmin and xmax at the same time - when we need to set xmax, xmin is already committed/cleaned. What is the benefit? Merging xmin/xmax would save 4 bytes per row. On servers with 8 byte word length, that means that we'd save 8 bytes per row for tables that have between 9 and 40 columns. Which is the majority of tables. How? Clearly this would require changes to tuple format and therefore not something I would suggest for this year ahead, but seems worth getting the idea down, even if it gets ruled out. Multixact is now persistent, so this change wouldn't take much to implement. Given that it would require a change in in-disk format, we might also consider that it could be possible to cache multixactid data on the disk blocks that need it. A list of multixactids could be stored in amongst the tuples on block, with the head of the list being a 2 byte addition to the block header. Which could be used to reduce the overhead of multixactid use. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Jun 01, 2013 at 09:22:05AM +0100, Simon Riggs wrote: > When would this make sense? > Frequently. Most of the time a tuple needs only one xid set. In most > cases, we set xmin and xmax a long time apart. Very few cases end with > both of them set inside the *same* xmin horizon. In a heavy > transactional enviroment, the horizon moves forwards quickly, on the > order of a few seconds. Very few rows get inserted and then > updated/deleted that quickly. With long reporting queries, data tends > to be updated less, so again the rows aren't touched within the same > horizon. As a result, we hardly ever need both xmin and xmax at the > same time - when we need to set xmax, xmin is already > committed/cleaned. Is this really true? Consider a long running query A and a tuple created by B after A. If another transaction comes to update B you can't throw away the xmin because you need it to prove that A can't see the tuple. Or is the idea to create multixacts for each combination of xmin/xmax encountered? And the assumption is that there aren't that many? That could be measured. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
On 01.06.2013 11:22, Simon Riggs wrote: > What is the benefit? > Merging xmin/xmax would save 4 bytes per row. On servers with 8 byte > word length, that means that we'd save 8 bytes per row for tables that > have between 9 and 40 columns. Which is the majority of tables. Hmm, it would probably be much easier to squeeze, say, one byte from the tuple header, than full four bytes. Then you could store store a null bitmap for upto 16 columns, without crossing the 24 byte mark. That would already get you much of the benefit. - Heikki
On 1 June 2013 19:25, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > Hmm, it would probably be much easier to squeeze, say, one byte from the > tuple header, than full four bytes. Then you could store store a null bitmap > for upto 16 columns, without crossing the 24 byte mark. That would already > get you much of the benefit. It seemed worth recording the idea, though I agree that now its recorded its not the best idea for reducing table size. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services