Thread: [WIP] Reduce alignment requirements on 64-bit systems.
Hello all, Here is a proof-of-concept patch for reducing the alignment requirement for heap tuples on 64-bit systems. This patch passes the regression tests and a couple of other data sets I have thrown at it. I am hoping to get some early feedback on this patch so I have time to make any corrections, improvements, etc before the November commit-fest deadline. There are two visible savings in the system catalogs using this patch (size was gathered using \S+): Catalog Pre-Patch Size After Patch Size pg_depend 312 kB 296 kB pg_description 160 kB 152 kB One thing I am considering, but I am not sure if it is worth the code uglification, is to wrap the majority of these checks in #ifdefs so they only are used on 64-bit systems. This would completely eliminate the impact for this patch on 32-bit systems, which would not gain any benefit from this patch. Feedback welcome! Thanks, - Ryan
Attachment
Just a quick look. At first point. Your change introduces new page layout version. Which is not acceptable from my point of view for 8.4 (it add another complexity to inplace upgrade). And I guest that it maybe works fine on 64bits x86 but it will fail on SPARC and other machine which requires aligned data. Zdenek Ryan Bradetich napsal(a): > Hello all, > > Here is a proof-of-concept patch for reducing the alignment > requirement for heap tuples on 64-bit systems. > This patch passes the regression tests and a couple of other data sets > I have thrown at it. > > I am hoping to get some early feedback on this patch so I have time to > make any corrections, improvements, > etc before the November commit-fest deadline. > > There are two visible savings in the system catalogs using this patch > (size was gathered using \S+): > > Catalog Pre-Patch Size After Patch Size > > pg_depend 312 kB 296 kB > pg_description 160 kB 152 kB > > > One thing I am considering, but I am not sure if it is worth the code > uglification, is to wrap the majority of these > checks in #ifdefs so they only are used on 64-bit systems. This > would completely eliminate the impact for this > patch on 32-bit systems, which would not gain any benefit from this patch. > > Feedback welcome! > > Thanks, > > - Ryan > > > ------------------------------------------------------------------------ > > -- Zdenek Kotala Sun Microsystems Prague, Czech Republic http://sun.com/postgresql
Hello Zdenek, On Wed, Oct 8, 2008 at 10:59 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote: > Just a quick look. At first point. Your change introduces new page layout > version. Which is not acceptable from my point of view for 8.4 (it add I would like to see this patch (or some variant) go in if possible. Since the inplace upgrades a concern to you, is there anything I can do to help with the inplace upgrades to help offset the disruption this patch causes you? > another complexity to inplace upgrade). And I guest that it maybe works fine > on 64bits x86 but it will fail on SPARC and other machine which requires > aligned data. Did I miss something? My intention was to keep the data aligned so it should work on any platform. The patch checks the user-defined data to see if any column requires the double storage type. If the double storage type is required, it uses the MAXALIGN() macro which preserves the alignment for 64-bit data types. If no columns require the double storage type, then the data will be INTALIGN() which still preserves the alignment requirements. If I have a complete mis-understanding of this issue, please explain it to me and I will either fix it or withdraw the patch. Thanks for your feedback! - Ryan
Ryan Bradetich napsal(a): > Hello Zdenek, Hello Ryan, > > On Wed, Oct 8, 2008 at 10:59 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote: >> Just a quick look. At first point. Your change introduces new page layout >> version. Which is not acceptable from my point of view for 8.4 (it add > > I would like to see this patch (or some variant) go in if possible. > Since the inplace > upgrades a concern to you, is there anything I can do to help with the inplace > upgrades to help offset the disruption this patch causes you? Yaah, wait until 8.5 :-). However, currently there is no clear consensus which upgrade method is best. I hope that It will clear after Prato developers meeting. Until this meeting I cannot say more. >> another complexity to inplace upgrade). And I guest that it maybe works fine >> on 64bits x86 but it will fail on SPARC and other machine which requires >> aligned data. > > Did I miss something? My intention was to keep the data aligned so it > should work > on any platform. The patch checks the user-defined data to see if > any column requires > the double storage type. If the double storage type is required, it > uses the MAXALIGN() > macro which preserves the alignment for 64-bit data types. If no > columns require the > double storage type, then the data will be INTALIGN() which still > preserves the alignment > requirements. I overlooked 'd' test. Your idea seems to me reasonable. Maybe, you could test 'd' alignment only for NOT NULL values. > If I have a complete mis-understanding of this issue, > please explain it > to me and I will either fix it or withdraw the patch. The problem there is add_item which it is used for indexes as well and they has IndexTupleHeader structure. I'm not convenienceabout idea has two different alignment for items on page. I guess another problem is with MAX_TUPLE_CHUNK_SIZE which uses MAXALIGN for computing. It seems to me that toast chunk could waste a space anyway. And of course you should bump page layout version. I also suggest create function/macro to compute hoff and replace code with this function/macro. Zdenek -- Zdenek Kotala Sun Microsystems Prague, Czech Republic http://sun.com/postgresql
"Ryan Bradetich" <rbradetich@gmail.com> wrote: > Here is a proof-of-concept patch for reducing the alignment > requirement for heap tuples on 64-bit systems. > > pg_depend 312 kB 296 kB > pg_description 160 kB 152 kB 5 percent of gain seems reasonable for me. Is it possible to apply your improvement for indexes? I think there are more benefits for small index entries that size are often 12 or 20 bytes. > This would completely eliminate the impact for this > patch on 32-bit systems, which would not gain any benefit from this patch. No! There are *also* benefits even on 32-bit systems, because some of them have 8-byte alignment. (for example, 32-bit Windows) BTW, there might be a small mistabke on the following lines in patch. They do the same things ;-) *************** *** 1019,1025 **** toast_flatten_tuple_attribute(Datum value, new_len += BITMAPLEN(numAttrs); if (olddata->t_infomask& HEAP_HASOID) new_len += sizeof(Oid); ! new_len = MAXALIGN(new_len); Assert(new_len == olddata->t_hoff); new_data_len = heap_compute_data_size(tupleDesc, toast_values, toast_isnull); --- 1025,1034 ---- new_len += BITMAPLEN(numAttrs); if (olddata->t_infomask & HEAP_HASOID) new_len +=sizeof(Oid); ! if (olddata->t_infomask & HEAP_INTALIGN) ! new_len = MAXALIGN(new_len); ! else ! new_len = MAXALIGN(new_len); Assert(new_len == olddata->t_hoff); new_data_len = heap_compute_data_size(tupleDesc, toast_values, toast_isnull); Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
Hello Zdenek, On Thu, Oct 9, 2008 at 12:38 AM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote: > Ryan Bradetich napsal(a): >> On Wed, Oct 8, 2008 at 10:59 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> >> wrote: >> I would like to see this patch (or some variant) go in if possible. >> Since the inplace >> upgrades a concern to you, is there anything I can do to help with the >> inplace >> upgrades to help offset the disruption this patch causes you? > > Yaah, wait until 8.5 :-). However, currently there is no clear consensus > which upgrade method is best. I hope that It will clear after Prato > developers meeting. Until this meeting I cannot say more. Heh, understood. :) I believe the proposed CRC patch also affects the heap page layout (adds the pd_checksum field to the PageHeaderData). [1] Just pointing out another patch that could affect you as well. My offer to help still stands. > I overlooked 'd' test. Your idea seems to me reasonable. Maybe, you could > test 'd' alignment only for NOT NULL values. Funny you should mention this. I had just started looking into this optimization to see if I could convince myself it would be safe. My initial conclusion indicates it would be safe, but I have not tested nor verified that yet. Having an independent proposal for this boosts my confidence even more. Thanks! > The problem there is add_item which it is used for indexes as well and they > has IndexTupleHeader structure. I'm not convenience about idea has two > different alignment for items on page. Just to clarify, this patch only affects heap storage when (i.e. the is_heap flag is set). I have not had a chance to analyze or see if I can reduce other storage types yet. > I guess another problem is with MAX_TUPLE_CHUNK_SIZE which uses MAXALIGN for > computing. It seems to me that toast chunk could waste a space anyway. > > And of course you should bump page layout version. Thanks. I will do. > I also suggest create function/macro to compute hoff and replace code with > this function/macro. Great. That is some of the feedback I was looking for. I did not implement it yet, because I wanted to see if the basic implementation was feasible first. Thanks again for your feedback! - Ryan [1] http://archives.postgresql.org/pgsql-hackers/2008-10/msg00070.php
Hello ITAGAKI-san, I apologize for not replying earlier, I needed to head out to work. On Thu, Oct 9, 2008 at 5:01 AM, ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote: > "Ryan Bradetich" <rbradetich@gmail.com> wrote: >> Here is a proof-of-concept patch for reducing the alignment >> requirement for heap tuples on 64-bit systems. >> >> pg_depend 312 kB 296 kB >> pg_description 160 kB 152 kB > > 5 percent of gain seems reasonable for me. I agree. I am seeing ~10% gain in some of my tables where the tuple size was 44 bytes but due to the alignment was being padded out to 48 bytes. > Is it possible to apply your improvement for indexes? > I think there are more benefits for small index entries > that size are often 12 or 20 bytes. I am not sure if this improvement will affect indexes or not yet. My current implementation specifically excludes indexes and only affects heap tuples. Now that I have a better understanding of the disk structure for heap tuples, I am planning to look at the index layout in the near future to see if there is some possible gain there as well. >> This would completely eliminate the impact for this >> patch on 32-bit systems, which would not gain any benefit from this patch. > No! There are *also* benefits even on 32-bit systems, because some > of them have 8-byte alignment. (for example, 32-bit Windows) Excellent! I was not aware of that. Thanks for this information. Any ideas on what I should use for this check? I was thinking of using #if SIZEOF_SIZE_T = 8 Guess I should search around for standard solution for this problem. > BTW, there might be a small mistabke on the following lines in patch. > They do the same things ;-) > > *************** > *** 1019,1025 **** toast_flatten_tuple_attribute(Datum value, > new_len += BITMAPLEN(numAttrs); > if (olddata->t_infomask & HEAP_HASOID) > new_len += sizeof(Oid); > ! new_len = MAXALIGN(new_len); > Assert(new_len == olddata->t_hoff); > new_data_len = heap_compute_data_size(tupleDesc, > toast_values, toast_isnull); > --- 1025,1034 ---- > new_len += BITMAPLEN(numAttrs); > if (olddata->t_infomask & HEAP_HASOID) > new_len += sizeof(Oid); > ! if (olddata->t_infomask & HEAP_INTALIGN) > ! new_len = MAXALIGN(new_len); > ! else > ! new_len = MAXALIGN(new_len); > Assert(new_len == olddata->t_hoff); > new_data_len = heap_compute_data_size(tupleDesc, > toast_values, toast_isnull); Thanks for that catch! I have this fixed in my local GIT tree now. Thanks for your feedback and review! - Ryan
Added to TODO: Reduce data row alignment requirements on some 64-bit systems * http://archives.postgresql.org/pgsql-hackers/2008-10/msg00369.php --------------------------------------------------------------------------- Ryan Bradetich wrote: > Hello all, > > Here is a proof-of-concept patch for reducing the alignment > requirement for heap tuples on 64-bit systems. > This patch passes the regression tests and a couple of other data sets > I have thrown at it. > > I am hoping to get some early feedback on this patch so I have time to > make any corrections, improvements, > etc before the November commit-fest deadline. > > There are two visible savings in the system catalogs using this patch > (size was gathered using \S+): > > Catalog Pre-Patch Size After Patch Size > > pg_depend 312 kB 296 kB > pg_description 160 kB 152 kB > > > One thing I am considering, but I am not sure if it is worth the code > uglification, is to wrap the majority of these > checks in #ifdefs so they only are used on 64-bit systems. This > would completely eliminate the impact for this > patch on 32-bit systems, which would not gain any benefit from this patch. > > Feedback welcome! > > Thanks, > > - Ryan [ Attachment, skipping... ] > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce, Thanks for adding this to the TODO. I am planning on continuing to work on this patch after 8.4 releases. I know we are in feature freeze and I do not want to sidetrack the release process. Thanks! - Ryan On Thu, Jan 8, 2009 at 8:41 AM, Bruce Momjian <bruce@momjian.us> wrote: > > Added to TODO: > > Reduce data row alignment requirements on some 64-bit systems > > * http://archives.postgresql.org/pgsql-hackers/2008-10/msg00369.php > > > --------------------------------------------------------------------------- > > Ryan Bradetich wrote: >> Hello all, >> >> Here is a proof-of-concept patch for reducing the alignment >> requirement for heap tuples on 64-bit systems. >> This patch passes the regression tests and a couple of other data sets >> I have thrown at it. >> >> I am hoping to get some early feedback on this patch so I have time to >> make any corrections, improvements, >> etc before the November commit-fest deadline. >> >> There are two visible savings in the system catalogs using this patch >> (size was gathered using \S+): >> >> Catalog Pre-Patch Size After Patch Size >> >> pg_depend 312 kB 296 kB >> pg_description 160 kB 152 kB >> >> >> One thing I am considering, but I am not sure if it is worth the code >> uglification, is to wrap the majority of these >> checks in #ifdefs so they only are used on 64-bit systems. This >> would completely eliminate the impact for this >> patch on 32-bit systems, which would not gain any benefit from this patch. >> >> Feedback welcome! >> >> Thanks, >> >> - Ryan > > [ Attachment, skipping... ] > >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers > > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + If your life is a hard drive, Christ can be your backup. + >
Ryan Bradetich wrote: > Bruce, > > Thanks for adding this to the TODO. > I am planning on continuing to work on this patch after 8.4 releases. > I know we are in feature freeze and I do not want to sidetrack the > release process. Great. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +