Re: Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Performance Improvement by reducing WAL for Update Operation |
Date | |
Msg-id | 001f01ce1c14$d3af0770$7b0d1650$@kapila@huawei.com |
In response to | Re: Performance Improvement by reducing WAL for Update Operation (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses |
Re: Performance Improvement by reducing WAL for Update Operation
|
List | pgsql-hackers |
On Wednesday, March 06, 2013 2:57 AM Heikki Linnakangas wrote:
> On 04.03.2013 06:39, Amit Kapila wrote:
> > On Sunday, March 03, 2013 8:19 PM Craig Ringer wrote:
> >> On 02/05/2013 11:53 PM, Amit Kapila wrote:
> >>>> Performance data for the patch is attached with this mail.
> >>>> Conclusions from the readings (these are same as my previous patch):
> >>>>
>
> I've been investigating the pglz option further, and doing performance comparisons of the pglz approach and this patch. I'll begin with some numbers:
>
> unpatched (63d283ecd0bc5078594a64dfbae29276072cdf45):
>
> | testname | wal_generated | duration |
> |---|---|---|
> | two short fields, no change | 1245525360 | 9.94613695144653 |
> | two short fields, one changed | 1245536528 | 10.146910905838 |
> | two short fields, both changed | 1245523160 | 11.2332470417023 |
> | one short and one long field, no change | 1054926504 | 5.90477800369263 |
> | ten tiny fields, all changed | 1411774608 | 13.4536008834839 |
> | hundred tiny fields, all changed | 635739680 | 7.57448387145996 |
> | hundred tiny fields, half changed | 636930560 | 7.56888699531555 |
> | hundred tiny fields, half nulled | 573751120 | 6.68991994857788 |
>
> Amit's wal_update_changes_v10.patch:
>
> | testname | wal_generated | duration |
> |---|---|---|
> | two short fields, no change | 1249722112 | 13.0558869838715 |
> | two short fields, one changed | 1246145408 | 12.9947438240051 |
> | two short fields, both changed | 1245951056 | 13.0262880325317 |
> | one short and one long field, no change | 678480664 | 5.70031690597534 |
> | ten tiny fields, all changed | 1328873920 | 20.0167419910431 |
> | hundred tiny fields, all changed | 638149416 | 14.4236788749695 |
> | hundred tiny fields, half changed | 635560504 | 14.8770561218262 |
> | hundred tiny fields, half nulled | 558468352 | 16.2437210083008 |
>
> pglz-with-micro-optimizations-1.patch:
>
> | testname | wal_generated | duration |
> |---|---|---|
> | two short fields, no change | 1245519008 | 11.6702048778534 |
> | two short fields, one changed | 1245756904 | 11.3233819007874 |
> | two short fields, both changed | 1249711088 | 11.6836447715759 |
> | one short and one long field, no change | 664741392 | 6.44810795783997 |
> | ten tiny fields, all changed | 1328085568 | 13.9679481983185 |
> | hundred tiny fields, all changed | 635974088 | 9.15514206886292 |
> | hundred tiny fields, half changed | 636309040 | 9.13769292831421 |
> | hundred tiny fields, half nulled | 496396448 | 8.77351498603821 |
>
> In each test, a table is created with a large number of identical rows, and fillfactor=50. Then a full-table UPDATE is performed, and the UPDATE is timed. Duration is the time spent in the UPDATE (lower is better), and wal_generated is the amount of WAL generated by the updates (lower is better).

Based on your patch, I have tried some more optimizations:

Fixed bug in your patch (pglz-with-micro-optimizations-2):
1. There were some problems in recovery because the wrong oldtuple length was passed to decode; I have corrected this.

Approach-1 (pglz-with-micro-optimizations-2_roll10_32):
1. Moved the strategy min length (32) check into log_heap_update.
2. Added rolling 10 for the hash, as you suggested.

Approach-2 (pglz-with-micro-optimizations-2_roll10_32_1hashkey):
1. This is done on top of the Approach-1 changes.
2. Used 1 byte of data as the hash key.

Approach-3 (pglz-with-micro-optimizations-2_roll10_32_1hashkey_batch_literal):
1. This is done on top of the Approach-1 and Approach-2 changes.
2. Instead of copying each literal byte as soon as it is found not to match the history, all such bytes are copied in a batch.
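To make the batch-literal idea of Approach-3 concrete, here is a rough toy sketch in Python. It is not the patch code and does not use the pglz output format: the function names, the `("lit", ...)` / `("copy", ...)` operations, and the `key_len` and `min_match` parameters are all made up for illustration. The old tuple plays the role of the history. Bytes of the new tuple that do not match it are not emitted one at a time; the encoder only remembers where the current literal run started and flushes the whole run in one copy when a match is found or the input ends.

```python
def build_index(history: bytes, key_len: int) -> dict:
    """Map each key_len-byte substring of history to its first offset."""
    index = {}
    for i in range(len(history) - key_len + 1):
        index.setdefault(history[i:i + key_len], i)
    return index


def delta_encode(history: bytes, new: bytes, key_len: int = 4, min_match: int = 4):
    """Toy delta encoder: returns a list of ("copy", offset, length) and
    ("lit", data) operations that rebuild `new` from `history`."""
    index = build_index(history, key_len)
    ops = []
    lit_start = 0  # start of the pending (not yet emitted) literal run
    pos = 0
    while pos < len(new):
        cand = index.get(new[pos:pos + key_len])
        length = 0
        if cand is not None:
            # Extend the match as far as both buffers agree.
            while (pos + length < len(new)
                   and cand + length < len(history)
                   and new[pos + length] == history[cand + length]):
                length += 1
        if length >= min_match:
            if lit_start < pos:
                # Flush all pending literal bytes with a single copy.
                ops.append(("lit", new[lit_start:pos]))
            ops.append(("copy", cand, length))
            pos += length
            lit_start = pos
        else:
            pos += 1  # just advance; the literal run keeps growing
    if lit_start < len(new):
        ops.append(("lit", new[lit_start:]))
    return ops


def delta_decode(history: bytes, ops) -> bytes:
    """Rebuild the new tuple from the history and the operation list."""
    out = bytearray()
    for op in ops:
        if op[0] == "lit":
            out += op[1]
        else:
            _, offset, length = op
            out += history[offset:offset + length]
    return bytes(out)
```

For example, encoding `b"id=42;name=alicia;city=paris"` against the history `b"id=42;name=alice;city=paris"` gives two copy operations with one two-byte literal run between them. Setting `key_len=1` loosely mimics the 1-byte hash key of Approach-2, although the real hashing in the patch is different.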
Data for all the above approaches is in the attached file "test_readings". (Apart from your tests, I have added one more test, "hundred tiny fields, first 10 changed".)

Summary:

After the Approach-1 changes, CPU utilization is the same or lower for all tests except two ("hundred tiny fields, all changed" and "hundred tiny fields, half changed"). The best-case CPU utilization has decreased (which is better), but WAL reduction has increased a little (which is as expected, due to the 10 consecutive roll-ups).

The Approach-2 modifications were done to see whether the hash calculation adds any overhead. Approach-2 and Approach-3 do not result in any improvement.

I have investigated the CPU utilization in those two tests. The reason is that there is nothing to compress in the new tuple, and the code finds that out only after it has processed 75% (the compression ratio) of the tuple bytes. I think any compression algorithm has this drawback: if the data is not compressible, it can consume time even though it will not be able to compress the data. I think most updates will update only some part of the tuple, which will always yield positive results.

Apart from the above tests, I have run your patch against my old tests, and it yields quite positive results: WAL reduction is greater than with my patch, and CPU utilization is almost the same, or my patch is slightly better. The results are in the attached file "pgbench_pg_lz_mod".

All of the above data is for synchronous_commit = off. I can also collect data for synchronous_commit = on and for recovery performance.

Any further suggestions?

With Regards,
Amit Kapila.
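The cost of incompressible tuples described in the summary can be illustrated with a small toy model in Python. It rests on assumptions that are not taken from the patch: the 75% figure is read as "give up once the output reaches 75% of the input size", every literal byte is assumed to cost one output byte, and every match a fixed `COPY_COST` of 3 bytes. The names `try_encode`, `need_rate` and `COPY_COST` are hypothetical.

```python
COPY_COST = 3  # assumed size of one encoded match, in bytes


def try_encode(history: bytes, new: bytes, need_rate: int = 25, key_len: int = 4):
    """Return (worthwhile, bytes_examined) for a toy delta encoder that
    gives up as soon as the output can no longer save need_rate percent."""
    limit = len(new) * (100 - need_rate) // 100
    # Iterate backwards so that the first occurrence of each key wins.
    index = {history[i:i + key_len]: i
             for i in range(len(history) - key_len, -1, -1)}
    out_size = 0
    pos = 0
    while pos < len(new):
        if out_size >= limit:
            # Too late to reach the required saving: stop here.
            return False, pos
        cand = index.get(new[pos:pos + key_len])
        length = 0
        if cand is not None:
            while (pos + length < len(new)
                   and cand + length < len(history)
                   and new[pos + length] == history[cand + length]):
                length += 1
        if length >= key_len:
            out_size += COPY_COST
            pos += length
        else:
            out_size += 1  # literal byte
            pos += 1
    return out_size < limit, pos
```

Under this model, a 100-byte new tuple that shares nothing with the old one is abandoned only after 75 bytes have been examined, while an identical tuple is accepted after a single match. That is the shape of the overhead seen in the "all changed" and "half changed" tests; the real numbers depend on the actual encoding.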