Re: Optimizing pglz compressor - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: Optimizing pglz compressor |
| Date | |
| Msg-id | 01fa01ce7272$5b359700$11a0c500$@kapila@huawei.com |
| In response to | Re: Optimizing pglz compressor (Heikki Linnakangas <hlinnakangas@vmware.com>) |
| Responses | Re: Optimizing pglz compressor |
| List | pgsql-hackers |
On Wednesday, June 26, 2013 2:15 AM Heikki Linnakangas wrote:
> On 19.06.2013 14:01, Amit Kapila wrote:
> > Observations
> > --------------
> > 1. For small data, performance is always good with the patch.
> > 2. For random small/large data, performance is good.
> > 3. For medium and large text and same-byte data (3K, 5K text;
> > 10K, 100K, 500K same byte), performance is degraded.
>
> Wow, that's strange. What platform and CPU did you test on?
CPU - 4 core
RAM - 24GB
OS - SUSE 11 SP2
Kernel version - 3.0.13
> Are you
> sure you used the same compiler flags with and without the patch?
Yes.
> Can you also try the attached patch, please? It's the same as before,
> but in this version, I didn't replace the prev and next pointers in
> PGLZ_HistEntry struct with int16s. That avoids some table lookups, at
> the expense of using more memory. It's closer to what we have without
> the patch, so maybe that helps on your system.
Yes, it helped a lot on my system.
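To make the trade-off quoted above concrete, here is a minimal sketch (not code from either patch) of the two ways a history entry can link to its neighbour. The struct names, the helper functions, and the HIST_SIZE value are hypothetical and chosen only for illustration; the only thing taken from the discussion is the idea of pointers versus int16 indexes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_SIZE 4096          /* hypothetical number of history entries */

/* Pointer-linked entry: the neighbour is reached directly. */
typedef struct PtrHistEntry
{
    struct PtrHistEntry *next;
    struct PtrHistEntry *prev;
    const char *pos;            /* input position this entry refers to */
} PtrHistEntry;

/* Index-linked entry: the neighbour is an int16 offset into an array. */
typedef struct IdxHistEntry
{
    int16_t     next;
    int16_t     prev;
    const char *pos;
} IdxHistEntry;

static PtrHistEntry ptr_entries[HIST_SIZE];
static IdxHistEntry idx_entries[HIST_SIZE];

/* Follow the stored pointer; no table lookup is needed. */
static const char *
ptr_next_pos(const PtrHistEntry *e)
{
    return e->next->pos;
}

/* Read the index, then look the entry up in the table. */
static const char *
idx_next_pos(const IdxHistEntry *e)
{
    assert(e->next >= 0 && e->next < HIST_SIZE);
    return idx_entries[e->next].pos;
}
```

The index-linked entry is smaller on typical platforms, but every step along a chain goes through an extra array lookup, which is the cost the pointer version avoids.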
Head:
     testname      |   auto
-------------------+-----------
 5k text           |  3499.888
 512b text         |  1425.106
 256b text         |  1769.126
 1K text           |  1378.151
 3K text           |  4081.254
 2k random         |  -410.928
 100k random       |   -10.235
 500k random       |    -2.094
 512b random       |  -770.665
 256b random       | -1120.173
 1K random         |  -570.351
 10k of same byte  |  3602.610
 100k of same byte | 36187.863
 500k of same byte | 26055.472
After your Patch (pglz-variable-size-hash-table-2.patch)
     testname      |   auto
-------------------+-----------
 5k text           |  3524.306
 512b text         |   954.962
 256b text         |   832.527
 1K text           |  1273.970
 3K text           |  3963.329
 2k random         |  -300.516
 100k random       |    -7.538
 500k random       |    -1.525
 512b random       |  -439.726
 256b random       |  -440.154
 1K random         |  -391.070
 10k of same byte  |  3570.921
 100k of same byte | 37498.502
 500k of same byte | 26904.426
There was a minor problem in your patch: it crashed in one of the experiments.
The fix is not to access the 0th history entry in function pglz_find_match();
a modified patch is attached.
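As a rough illustration of this kind of fix (a sketch only, not the attached patch), assume slot 0 of the history array is reserved as an "end of chain" marker. A chain walk then has to stop when it reaches index 0 rather than read that slot. INVALID_ENTRY, HistEntry and chain_length() are hypothetical names used only for this example.

```c
#include <stddef.h>

#define INVALID_ENTRY 0         /* hypothetical: slot 0 means "no entry" */

typedef struct HistEntry
{
    int         next;           /* index of next entry, or INVALID_ENTRY */
    const char *pos;            /* input position this entry refers to */
} HistEntry;

/*
 * Count the entries on the chain starting at index 'start'.
 * The loop condition is checked before entries[i] is touched,
 * so slot 0 is never read.
 */
static int
chain_length(const HistEntry *entries, int start)
{
    int         n = 0;

    for (int i = start; i != INVALID_ENTRY; i = entries[i].next)
        n++;

    return n;
}
```

With this shape, whatever happens to be stored in slot 0 cannot affect the walk, because the sentinel index is tested before any dereference.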
After the fix, the readings are almost the same:
     testname      |   auto
-------------------+-----------
 5k text           |  3347.961
 512b text         |   938.442
 256b text         |   834.496
 1K text           |  1243.618
 3K text           |  3790.835
 2k random         |  -306.470
 100k random       |    -7.589
 500k random       |    -1.517
 512b random       |  -442.649
 256b random       |  -438.781
 1K random         |  -392.106
 10k of same byte  |  3565.449
 100k of same byte | 37355.595
 500k of same byte | 26776.076
I guess some of the difference might be due to the different way of accessing
entries in pglz_hist_add().
With Regards,
Amit Kapila.