Thread: Abbreviated keys for text cost model fix
Over in the abbreviated keys for numeric thread, Tomas Vondra reports
a case where the ad-hoc cost model of abbreviated keys/sortsupport for
text is far too conservative, and aborts a case where abbreviated keys
can still greatly help.

Previously, I proposed additions to the cost model that dealt with all
this, by passing down a reltuples hint from the optimizer or a
relcache entry to tuplesort. Robert seemed to think that this was a
little bit much, and that it could be discussed at a later point. It's
clear that we need to do something about cases like the one reported
by Tomas, though.

Attached patch adds a decay to the threshold that (estimated)
abbreviated key cardinality must cross as a proportion of the
(estimated) cardinality of the original set of strings to be sorted,
while also quadrupling the initial required proportional cardinality
to 20% of full key cardinality (but for just the first 10,000 strings,
before this threshold is decayed aggressively). This seems
significantly less invasive than what I previously proposed, while
still being effective. I suggest that this be treated as a bugfix,
since Tomas reports a 5%-7% regression with his case. OTOH, with this
patch, I saw the same case have a ~300% improvement in performance.

The adjusted cost model strongly prefers to decide to abort early,
which seems desirable. In one way this makes it a little more
vulnerable to skewed sets, since early data matters more, but in
another more important way it's less vulnerable, since perhaps that
skew indicates a general skew in the data, a skew that makes the idea
of measuring the cardinality of abbreviated keys as a proxy for full
key cardinality less useful.

Thoughts?

--
Peter Geoghegan
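The shape of the heuristic described above is roughly the following. This is a
hedged sketch only: the function, struct, and field names are assumed for
illustration, and the 0.65 decay factor is inferred from the DEBUG_ABBREV_KEYS
output posted later in this thread (prop_card steps 0.200000 -> 0.130000 ->
0.084500 -> 0.054925); it is not the code from the attached patch.

    /*
     * Hedged sketch of the decayed-threshold abort logic described above.
     * Names are illustrative only; this is not the attached patch's code.
     */
    #include <stdbool.h>

    typedef struct TextSortCostModel
    {
        double      prop_card;      /* required abbreviated-key cardinality as
                                     * a proportion of full-key cardinality;
                                     * starts at 0.20 */
    } TextSortCostModel;

    /* Called at doubling intervals (160, 320, 640, ...) while copying tuples. */
    bool
    abbrev_should_abort(TextSortCostModel *cm, int memtupcount,
                        double abbrev_distinct, double key_distinct)
    {
        /*
         * Keep abbreviating while the estimated abbreviated-key cardinality
         * clears prop_card of the estimated full-key cardinality.
         */
        if (abbrev_distinct > key_distinct * cm->prop_card)
        {
            /*
             * Past the first 10,000 tuples, decay the requirement
             * aggressively, so that a later abort becomes increasingly
             * unlikely.
             */
            if (memtupcount > 10000)
                cm->prop_card *= 0.65;  /* assumed decay factor */
            return false;               /* keep using abbreviated keys */
        }

        return true;                    /* cardinality too low: abort */
    }

Read this way, the first check (at up to 10,000 tuples) demands 20% of full-key
cardinality, and each subsequent check lowers the bar by roughly a third, which
is why the decision to abort is effectively made early or not at all.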
Attachment
Hi,

On 22.2.2015 02:59, Peter Geoghegan wrote:
> Over in the abbreviated keys for numeric thread, Tomas Vondra reports
> a case where the ad-hoc cost model of abbreviated keys/sortsupport for
> text is far too conservative, and aborts a case where abbreviated keys
> can still greatly help.
>
> Previously, I proposed additions to the cost model that dealt with all
> this, by passing down a reltuples hint from the optimizer or a
> relcache entry to tuplesort. Robert seemed to think that this was a
> little bit much, and that it could be discussed at a later point. It's
> clear that we need to do something about cases like the one reported
> by Tomas, though.
>
> Attached patch adds a decay to the threshold that (estimated)
> abbreviated key cardinality must cross as a proportion of the
> (estimated) cardinality of the original set of strings to be sorted,
> while also quadrupling the initial required proportional cardinality
> to 20% of full key cardinality (but for just the first 10,000 strings,
> before this threshold is decayed aggressively). This seems
> significantly less invasive than what I previously proposed, while
> still being effective. I suggest that this be treated as a bugfix,
> since Tomas reports a 5%-7% regression with his case. OTOH, with this
> patch, I saw the same case have a ~300% improvement in performance.
>
> The adjusted cost model strongly prefers to decide to abort early,
> which seems desirable. In one way this makes it a little more
> vulnerable to skewed sets, since early data matters more, but in
> another more important way it's less vulnerable, since perhaps that
> skew indicates a general skew in the data, a skew that makes the idea
> of measuring the cardinality of abbreviated keys as a proxy for full
> key cardinality less useful.
>
> Thoughts?

I've done some more testing, and this helps a lot. I did the tests on
two different machines (one with a Xeon E5450, one with an i5-2500k), to
see how the patches (this and the datum/numeric ones) behave on
different CPUs. I also extended the tests a bit, to cover random data
(as before) and data already sorted in ASC/DESC order. And of course, I
did that with different dataset sizes.

The aggregated results are available as a spreadsheet here:

    Xeon E5450: http://bit.ly/1AyRN7M
    i5-2500k:   http://bit.ly/1z9dLdP

You probably only want to look at the 'summary' sheet, which shows
speedup vs. master. Detailed results along with benchmarking scripts
and logs attached.

In short, this fixes all the cases except for the ASC sorted data. I
haven't done any code review, but I think we want this.

I'll use data from the i5-2500k, but it applies to the Xeon too, except
that the Xeon results are more noisy and the speedups are not that
significant.

For the 'text' data type, and 'random' dataset, the results are these:

    scale      datum   cost-model
    -------------------------------
    100000      328%         323%
    1000000     392%         391%
    2000000      96%         565%
    3000000      97%         572%
    4000000      97%         571%
    5000000      98%         570%

The numbers are speedup vs. master, so 100% means exactly the same
speed, 200% means twice as fast.

So while with the 'datum' patch this actually caused a very nice speedup
for small datasets - about 3-4x speedup up to 1M rows - for larger
datasets we've seen a small regression (~3% slower). With the cost model
fix, we actually see a significant speedup (about 5.7x) for these cases.

I haven't verified whether this produces the same results, but if it
does this is very nice.

For the 'DESC' dataset (i.e. data sorted in reverse order), we get even
better numbers, with up to 6.5x speedup on large datasets.

But for the 'ASC' dataset (i.e. already sorted data), we get this:

    scale      datum   cost-model
    -------------------------------
    100000       85%          84%
    1000000      87%          87%
    2000000      76%          96%
    3000000      82%          90%
    4000000      91%          83%
    5000000      93%          81%

Ummm, not that great, I guess :-(

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Sun, Feb 22, 2015 at 1:19 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> In short, this fixes all the cases except for the ASC sorted data. I
> haven't done any code review, but I think we want this.
>
> I'll use data from the i5-2500k, but it applies to the Xeon too, except
> that the Xeon results are more noisy and the speedups are not that
> significant.
>
> For the 'text' data type, and 'random' dataset, the results are these:
>
>     scale      datum   cost-model
>     -------------------------------
>     100000      328%         323%
>     1000000     392%         391%
>     2000000      96%         565%
>     3000000      97%         572%
>     4000000      97%         571%
>     5000000      98%         570%
>
> The numbers are speedup vs. master, so 100% means exactly the same
> speed, 200% means twice as fast.
>
> So while with the 'datum' patch this actually caused a very nice speedup
> for small datasets - about 3-4x speedup up to 1M rows - for larger
> datasets we've seen a small regression (~3% slower). With the cost model
> fix, we actually see a significant speedup (about 5.7x) for these cases.

Cool.

> I haven't verified whether this produces the same results, but if it
> does this is very nice.
>
> For the 'DESC' dataset (i.e. data sorted in reverse order), we get even
> better numbers, with up to 6.5x speedup on large datasets.
>
> But for the 'ASC' dataset (i.e. already sorted data), we get this:
>
>     scale      datum   cost-model
>     -------------------------------
>     100000       85%          84%
>     1000000      87%          87%
>     2000000      76%          96%
>     3000000      82%          90%
>     4000000      91%          83%
>     5000000      93%          81%
>
> Ummm, not that great, I guess :-(

You should try it with the data fully sorted like this, but with one
tiny difference: The very last tuple is out of order. How does that
look?

--
Peter Geoghegan
On Sun, Feb 22, 2015 at 1:30 PM, Peter Geoghegan <pg@heroku.com> wrote:
> You should try it with the data fully sorted like this, but with one
> tiny difference: The very last tuple is out of order. How does that
> look?

Another thing that may be of particular interest to you as a Czech
person is how various locales perform. I believe that the collation
rules of Czech and Hungarian are particularly complex, with several
passes often required for strcoll() (I think maybe more than 4). You
should either measure that, or control for it.

I was prepared to attribute the differences in the two systems to
differences in compute bandwidth between the CPUs involved (which are
perhaps more noticeable now, since memory latency is less of a
bottleneck). However, the inconsistent use of collations could actually
matter here.

--
Peter Geoghegan
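One way to measure the raw strcoll() cost of a particular collation outside
the server is a standalone loop along these lines. This is a hedged sketch:
the locale name, the sample strings, and the iteration count are assumptions,
and the locale must actually be installed on the test machine.

    /*
     * Hedged sketch: micro-benchmark strcoll() under an explicit collation.
     * The locale name is an assumption and must be installed (see locale -a).
     */
    #include <locale.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    int
    main(void)
    {
        /* In Czech, "ch" collates as its own letter, after "h". */
        const char *pairs[][2] = {
            {"chata", "cesta"},
            {"hrad", "chata"},
        };

        if (setlocale(LC_COLLATE, "cs_CZ.UTF-8") == NULL)
        {
            fprintf(stderr, "locale cs_CZ.UTF-8 not installed\n");
            return 1;
        }

        clock_t     start = clock();
        long        sink = 0;

        for (int i = 0; i < 1000000; i++)
            for (size_t j = 0; j < sizeof(pairs) / sizeof(pairs[0]); j++)
                sink += strcoll(pairs[j][0], pairs[j][1]) > 0;

        printf("sink=%ld, elapsed %.3f s\n", sink,
               (double) (clock() - start) / CLOCKS_PER_SEC);
        return 0;
    }

Swapping the locale for en_US.UTF-8 (or running the same ORDER BY under both
collations in the server) would show whether collation cost explains any of
the cross-machine variation.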
On 23.2.2015 00:16, Peter Geoghegan wrote:
> On Sun, Feb 22, 2015 at 1:30 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> You should try it with the data fully sorted like this, but with one
>> tiny difference: The very last tuple is out of order. How does that
>> look?

I'm running that test now, I'll post the results tomorrow.

> Another thing that may be of particular interest to you as a Czech
> person is how various locales perform. I believe that the collation
> rules of Czech and Hungarian are particularly complex, with several
> passes often required for strcoll() (I think maybe more than 4). You
> should either measure that, or control for it. I was prepared to
> attribute the differences in the two systems to differences in compute
> bandwidth between the CPUs involved (which are perhaps more noticeable
> now, since memory latency is less of a bottleneck). However, the
> inconsistent use of collations could actually matter here.

That's a good point, but all the tests were executed with en_US.UTF-8.

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

On 22.2.2015 22:30, Peter Geoghegan wrote:
> On Sun, Feb 22, 2015 at 1:19 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> In short, this fixes all the cases except for the ASC sorted data. I
>> haven't done any code review, but I think we want this.
>>
>> I'll use data from the i5-2500k, but it applies to the Xeon too, except
>> that the Xeon results are more noisy and the speedups are not that
>> significant.
>>
>> For the 'text' data type, and 'random' dataset, the results are these:
>>
>>     scale      datum   cost-model
>>     -------------------------------
>>     100000      328%         323%
>>     1000000     392%         391%
>>     2000000      96%         565%
>>     3000000      97%         572%
>>     4000000      97%         571%
>>     5000000      98%         570%
>>
>> The numbers are speedup vs. master, so 100% means exactly the same
>> speed, 200% means twice as fast.
>>
>> So while with the 'datum' patch this actually caused a very nice speedup
>> for small datasets - about 3-4x speedup up to 1M rows - for larger
>> datasets we've seen a small regression (~3% slower). With the cost model
>> fix, we actually see a significant speedup (about 5.7x) for these cases.
>
> Cool.
>
>> I haven't verified whether this produces the same results, but if it
>> does this is very nice.
>>
>> For the 'DESC' dataset (i.e. data sorted in reverse order), we get even
>> better numbers, with up to 6.5x speedup on large datasets.
>>
>> But for the 'ASC' dataset (i.e. already sorted data), we get this:
>>
>>     scale      datum   cost-model
>>     -------------------------------
>>     100000       85%          84%
>>     1000000      87%          87%
>>     2000000      76%          96%
>>     3000000      82%          90%
>>     4000000      91%          83%
>>     5000000      93%          81%
>>
>> Ummm, not that great, I guess :-(
>
> You should try it with the data fully sorted like this, but with one
> tiny difference: The very last tuple is out of order. How does that
> look?

So here are the results for the ASC-ordered dataset, with one 'unsorted'
row added to the end of the dataset. As before, the complete scripts are
attached, and the raw results are available in a spreadsheet:

    http://bit.ly/18g1nTU

The durations are much higher than without the single unsorted row added
at the end. Queries often take 20x longer to finish (on the same code),
depending on the scale. The speedup results (compared to master) look
like this:

    scale      query#   datum   numeric   cost model
    -------------------------------------------------
    100000     1         859%     861%      856%
    100000     2         811%     814%      805%
    100000     3         100%     100%       97%
    1000000    1         805%     804%      807%
    1000000    2         769%     773%      770%
    1000000    3         100%     100%       98%
    2000000    1          97%      97%      673%
    2000000    2          96%      97%      646%
    2000000    3          99%     101%      678%
    3000000    1          98%      98%      578%
    3000000    2          96%      97%      557%
    3000000    3          99%     101%      579%
    4000000    1          99%      99%      513%
    4000000    2          97%      98%      497%
    4000000    3          99%     101%      510%
    5000000    1          99%      99%      469%
    5000000    2          97%      98%      456%
    5000000    3          99%     101%      466%

What's interesting here is that some queries are much faster, but query
#3 is slow until we hit 2M rows:

    select * from (select * from stuff_int_desc order by randint
                   offset 100000000000) foo

Looking at the previous tests, I see this is exactly what's happening to
this query with the 'random' dataset - it's slightly slower than master
up until 2M rows, when it suddenly jumps to the same speedup as the
other queries. Can we do something about that?

Anyway, I'm wondering what conclusions we can draw from this. I believe
the vast majority of datasets in production won't be perfectly sorted,
because when the table is CLUSTERed by an index we tend to use an index
scan to do the sort (so no problem), or the data are not actually
perfectly sorted (and here we get a significant speedup).

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Mon, Feb 23, 2015 at 8:40 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> The durations are much higher than without the single unsorted row added
> at the end. Queries often take 20x longer to finish (on the same code),
> depending on the scale.

I knew this would happen. :-)

> What's interesting here is that some queries are much faster, but query
> #3 is slow until we hit 2M rows:
>
>     select * from (select * from stuff_int_desc order by randint
>                    offset 100000000000) foo

Not sure why that would be. Can you see what a "#define
DEBUG_ABBREV_KEYS" (within varlena.c) build shows happens with the cost
model? Enable debug1 output for that.

> Looking at the previous tests, I see this is exactly what's happening to
> this query with the 'random' dataset - it's slightly slower than master
> up until 2M rows, when it suddenly jumps to the same speedup as the
> other queries. Can we do something about that?

Possibly.

> Anyway, I'm wondering what conclusions we can draw from this. I believe
> the vast majority of datasets in production won't be perfectly sorted,
> because when the table is CLUSTERed by an index we tend to use an index
> scan to do the sort (so no problem), or the data are not actually
> perfectly sorted (and here we get a significant speedup).

One conclusion might be that commit a3f0b3d6 should not have added the
pre-check for perfectly sorted input, which is not in the Bentley &
McIlroy original - exploiting the fact that the set is already partially
sorted is a fine thing, but we're not doing the necessary bookkeeping.

But that's not really why I suggested that test case. Rather, I wanted
to put the possible regression in the event of perfectly sorted input in
perspective. Maybe that regression isn't worth worrying about, if in the
event of a single tuple being out of order at the end of the set the
outcome dramatically shifts to strongly favor abbreviation.

Another way of looking at it would be that the pre-sorted case is an
argument against strxfrm() sorting in general more than abbreviated keys
in particular, an argument that seems new and strange to me. The
analysis of algorithms focuses on the worst case and average case for a
reason, and (say) the glibc documentation never considers this when
discussing strxfrm().

--
Peter Geoghegan
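For context, the pre-sorted pre-check being discussed is conceptually just a
guarded linear scan in front of the real sort. The following is a hedged
sketch of that general idea under assumed names; it is not the actual code
added by commit a3f0b3d6.

    /*
     * Hedged sketch of a "skip the sort if the input is already sorted"
     * pre-check.  Illustration only; not the code added by commit a3f0b3d6.
     */
    #include <stdbool.h>
    #include <stdlib.h>

    typedef int (*cmp_fn) (const void *a, const void *b);

    static bool
    already_sorted(const char *base, size_t n, size_t elem_size, cmp_fn cmp)
    {
        /* One extra comparison per element, wasted unless the check passes. */
        for (size_t i = 1; i < n; i++)
            if (cmp(base + i * elem_size, base + (i - 1) * elem_size) < 0)
                return false;   /* first out-of-order element: give up */
        return true;
    }

    void
    sort_with_precheck(void *base, size_t n, size_t elem_size, cmp_fn cmp)
    {
        /* Perfectly sorted input is detected in O(n) and returned as-is. */
        if (n < 2 || already_sorted(base, n, elem_size, cmp))
            return;

        /*
         * A single out-of-order tuple, e.g. at the very end, defeats the
         * pre-check, and the full sort runs anyway; that is why the test
         * case above shifts the numbers so dramatically.
         */
        qsort(base, n, elem_size, cmp);
    }

This also illustrates the "no bookkeeping" point: the pre-check either wins
completely or learns nothing that the subsequent sort can reuse.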
On 23.2.2015 19:22, Peter Geoghegan wrote:
> On Mon, Feb 23, 2015 at 8:40 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> The durations are much higher than without the single unsorted row added
>> at the end. Queries often take 20x longer to finish (on the same code),
>> depending on the scale.
>
> I knew this would happen. :-)
>
>> What's interesting here is that some queries are much faster, but query
>> #3 is slow until we hit 2M rows:
>>
>>     select * from (select * from stuff_int_desc order by randint
>>                    offset 100000000000) foo
>
> Not sure why that would be. Can you see what a "#define
> DEBUG_ABBREV_KEYS" (within varlena.c) build shows happens with the
> cost model? Enable debug1 output for that.

test=# select * from (select * from stuff_text_asc order by randtxt offset 100000000000) foo;
DEBUG:  abbrev_distinct after 160: 152.862464 (key_distinct: 155.187096, norm_abbrev_card: 0.955390, prop_card: 0.200000)
DEBUG:  abbrev_distinct after 320: 314.784336 (key_distinct: 313.425344, norm_abbrev_card: 0.983701, prop_card: 0.200000)
DEBUG:  abbrev_distinct after 640: 629.050490 (key_distinct: 643.945298, norm_abbrev_card: 0.982891, prop_card: 0.200000)
DEBUG:  abbrev_distinct after 1280: 1194.271440 (key_distinct: 1257.153875, norm_abbrev_card: 0.933025, prop_card: 0.200000)
DEBUG:  abbrev_distinct after 2560: 2402.820431 (key_distinct: 2631.772790, norm_abbrev_card: 0.938602, prop_card: 0.200000)
DEBUG:  abbrev_distinct after 5120: 4490.142750 (key_distinct: 5261.865309, norm_abbrev_card: 0.876981, prop_card: 0.200000)
DEBUG:  abbrev_distinct after 10240: 8000.884171 (key_distinct: 10123.941172, norm_abbrev_card: 0.781336, prop_card: 0.200000)
DEBUG:  abbrev_distinct after 20480: 13596.546899 (key_distinct: 20563.364696, norm_abbrev_card: 0.663894, prop_card: 0.130000)
DEBUG:  abbrev_distinct after 40960: 23369.899021 (key_distinct: 40500.600680, norm_abbrev_card: 0.570554, prop_card: 0.084500)
DEBUG:  abbrev_distinct after 81920: 30884.544030 (key_distinct: 81466.848436, norm_abbrev_card: 0.377009, prop_card: 0.054925)
 randtxt
---------
(0 rows)

>> Looking at the previous tests, I see this is exactly what's happening to
>> this query with the 'random' dataset - it's slightly slower than master
>> up until 2M rows, when it suddenly jumps to the same speedup as the
>> other queries. Can we do something about that?
>
> Possibly.
>
>> Anyway, I'm wondering what conclusions we can draw from this. I believe
>> the vast majority of datasets in production won't be perfectly sorted,
>> because when the table is CLUSTERed by an index we tend to use an index
>> scan to do the sort (so no problem), or the data are not actually
>> perfectly sorted (and here we get a significant speedup).
>
> One conclusion might be that commit a3f0b3d6 should not have added the
> pre-check for perfectly sorted input, which is not in the Bentley &
> McIlroy original - exploiting the fact that the set is already partially
> sorted is a fine thing, but we're not doing the necessary bookkeeping.

Would that pre-check hurt even the random test case? I can imagine it
might behave strangely with almost perfectly sorted data (when only the
very last row is unsorted), but not for random data (but see below).

> But that's not really why I suggested that test case. Rather, I wanted
> to put the possible regression in the event of perfectly sorted input in
> perspective. Maybe that regression isn't worth worrying about, if in the
> event of a single tuple being out of order at the end of the set the
> outcome dramatically shifts to strongly favor abbreviation.
>
> Another way of looking at it would be that the pre-sorted case is an
> argument against strxfrm() sorting in general more than abbreviated keys
> in particular, an argument that seems new and strange to me. The
> analysis of algorithms focuses on the worst case and average case for a
> reason, and (say) the glibc documentation never considers this when
> discussing strxfrm().

Yeah, every algorithm has a worst case - either you get very fast
execution on average with a few rare cases where it's much slower, or
you get very bad average performance.

But after looking at the data a bit more, I actually think this is a red
herring. After looking at the actual runtimes (and not just speedups), I
get this:

    scale      query#   master   cost-model   speedup
    --------------------------------------------------
    100000     1           884        103       856%
    1000000    1         12153       1506       807%
    2000000    1         26187       3894       673%
    3000000    1         38147       6601       578%
    4000000    1         56332      10976       513%
    5000000    1         77009      16409       469%
    100000     2           883        110       805%
    1000000    2         12114       1574       770%
    2000000    2         26104       4038       646%
    3000000    2         38028       6828       557%
    4000000    2         56138      11294       497%
    5000000    2         76720      16829       456%
    100000     3           102        106        97%
    1000000    3          1507       1537        98%
    2000000    3         26984       3977       678%
    3000000    3         39090       6753       579%
    4000000    3         57239      11228       510%
    5000000    3         78150      16769       466%

So while it's true that for the 3rd query we get much worse results
compared to the other queries (i.e. we don't get a >400% speedup but a
~3% slowdown compared to master), it's also true that master performs
exceptionally well for this query with small datasets. Once we get to 2M
rows, the master performance drops significantly, but cost-model keeps
the performance characteristics and the speedup jumps back to ~700%,
which is nice.

These numbers are for the 'ASC + unsorted row' test, but I get exactly
the same pattern for the 'random' tests done previously.

It would be nice if we could address the 3% regression for the last
query, but I guess it's not a big deal. The numbers in general are
absolutely impressive. Kudos.

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Feb 23, 2015 at 11:17 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> So while it's true that for the 3rd query we get much worse results
> compared to the other queries (i.e. we don't get a >400% speedup but a
> ~3% slowdown compared to master), it's also true that master performs
> exceptionally well for this query with small datasets. Once we get to 2M
> rows, the master performance drops significantly, but cost-model keeps
> the performance characteristics and the speedup jumps back to ~700%,
> which is nice.
>
> These numbers are for the 'ASC + unsorted row' test, but I get exactly
> the same pattern for the 'random' tests done previously.

Yeah. Looks like you're comparing a case where the old cost model did
the right thing anyway (i.e. used abbreviation). The difference would
then be entirely explainable as noise. Right?

> It would be nice if we could address the 3% regression for the last
> query, but I guess it's not a big deal. The numbers in general are
> absolutely impressive. Kudos.

Thanks.

--
Peter Geoghegan
On 23.2.2015 21:52, Peter Geoghegan wrote:
> On Mon, Feb 23, 2015 at 11:17 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> So while it's true that for the 3rd query we get much worse results
>> compared to the other queries (i.e. we don't get a >400% speedup but a
>> ~3% slowdown compared to master), it's also true that master performs
>> exceptionally well for this query with small datasets. Once we get to 2M
>> rows, the master performance drops significantly, but cost-model keeps
>> the performance characteristics and the speedup jumps back to ~700%,
>> which is nice.
>>
>> These numbers are for the 'ASC + unsorted row' test, but I get exactly
>> the same pattern for the 'random' tests done previously.
>
> Yeah. Looks like you're comparing a case where the old cost model did
> the right thing anyway (i.e. used abbreviation). The difference would
> then be entirely explainable as noise. Right?

I don't think it's just noise - on the i5-2500 CPU, the results are very
consistent. For 1M random rows with this query (#3):

    select * from (select * from stuff_text order by randtxt
                   offset 100000000000) foo

I get this (over 20 runs of the query):

                    min       max     median      mean   stddev
    ------------------------------------------------------------
    master        932.449   932.902   932.637   932.639   0.116
    cost-model    957.16    958.47    957.497   957.586   0.411

That is super-consistent, IMHO, and the ~2.6% slowdown is clearly
visible. But it might be due to the padding described by Andrew.

Similarly on the Xeon E5450 CPU, although the results there are much
more noisy.

>> It would be nice if we could address the 3% regression for the last
>> query, but I guess it's not a big deal. The numbers in general are
>> absolutely impressive. Kudos.
>
> Thanks.

Are you going to add this into the CF? Would be nice to get this into
9.5. Strictly speaking it should go to 2015-06 I guess, but I'd consider
it part of one of the existing sortsupport patches.

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Feb 23, 2015 at 4:41 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Are you going to add this into the CF? Would be nice to get this into
> 9.5. Strictly speaking it should go to 2015-06 I guess, but I'd consider
> it part of one of the existing sortsupport patches.

It's a bugfix, IMV. I guess I should add it to the current CF, but I
think I might be forbidden from doing so as a non-CF-manager...

--
Peter Geoghegan
On 24.2.2015 01:44, Peter Geoghegan wrote:
> On Mon, Feb 23, 2015 at 4:41 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Are you going to add this into the CF? Would be nice to get this into
>> 9.5. Strictly speaking it should go to 2015-06 I guess, but I'd consider
>> it part of one of the existing sortsupport patches.
>
> It's a bugfix, IMV. I guess I should add it to the current CF, but I
> think I might be forbidden from doing so as a non-CF-manager...

Yes, I consider that a bugfix too, fixing a performance regression.

regards

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 23/02/15 16:40, Tomas Vondra wrote:
> On 22.2.2015 22:30, Peter Geoghegan wrote:
>> You should try it with the data fully sorted like this, but with one
>> tiny difference: The very last tuple is out of order. How does that
>> look?

If this case is actually important, a merge-sort can take significant
advantage of the partial order:

test=# explain analyze select * from (select * from stuff_text_asc order by randtxt offset 100000000000) foo;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=247054.81..247054.81 rows=1 width=18) (actual time=25133.029..25133.029 rows=0 loops=1)
   ->  Sort  (cost=242054.81..247054.81 rows=2000001 width=18) (actual time=25025.931..25088.406 rows=2000001 loops=1)
         Sort Key: stuff_text_asc.randtxt
         Sort Method: quicksort  Memory: 221213kB  Compares: 95541376
         ->  Seq Scan on stuff_text_asc  (cost=0.00..32739.01 rows=2000001 width=18) (actual time=0.011..118.390 rows=2000001 loops=1)
 Planning time: 0.080 ms
 Execution time: 25144.538 ms
(7 rows)

Time: 25145.185 ms

test=# set enable_intmerge_sort to on;
SET
Time: 0.378 ms
test=# explain analyze select * from (select * from stuff_text_asc order by randtxt offset 100000000000) foo;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=247054.81..247054.81 rows=1 width=18) (actual time=1051.603..1051.603 rows=0 loops=1)
   ->  Sort  (cost=242054.81..247054.81 rows=2000001 width=18) (actual time=943.304..1006.988 rows=2000001 loops=1)
         Sort Key: stuff_text_asc.randtxt
         Sort Method: internal merge  Memory: 221213kB  Compares: 2000002
         ->  Seq Scan on stuff_text_asc  (cost=0.00..32739.01 rows=2000001 width=18) (actual time=0.009..98.474 rows=2000001 loops=1)
 Planning time: 0.072 ms
 Execution time: 1063.434 ms
(7 rows)

Time: 1064.113 ms
test=# set enable_intmerge_sort to off;
SET
Time: 0.353 ms

test=# explain analyze select count(distinct randtxt) from stuff_text_asc;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=37739.01..37739.02 rows=1 width=18) (actual time=25196.814..25196.815 rows=1 loops=1)
   ->  Seq Scan on stuff_text_asc  (cost=0.00..32739.01 rows=2000001 width=18) (actual time=0.010..114.995 rows=2000001 loops=1)
 Planning time: 0.053 ms
 Execution time: 25196.857 ms
(4 rows)

Time: 25197.371 ms

test=# explain analyze select count(*) from (select distinct randtxt from stuff_text_asc) as foo;
                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=277054.83..277054.84 rows=1 width=0) (actual time=25521.258..25521.258 rows=1 loops=1)
   ->  Unique  (cost=242054.81..252054.81 rows=2000001 width=18) (actual time=25101.157..25438.622 rows=1999100 loops=1)
         ->  Sort  (cost=242054.81..247054.81 rows=2000001 width=18) (actual time=25101.156..25184.436 rows=2000001 loops=1)
               Sort Key: stuff_text_asc.randtxt
               Sort Method: quicksort  Memory: 221213kB  Compares: 95541376
               ->  Seq Scan on stuff_text_asc  (cost=0.00..32739.01 rows=2000001 width=18) (actual time=0.011..116.509 rows=2000001 loops=1)
 Planning time: 0.088 ms
 Execution time: 25532.947 ms
(8 rows)

Time: 25533.642 ms

test=# set enable_intmerge_sort to on;
SET
Time: 0.401 ms
test=# explain analyze select count(*) from (select distinct randtxt from stuff_text_asc) as foo;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=272054.82..272054.83 rows=1 width=0) (actual time=1184.289..1184.289 rows=1 loops=1)
   ->  Sort  (cost=242054.81..247054.81 rows=2000001 width=18) (actual time=1037.019..1100.720 rows=1999100 loops=1)
         Sort Key: stuff_text_asc.randtxt
         Sort Method: dedup internal merge  Memory: 221143kB  Compares: 2000001
         ->  Seq Scan on stuff_text_asc  (cost=0.00..32739.01 rows=2000001 width=18) (actual time=0.010..106.729 rows=2000001 loops=1)
 Planning time: 0.086 ms
 Execution time: 1195.891 ms
(7 rows)

Time: 1196.514 ms

--
Cheers,
  Jeremy
On Tue, Feb 24, 2015 at 4:32 PM, Jeremy Harris <jgh@wizmail.org> wrote:
> On 23/02/15 16:40, Tomas Vondra wrote:
>> On 22.2.2015 22:30, Peter Geoghegan wrote:
>>> You should try it with the data fully sorted like this, but with one
>>> tiny difference: The very last tuple is out of order. How does that
>>> look?
>
> If this case is actually important, a merge-sort can take significant
> advantage of the partial order:
>
> test=# set enable_intmerge_sort to on;

What is this?

--
Peter Geoghegan
On 25/02/15 00:42, Peter Geoghegan wrote:
> On Tue, Feb 24, 2015 at 4:32 PM, Jeremy Harris <jgh@wizmail.org> wrote:
>> On 23/02/15 16:40, Tomas Vondra wrote:
>>> On 22.2.2015 22:30, Peter Geoghegan wrote:
>>>> You should try it with the data fully sorted like this, but with one
>>>> tiny difference: The very last tuple is out of order. How does that
>>>> look?
>>
>> If this case is actually important, a merge-sort can take significant
>> advantage of the partial order:
>>
>> test=# set enable_intmerge_sort to on;
>
> What is this?

Associated with an implementation of an internal merge sort, a control
to disable it. Not in the mainline sourcebase.

--
Cheers,
  Jeremy
On 25/02/15 00:32, Jeremy Harris wrote:
> On 23/02/15 16:40, Tomas Vondra wrote:
>> On 22.2.2015 22:30, Peter Geoghegan wrote:
>>> You should try it with the data fully sorted like this, but with one
>>> tiny difference: The very last tuple is out of order. How does that
>>> look?
>
> If this case is actually important, a merge-sort can take significant
> advantage of the partial order:

Presumably it is not, as nobody commented on the alleged 20 or 30x
speedup.
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Mon, Mar 2, 2015 at 8:04 PM, Jeremy Harris<span dir="ltr"><<a href="mailto:jgh@wizmail.org" target="_blank">jgh@wizmail.org</a>></span> wrote:<br /><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On25/02/15 00:32, Jeremy Harris wrote:<br /> > On 23/02/15 16:40, Tomas Vondra wrote:<br /> >> On 22.2.201522:30, Peter Geoghegan wrote:<br /> >>> You should try it with the data fully sorted like this, but withone<br /> >>> tiny difference: The very last tuple is out of order. How does that<br /> >>> look?<br/> ><br /> > If this case is actually important, a merge-sort can take<br /> > significant advantage ofthe partial order:<br /><br /></span>Presumably it is not, as nobody commented<br /> on the alleged 20 or 30x speedup.<br/><div class="HOEnZb"><div class="h5"><br /><br /><br /><br /> --<br /> Sent via pgsql-hackers mailing list (<ahref="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br /> To make changes to your subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-hackers" target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></div></div></blockquote></div><br /></div><divclass="gmail_extra">Does it always perform mergesort instead of quicksort when enabled?<br />Seems like the casefor a hybrid sort (like timsort). I know there was some talk to replace quicksort with timsort back in 2012 but it wasa deadend at the time.<br /></div></div>
On 03/03/15 03:08, Arthur Silva wrote:
> Does it always perform mergesort instead of quicksort when enabled?

Yes; there seemed no advantage to any additional complexity.

The merge consistently performs fewer comparisons than the quicksort,
on random input - and many fewer if there are any sorted (or
reverse-sorted) sections. However, it also consistently takes slightly
longer (a few percent). I was unable to chase this down but assume it
to be a caching effect. So I don't currently think it should replace
the current sort for all use.

If the planner can identify cases where there is some pre-sort in the
input (CLUSTER was mentioned?), or maybe a correlated index, it could
be worthwhile. Also useful might be very-expensive-comparison cases,
and distinct-output cases (uniquification is supported at each submerge
stage, so data disappears early).

There's potential for running submerges in parallel using multiple
cores, but I've not worked on that.

--
Cheers,
  Jeremy
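To make the comparison-count claim concrete, below is a hedged sketch of a
natural (run-detecting) merge sort on a plain int array. The names are
assumptions and this is not Jeremy's internal merge sort patch; it only
illustrates why already-sorted input costs a single O(n) detection pass, and
why one out-of-order trailing element merely adds one short extra run to merge.

    /* Hedged sketch of a "natural" bottom-up merge sort; illustration only. */
    #include <stdlib.h>
    #include <string.h>

    /* Merge a[lo..mid-1] and a[mid..hi-1] via buf, then copy back. */
    static void
    merge(int *a, int *buf, size_t lo, size_t mid, size_t hi)
    {
        size_t  i = lo, j = mid, k = lo;

        while (i < mid && j < hi)
            buf[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
        while (i < mid)
            buf[k++] = a[i++];
        while (j < hi)
            buf[k++] = a[j++];
        memcpy(a + lo, buf + lo, (hi - lo) * sizeof(int));
    }

    /*
     * Natural merge sort: find maximal ascending runs, then merge adjacent
     * runs pairwise until one run remains.  Fully sorted input is detected
     * in one O(n) pass; a single out-of-order trailing element just adds one
     * run of length 1.
     */
    void
    natural_merge_sort(int *a, size_t n)
    {
        int        *buf = malloc(n * sizeof(int));
        size_t     *runs = malloc((n + 1) * sizeof(size_t));
        size_t      nruns = 0;

        if (buf == NULL || runs == NULL || n < 2)
            goto done;

        /* Pass 1: record run boundaries (one comparison per element). */
        runs[nruns++] = 0;
        for (size_t i = 1; i < n; i++)
            if (a[i] < a[i - 1])
                runs[nruns++] = i;
        runs[nruns] = n;

        /* Pass 2+: merge adjacent runs until a single run is left. */
        while (nruns > 1)
        {
            size_t  out = 0;

            for (size_t r = 0; r + 1 < nruns; r += 2)
            {
                merge(a, buf, runs[r], runs[r + 1], runs[r + 2]);
                runs[out++] = runs[r];
            }
            if (nruns % 2 == 1)             /* odd run carried over unchanged */
                runs[out++] = runs[nruns - 1];
            runs[out] = n;
            nruns = out;
        }

    done:
        free(buf);
        free(runs);
    }

That behaviour is consistent with the plans shown earlier in the thread:
roughly 2 million comparisons for the 'internal merge' sorts of 2,000,001
nearly-sorted rows, versus about 95 million for the quicksort on the same
input.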
On Tue, Mar 3, 2015 at 2:00 AM, Jeremy Harris <jgh@wizmail.org> wrote:
> Yes; there seemed no advantage to any additional complexity.
>
> The merge consistently performs fewer comparisons than the quicksort,
> on random input - and many fewer if there are any sorted (or
> reverse-sorted) sections. However, it also consistently takes slightly
> longer (a few percent). I was unable to chase this down but assume it
> to be a caching effect. So I don't currently think it should replace
> the current sort for all use.

It's definitely caching effects. That's a very important consideration
here.

--
Peter Geoghegan
On Mon, Feb 23, 2015 at 7:44 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Feb 23, 2015 at 4:41 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Are you going to add this into the CF? Would be nice to get this into
>> 9.5. Strictly speaking it should go to 2015-06 I guess, but I'd consider
>> it part of one of the existing sortsupport patches.
>
> It's a bugfix, IMV. I guess I should add it to the current CF, but I
> think I might be forbidden from doing so as a non-CF-manager...

Committed. For future reference, I'd prefer to have things like this
added to the next CF rather than no CF at all.

I'm not sure if you've got all the details right here, but the concept
makes sense to me: if we're going to give up, we should do it early.
Doing it later throws away more work and is therefore riskier.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Apr 3, 2015 at 1:49 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Committed. For future reference, I'd prefer to have things like this
> added to the next CF rather than no CF at all.

I'll bear that in mind. Thanks.

--
Peter Geoghegan