From d7ac8850819b13eddc1aa34a9ece31ca568fbfa7 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Fri, 5 Sep 2014 16:00:58 -0700 Subject: [PATCH 6/7] Don't pursue ABBREVIATED_KEYS_TIE optimization ABBREVIATED_KEYS_TIE was previously exposed by the abbreviated key infrastructure on the assumption that it was useful to give opclasses the ability to differentiate between the case when a non-abbreviated comparison was required due to an abbreviated comparison being inconclusive, and the case where there was no such leading inconclusive abbreviated comparison to begin with (e.g. on a comparison of the second or subsequent attribute in a multi-column sort). Evidently, at least in the case of text's default opclass (the only opclass that currently supports abbreviated keys), it is preferable to always opportunistically attempt to get away with a "memcmp() == 0" comparison (when the compared text datums are of matching length), regardless of what degree of confidence we have in a successful outcome. Testing of non-sympathetic cases has shown no appreciable penalty, while sympathetic cases, such as multi-column sorts of correlated attributes, show considerable performance improvements. Since there is no longer any distinction made between calls made and not made following an inconclusive abbreviated comparison, the abstract concept of ABBREVIATED_KEYS_TIE appears useless, and so has been removed. --- src/backend/utils/adt/varlena.c | 49 +++++++++++++++++++++++++------------- src/backend/utils/sort/tuplesort.c | 2 +- src/include/utils/sortsupport.h | 19 ++++----------- 3 files changed, 38 insertions(+), 32 deletions(-) diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 92d9135..6aa3eaa 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -1884,7 +1884,7 @@ btsortsupport_worker(SortSupport ssup, Oid collid) * Initialize tiebreak sortsupport state with an authoritative * strcoll()-based comparison func for tie-breaking. */ - ssup->abbrev_tiebreak->abbrev_state = ABBREVIATED_KEYS_TIE; + ssup->abbrev_tiebreak->abbrev_state = ABBREVIATED_KEYS_NO; btsortsupport_worker(ssup->abbrev_tiebreak, collid); } @@ -1965,26 +1965,41 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup) len1 = VARSIZE_ANY_EXHDR(arg1); len2 = VARSIZE_ANY_EXHDR(arg2); - if (ssup->abbrev_state == ABBREVIATED_KEYS_TIE && len1 == len2) + if (len1 == len2) { /* - * Being called as authoritative tie-breaker for an abbreviated key - * comparison. + * Regardless of whether this is a call to authoritatively tie-break an + * abbreviated key comparison, or a call with no preceding inconclusive + * abbreviated key comparison (e.g. the call concerns the second or + * subsequent attribute of a multi-column sort, where abbreviated keys + * are never used), opportunistically attempt to resolve the comparison + * with a cheap memcmp() that happens to indicate equality. * - * TODO: Consider using this optimization more frequently, perhaps - * even entirely opportunistically. + * With abbreviated keys, in general there is a pretty good chance + * control reached here because the two keys are actually fully equal. + * The ad-hoc costing model within bttext_abbrev_convert() ought to + * ensure that there cannot be an excessive number of these + * opportunistic calls that don't work out; the model does not + * distinguish between abbreviated comparisons and comparisons + * successfully resolved here, allowing the abbreviated key + * optimization to very effectively accelerate sorts on low cardinality + * attributes. * - * In general there is a pretty good chance control reached here - * because the two keys are actually fully equal. Try and give an - * answer using only a cheap memcmp() comparison on the assumption that - * this will indicate equality frequently enough for it to be worth it - * on balance. This is a reasonable assumption, since sorting is - * almost certainly bottlenecked on memory latency. - * - * Original, authoritative key cardinality is weighed within - * bttext_abbrev_abort(). Cases where attempts at resolving tie-breakers in - * this manner are the usual outcome, and yet those attempts usually - * fail should only arise infrequently. + * Doing this for the non-abbreviated key case tends to help a lot with + * multi-column sorts where columns are highly correlated, such as a + * sort on (country, state, city) attributes -- when a tie-breaker + * comparison on the second or subsequent attributes in sought, a + * considerable proportion of those comparisons can be resolved here. + * However, in the non-abbreviated key case this path is entirely (as + * opposed to somewhat) opportunistic, and in general there is a very + * low chance that this will work out. Proceeding even in the + * non-abbreviated case is justified by the fact that in practice, the + * cost in non-sympathetic cases is often indistinguishable from zero. + * Presumably, this is due to the effects of CPU pipelining. The + * bottleneck is very probably memory latency, and so on modern + * hardware there is no appreciable penalty for the often futile + * checking of the first few bytes of each of the pair of strings. + * These bytes already needed to be stored in cachelines. */ if (memcmp(a1p, a2p, len1) == 0) { diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c index ea1252c..cabb632 100644 --- a/src/backend/utils/sort/tuplesort.c +++ b/src/backend/utils/sort/tuplesort.c @@ -2946,7 +2946,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state) Assert(attno == sortKey->abbrev_tiebreak->ssup_attno); Assert(sortKey->abbrev_state == ABBREVIATED_KEYS_YES); - Assert(sortKey->abbrev_tiebreak->abbrev_state == ABBREVIATED_KEYS_TIE); + Assert(sortKey->abbrev_tiebreak->abbrev_state == ABBREVIATED_KEYS_NO); datum1 = heap_getattr(<up, attno, tupDesc, &isnull1); datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2); diff --git a/src/include/utils/sortsupport.h b/src/include/utils/sortsupport.h index 7b84e0a..0fa75de 100644 --- a/src/include/utils/sortsupport.h +++ b/src/include/utils/sortsupport.h @@ -56,8 +56,7 @@ typedef enum { ABBREVIATED_KEYS_NO = 0, /* Second/subsequent key, or no abbreviation capability */ - ABBREVIATED_KEYS_YES , /* Leading (abbreviated applicable) key (when available) */ - ABBREVIATED_KEYS_TIE, /* "True" leading key, abbreviation tie-breaker */ + ABBREVIATED_KEYS_YES /* Leading (abbreviated applicable) key (when available) */ } SortKeyAbbreviate; typedef struct SortSupportData *SortSupport; @@ -147,22 +146,14 @@ typedef struct SortSupportData * sortsupport routine needs to know if its dealing with a leading key). * Even with a leading key, internal sortsupport clients like tuplesort may * pass ABBREVIATED_KEYS_NO because it isn't feasible to inject the - * conversion routine. However, ABBREVIATED_KEYS_TIE means that it's a - * "proper" sortsupport state, originally generated by the sortsupport - * routine itself -- the core system will never directly create a tie-break - * sort support state. There is very little distinction between - * ABBREVIATED_KEYS_NO and ABBREVIATED_KEYS_TIE, though. The distinction - * only exists to allow sortsupport routines to squeeze a bit more - * performance from the knowledge that an authoritative tie-breaker - * comparison is required because a prior alternative comparison didn't - * work out (as opposed to being called without there ever being an - * abbreviated comparison of the tuple attribute). + * conversion routine. * * A sortsupport routine may itself initially decide against applying the * abbreviation optimization by setting a passed sort support state's - * abbreviated state to ABBREVIATED_KEYS_NO. This is typically used for + * abbreviated state to ABBREVIATED_KEYS_NO. This could be used for * platform-specific special cases where the optimization is thought to be - * ineffective. + * ineffective. By convention, opclasses enable and disable abbreviation + * in deference to USE_ABBREV_KEYS. */ SortKeyAbbreviate abbrev_state; -- 1.9.1