Home > mailing lists

Re: Which qsort is used - Mailing list pgsql-hackers

From	Dann Corbit
Subject	Re: Which qsort is used
Date	December 16, 2005 04:05:45
Msg-id	D425483C2C5C9F49B5B7A41F8944154757D37F@postal.corporate.connx.com Whole thread Raw
In response to	Which qsort is used (Qingqing Zhou <zhouqq@cs.toronto.edu>)
Responses	Re: Which qsort is used
List	pgsql-hackers

Tree view

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Thursday, December 15, 2005 6:24 PM
> To: Dann Corbit
> Cc: Qingqing Zhou; Greg Stark; Jim C. Nasby; Luke Lonergan; Neil
Conway;
> Bruce Momjian; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Which qsort is used
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> > Radix sort can also be made completely generic by using a callback
> > function.
> > The function gives back n-bits at a time, from the most significant
bits
> > to the least significant.
>
> That's mighty handwavy --- it assumes that the datatype permits a
simple
> breakdown into small pieces that can be sorted lexicographically.

It's not so hard.  For fixed length character strings, the mapping is
just the character.  For integers the mapping is obvious [msb to lsb or
lsb to msb of the integer, depending on whether you are doing msd or lsd
radix sort].  For intel floating point, the transformation is:

#include <assert.h>
#include "inteltyp.h"

uint32
float2key(float f)
{   uint32          sign,                   mant,                   mask;
   assert(sizeof(float) == sizeof(uint32));   mant = *(uint32 *) & f;     /* Load float as array of bits */   sign =
mant& SB_MASK32;    /* Isolate the leading sign bit */   mant ^= SB_MASK32;          /* Invert the sign bit, making + >
-*/   mask = sign - (sign >> 31); /* Either 0 or 0x7fffffff */   mant ^= mask;               /* Invert exp and mant if
negative*/   return mant; 
}

uint64
double2key(double d)
{   uint64          sign,                   mant,                   mask;
   assert(sizeof(double) == sizeof(uint64));   mant = *(uint64 *) & d;     /* Load float as array of bits */   sign =
mant& SB_MASK64;    /* Isolate the leading sign bit */   mant ^= SB_MASK64;          /* Invert the sign bit, making + >
-*/   mask = sign - (sign >> 63); /* Either 0 or 0x7fffffffffffffff */   mant ^= mask;               /* Invert exp and
mantif negative */   return mant; 
}

Where the contents of inteltyp.h are as follows:
/*
** Typdefs for Intel formats.
** See keyxfrm.c for usage.
*/

typedef unsigned long uint32;
#define SB_MASK32 0x80000000UL

#ifdef _MSC_VER
typedef unsigned __int64 uint64;
typedef __int64 sint64;
#define SB_MASK64 0x8000000000000000ui64
#else
typedef unsigned long long uint64;
typedef long long sint64;
#define SB_MASK64 0x8000000000000000ULL
#endif
extern uint32   float2key(float f);
uint64          double2key(double d);

=======================================================
After the above transformation, you just use the same buckets as for
integers.

In general, the creation of the mapping function is no more difficult
than the creation of a comparison function.

> Seems
> to me that's a much stronger requirement than assuming that you can
tell
> which of two whole values is smaller.  What's worse, it needs to be
the
> same number of pieces for every value, which makes it hard to deal
with
> variable-length data.

No.  The number of pieces is irrelevant.  And you can use MSD radix sort
for variable length data.
> > So, for char string, and a radix of 256, you just return the first
char,
> > then the second char, etc. to get the classical radix sort.
>
> Uh, no, you'd need to work right-to-left, after having padded all the
> strings to the same length somehow.

Unless you use MSD radix sort (which is usually better anyway).

>             regards, tom lane

pgsql-hackers by date:

From: Qingqing Zhou
Date: 16 December 2005, 01:53:19
Subject: Re: Which qsort is used

From: Jeff Trout
Date: 16 December 2005, 10:03:25
Subject: Re: Which qsort is used

Re: Which qsort is used - Mailing list pgsql-hackers

Previous

Next