Re: Which qsort is used - Mailing list pgsql-hackers
From | Dann Corbit |
---|---|
Subject | Re: Which qsort is used |
Date | |
Msg-id | D425483C2C5C9F49B5B7A41F8944154757D37F@postal.corporate.connx.com Whole thread Raw |
In response to | Which qsort is used (Qingqing Zhou <zhouqq@cs.toronto.edu>) |
Responses |
Re: Which qsort is used
|
List | pgsql-hackers |
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Thursday, December 15, 2005 6:24 PM > To: Dann Corbit > Cc: Qingqing Zhou; Greg Stark; Jim C. Nasby; Luke Lonergan; Neil Conway; > Bruce Momjian; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Which qsort is used > > "Dann Corbit" <DCorbit@connx.com> writes: > > Radix sort can also be made completely generic by using a callback > > function. > > The function gives back n-bits at a time, from the most significant bits > > to the least significant. > > That's mighty handwavy --- it assumes that the datatype permits a simple > breakdown into small pieces that can be sorted lexicographically. It's not so hard. For fixed length character strings, the mapping is just the character. For integers the mapping is obvious [msb to lsb or lsb to msb of the integer, depending on whether you are doing msd or lsd radix sort]. For intel floating point, the transformation is: #include <assert.h> #include "inteltyp.h" uint32 float2key(float f) { uint32 sign, mant, mask; assert(sizeof(float) == sizeof(uint32)); mant = *(uint32 *) & f; /* Load float as array of bits */ sign = mant& SB_MASK32; /* Isolate the leading sign bit */ mant ^= SB_MASK32; /* Invert the sign bit, making + > -*/ mask = sign - (sign >> 31); /* Either 0 or 0x7fffffff */ mant ^= mask; /* Invert exp and mant if negative*/ return mant; } uint64 double2key(double d) { uint64 sign, mant, mask; assert(sizeof(double) == sizeof(uint64)); mant = *(uint64 *) & d; /* Load float as array of bits */ sign = mant& SB_MASK64; /* Isolate the leading sign bit */ mant ^= SB_MASK64; /* Invert the sign bit, making + > -*/ mask = sign - (sign >> 63); /* Either 0 or 0x7fffffffffffffff */ mant ^= mask; /* Invert exp and mantif negative */ return mant; } Where the contents of inteltyp.h are as follows: /* ** Typdefs for Intel formats. ** See keyxfrm.c for usage. */ typedef unsigned long uint32; #define SB_MASK32 0x80000000UL #ifdef _MSC_VER typedef unsigned __int64 uint64; typedef __int64 sint64; #define SB_MASK64 0x8000000000000000ui64 #else typedef unsigned long long uint64; typedef long long sint64; #define SB_MASK64 0x8000000000000000ULL #endif extern uint32 float2key(float f); uint64 double2key(double d); ======================================================= After the above transformation, you just use the same buckets as for integers. In general, the creation of the mapping function is no more difficult than the creation of a comparison function. > Seems > to me that's a much stronger requirement than assuming that you can tell > which of two whole values is smaller. What's worse, it needs to be the > same number of pieces for every value, which makes it hard to deal with > variable-length data. No. The number of pieces is irrelevant. And you can use MSD radix sort for variable length data. > > So, for char string, and a radix of 256, you just return the first char, > > then the second char, etc. to get the classical radix sort. > > Uh, no, you'd need to work right-to-left, after having padded all the > strings to the same length somehow. Unless you use MSD radix sort (which is usually better anyway). > regards, tom lane
pgsql-hackers by date: