Re: Sorting Improvements for 8.4 - Mailing list pgsql-hackers
From | Dann Corbit |
---|---|
Subject | Re: Sorting Improvements for 8.4 |
Date | |
Msg-id | D425483C2C5C9F49B5B7A41F8944154701000B4A@postal.corporate.connx.com Whole thread Raw |
In response to | Re: Sorting Improvements for 8.4 (Gregory Stark <stark@enterprisedb.com>) |
Responses |
Re: Sorting Improvements for 8.4
Re: Sorting Improvements for 8.4 |
List | pgsql-hackers |
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Gregory Stark > Sent: Wednesday, December 19, 2007 5:26 PM > To: Jeff Davis > Cc: Mark Mielke; Michał Zaborowski; Simon Riggs; Ron Mayer; pgsql- > hackers@postgresql.org > Subject: Re: [HACKERS] Sorting Improvements for 8.4 > > > "Jeff Davis" <pgsql@j-davis.com> writes: > > > test=> explain analyze select * from sorter order by t; > > test=> explain analyze select * from sorter order by b; > > test=> explain analyze select * from sorter order by f; > > > > On my machine this table fits easily in memory (so there aren't any disk > > reads at all). Sorting takes 7 seconds for floats, 9 seconds for binary > > data, and 20 seconds for localized text. That's much longer than it > > would take to read that data from disk, since it's only 70MB (which > > takes a fraction of a second on my machine). > > > > I think this disproves your hypothesis that sorting happens at disk > > speed. > > I suspect most of that is spent just copying the data around. Which would > not > be helped by having multiple threads doing the copying -- and in fact > might be > exacerbated if it required an extra copy to consolidate all the data in > the > end. Benchmarking a single system will really only explain that system. Someone may have a disk farm with 2GB/Sec throughput: http://www.sql-server-performance.com/articles/per/system_storage_configuration_p1.aspx But such a configuration is very unlikely. Someone may have 10GB/S NIC cards, but those too, are rare. So for any benchmark, we will really just end up with a number for that system. Typically, disk is the bottleneck. I found this on the net somewhere, but it's quite a useful table for capacity planning (to find the weak link in the chainusing back of the envelope calculations): Interface Width Frequency Bytes/Sec Bits/Sec 4-way interleaved PC1600 (DDR200) SDRAM 4 x 64bits 100 MHz DDR 6.4 GB/s 51 Gbps Opteron HyperTransport memory bus 128bits 200 MHz DDR 6.4 GB/s 51 Gbps Pentium 4 "800 MHz" FSB 64bits 200 MHz QDR 6.4 GB/s 51 Gbps PC2 6400 (DDR-II 800) SDRAM 64bits 400 MHz DDR 6.4 GB/s 51 Gbps PC2 5300 (DDR-II 667) SDRAM 64bits 333 MHz DDR 5.3 GB/s 43 Gbps Pentium 4 "533 MHz" FSB 64bits 133 MHz QDR 4.3 GB/s 34 Gbps PC2 4300 (DDR-II 533) SDRAM 64bits 266 MHz DDR 4.3 GB/s 34 Gbps 2-channel PC1066 RDRAM 2 x 16bits 533 MHz DDR 4.3 GB/s 34 Gbps PCI-X 533 64bits 533 MHz 4.3 GB/s 34 Gbps PCI-Express x16 serial/16lanes 2.5 GHz 4 GB/s 32 Gbps Pentium 4 "400 MHz" FSB 64bits 100 MHz QDR 3.2 GB/s 25.6 Gbps 2-channel PC800 RDRAM 2 x 16bits 400 MHz DDR 3.2 GB/s 25.6 Gbps 2-way interleaved PC1600 (DDR200) SDRAM 2 x 64bits 100 MHz DDR 3.2 GB/s 25.6 Gbps PC2 3200 (DDR-II 400) SDRAM 64bits 200 MHz DDR 3.2 GB/s 25.6 Gbps PC3200 (DDR400) SDRAM 64bits 200 MHz DDR 3.2 GB/s 25.6 Gbps PC2700 (DDR333) SDRAM 64bits 167 MHz DDR 2.7 GB/s 21 Gbps PC2100 (DDR266) SDRAM 64bits 133 MHz DDR 2.1 GB/s 17 Gbps AGP 8x 32bits 533 MHz 2.1 GB/s 17 Gbps PCI-X 266 64bits 266 MHz 2.1 GB/s 17 Gbps PCI-Express x8 serial/8lanes 2.5 GHz 2 GB/s 16 Gbps EV6 bus (Athlon/Duron FSB) 64bits 100 MHz DDR 1.6 GB/s 13 Gbps PC1600 (DDR200) SDRAM 64bits 100 MHz DDR 1.6 GB/s 13 Gbps PC800 RDRAM 16bits 400 MHz DDR 1.6 GB/s 13 Gbps PC150 SDRAM 64bits 150 MHz 1.3 GB/s 10.2 Gbps 10 gigabit ethernet serial 10 GHz 1.25 GB/s 10 Gbps OC-192 serial 9.953 GHz 1.24 GB/s 9.953 Gbps 133 MHz FSB 64bits 133 MHz 1.06 GB/s 8.5 Gbps PC133 SDRAM 64bits 133 MHz 1.06 GB/s 8.5 Gbps AGP 4x 32bits 266 MHz 1.06 GB/s 8.5 Gbps PCI-X 64bits 133 MHz 1.06 GB/s 8.5 Gbps PCI-Express x4 serial/4lanes 2.5 GHz 1 GB/s 8 Gbps 100 MHz FSB 64bits 100 MHz 800 MB/s 6.4 Gbps PC100 SDRAM 64bits 100 MHz 800 MB/s 6.4 Gbps PC66 SDRAM 64bits 66 MHz 533 MB/s 4.3 Gbps fast/wide PCI 64bits 66 MHz 533 MB/s 4.3 Gbps AGP 2x 32bits 133 MHz 533 MB/s 4.3 Gbps single-link DVI 12bits 165 MHz DDR 495 MB/s 3.96 Gbps Ultra-320 SCSI 16bits 160 MHz 320 MB/s 2.6 Gbps OC-48 network serial 2.488 GHz 311 MB/s 2.488 Gbps AGP 32bits 66 MHz 266 MB/s 2.1 Gbps PCI-Express x1 serial 2.5 GHz 250 MB/s 2 Gbps Serial ATA/1500 disk serial 1.5 GHz 187 MB/s 1.5 Gbps Ultra-160 SCSI 16bits 80 MHz 160 MB/s 1.3 Gbps OC-24 network serial 1.244 GHz 155 MB/s 1.244 Gbps PCI 32bits 33 MHz 133 MB/s 1.06 Gbps ATA/133 disk 8bits 66 MHz DDR 133 MB/s 1.06 Gbps gigabit ethernet serial 1 GHz 125 MB/s 1 Gbps ATA/100 disk 8bits 50 MHz DDR 100 MB/s 800 Mbps IEEE 1394b serial 800 MHz 100 MB/s 800 Mbps Ultra-2 Wide SCSI 16bits 40 MHz 80 MB/s 640 Mbps OC-12 network serial 622.08 MHz 77.7 MB/s 622.08 Mbps ATA/66 disk 8bits 33 MHz DDR 66 MB/s 533 Mbps USB-2 serial 480 MHz 60 MB/s 480 Mbps IEEE 1394 serial 400 MHz 50 MB/s 400 Mbps Ultra Wide SCSI 16bits 20 MHz 40 MB/s 320 Mbps ATA/33 disk 8bits 16.6 MHz DDR 33 MB/s 266 Mbps Fast Wide SCSI 16bits 10 MHz 20 MB/s 160 Mbps OC-3 network serial 155.52 MHz 19.4 MB/s 155.52 Mbps 100baseT ethernet serial 100 MHz 12.5 MB/s 100 Mbps OC-1 network serial 51.84 MHz 6.5 MB/s 51.84 Mbps T-3 network serial 45 MHz 5.6 MB/s 44.736 Mbps USB serial 12 MHz 1.5 MB/s 12 Mbps 10baseT ethernet serial 10 MHz 1.25 MB/s 10 Mbps IrDA-2 serial 4 MHz 500 KB/s 4 Mbps T-1 network serial 1.5 MHz 193 KB/s 1.544 Mbps > How long does a "explain analyze sinmple select * from sorter" take? > > And assuming you're doing disk sorts (in disk cache) you're doing quite a > lot > of copying to temporary files (in disk cache) and then back to memory. > > > Note that speeding up a query from 20s to 5s isn't terribly useful. If > it's > OLTP you can't be using all your cores for each user anyways. And if it's > DSS > 20s isn't a problem. Unless (of course) there are 20,000 users doing the queries that would take 20 seconds but now they take 5 (when run single-user). They will still have a bit of a wait, of course. > Where parallel processing like this becomes attractive is when you're > running > a 2 hour query on a machine sequentially running scheduled batch jobs > which > can be sped up to 30 minutes. But in that case you're almost certainly > being > limited by your disk bandwidth, not your cpu speed. A linear speedup of 2 or more is always worth while[*]. Since sorting (e.g. for 'group by' and 'order by') and sort joinsare a major database task, I guess that a linear speedup by a factor of 2 might make the database operations on thewhole be 10% faster or so {OK, it's a SWAG}. I guess it would look good on the benchmarks, if nothing else. [*] unless it is already fast enough. If, at peak load a query takes 1 ms, then making the query take 0.5 ms is not goingto win you any medals, especially if the improvement costs $10,000.
pgsql-hackers by date: