Re: 8.4 open item: copy performance regression? - Mailing list pgsql-hackers
From | Stefan Kaltenbrunner |
---|---|
Subject | Re: 8.4 open item: copy performance regression? |
Date | |
Msg-id | 4A3E0FE8.6000508@kaltenbrunner.cc Whole thread Raw |
In response to | Re: 8.4 open item: copy performance regression? (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>) |
Responses |
Re: 8.4 open item: copy performance regression?
|
List | pgsql-hackers |
Stefan Kaltenbrunner wrote: > Simon Riggs wrote: >> On Sun, 2009-06-21 at 10:28 +0200, Stefan Kaltenbrunner wrote: >>> I did some limited testing on that but I was unable to measure any >>> significant effect - especially since the difference between >>> wal-logged and not is rather small for a non-parallel COPY (ie in the >>> above example you get around 6m20s runtime for wal-logged and ~5m40s >>> in the other case). >> >> This is a common confusion for small tests. >> >> Non-WAL logged case causes all buffers to be written to disk at end of >> COPY. This is roughly the same size as the volume of WAL written. In >> logged case we do not write data blocks, they get written at next >> checkpoint. So the reduction in I/O is not apparent, since during the >> period of the test the I/O is about the same in both cases and less I/O >> in the non-WAL logged case. On longer tests the difference shows more >> clearly because the data blocks start to migrate out of shared buffers >> while the COPY is still running, effecting the test results. > > I was actually testing with and without explicit CHECKPOINTing > before/after the load(and also with longer runs) too - the difference is > negligible especially with only one process involved. > I think the difference is simply not that large because we are still > mostly CPU bound within COPY on reasonably fast IO-subsystems. hmm to further demonstrate that I just did some testing(same config as before and the 16MB for the buffer) by loading those 60M rows into a 20GB ramdisk instead of the SAN(with a CHECKPOINT before and after). this results in the following "improvements": 16384: wal bypass: 5min40s -> 5min10s (~9%) wal logged: 6min20s -> 6min8s (~3%) vmstat 5 output shows that the system is in fact CPU bound (ie using ~6% which is more or less a full core on a 16 core box) and not doing anything IO-wise. r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 0 19010444 118648 15415684 0 0 0 6 1068 2151 6 0 94 0 0 1 0 0 18870448 118652 15555204 0 0 0 6 1069 2142 6 0 94 0 0 1 0 0 18730568 118684 15694872 0 0 0 185 1080 2151 6 0 94 0 0 1 0 0 18591236 118692 15834516 0 0 0 19 1072 2144 6 0 94 0 0 1 0 0 18451472 118696 15973532 0 0 0 46 1073 2152 6 0 94 0 0 1 0 0 18311720 118704 16113204 0 0 0 7 1059 2136 6 0 94 0 0 1 0 0 18171968 118704 16252944 0 0 0 0 1077 2171 6 0 94 0 0 1 0 0 18032088 118712 16392300 0 0 0 54 1062 2138 6 0 94 0 0 1 0 0 17891716 118720 16532060 0 0 0 8 1078 2176 6 0 94 0 0 So I do think that IO is in fact not too significant for this kind of testing and we still have ways to go in terms of CPU efficiency in COPY. Stefan
pgsql-hackers by date: