tweaking MemSet() performance - Mailing list pgsql-hackers
From | Neil Conway |
---|---|
Subject | tweaking MemSet() performance |
Date | |
Msg-id | 87wuqaw7xu.fsf@mailbox.samurai.com |
List | pgsql-hackers |
In include/c.h, MemSet() differs from the stock memset() function only when setting MEMSET_LOOP_LIMIT bytes or fewer (currently 64). The comments above the macro definition note:

     *	We got the 64 number by testing this against the stock memset() on
     *	BSD/OS 3.0. Larger values were slower.	bjm 1997/09/11
     *
     *	I think the crossover point could be a good deal higher for
     *	most platforms, actually.  tgl 2000-03-19

I decided to investigate Tom's suggestion and measure the performance of MemSet() versus memset() on my machine, for various values of MEMSET_LOOP_LIMIT. The test machine is a 1.8 GHz Pentium 4 with RDRAM, running Linux 2.4.19pre8 with GCC 3.1.1 and glibc 2.2.5 -- the results may or may not apply to other machines.

The test program was:

    #include <string.h>
    #include "postgres.h"

    #undef MEMSET_LOOP_LIMIT
    #define MEMSET_LOOP_LIMIT BUFFER_SIZE

    int
    main(void)
    {
        char buffer[BUFFER_SIZE];
        long long i;

        for (i = 0; i < 99000000; i++)
        {
            MemSet(buffer, 0, sizeof(buffer));
        }

        return 0;
    }

(I manually changed MemSet() to memset() when testing the performance of the latter function.)

It was compiled like so:

    gcc -O2 -DBUFFER_SIZE=xxx -Ipgsql/src/include memset.c

(The -O2 optimization flag is important: the results are significantly different if it is not used.)

Here are the results (each timing is the 'total' listing, in seconds, from 'time ./a.out'):

    BUFFER_SIZE = 64
        MemSet() -> 2.756, 2.810, 2.789
        memset() -> 13.844, 13.782, 13.778

    BUFFER_SIZE = 128
        MemSet() -> 5.848, 5.989, 5.861
        memset() -> 15.637, 15.631, 15.631

    BUFFER_SIZE = 256
        MemSet() -> 9.602, 9.652, 9.633
        memset() -> 19.305, 19.370, 19.302

    BUFFER_SIZE = 512
        MemSet() -> 17.416, 17.462, 17.353
        memset() -> 26.657, 26.658, 26.678

    BUFFER_SIZE = 1024
        MemSet() -> 32.144, 32.179, 32.086
        memset() -> 41.186, 41.115, 41.176

    BUFFER_SIZE = 2048
        MemSet() -> 60.39, 60.48, 60.32
        memset() -> 71.19, 71.18, 71.17

    BUFFER_SIZE = 4096
        MemSet() -> 118.29, 120.07, 118.69
        memset() -> 131.40, 131.41 ...
at which point I stopped benchmarking.

Is the benchmark above a reasonable assessment of memset() / MemSet() performance when setting word-aligned amounts of memory? If so, what's a good value for MEMSET_LOOP_LIMIT (perhaps 512)?

Also, if anyone would like to contribute the results of running the benchmark on their particular system, that might provide some useful additional data points.

Cheers,

Neil

-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
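
For readers without include/c.h at hand, the following is a minimal sketch of the kind of dispatch the post is tuning: a word-at-a-time zeroing loop for small, aligned requests, with a fallback to the stock memset() otherwise. It is not the PostgreSQL macro itself. The names sketch_memset and SKETCH_LOOP_LIMIT are hypothetical, and the exact set of conditions (aligned start, length a multiple of the word size, fill value of zero) is an assumption based on how the macro is usually described; consult c.h for the real definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for MEMSET_LOOP_LIMIT; the post tries 64 to 4096. */
#define SKETCH_LOOP_LIMIT 64

/*
 * Zero-fill sketch: use a plain 32-bit store loop when the request is
 * small, aligned, and asks for zero; otherwise defer to memset().
 * Returns 1 if the loop path was taken, 0 if memset() was called.
 * (The return value exists only to make the dispatch observable.)
 */
int
sketch_memset(void *start, int val, size_t len)
{
    const size_t align_mask = sizeof(int32_t) - 1;

    if (((uintptr_t) start & align_mask) == 0 &&   /* aligned start */
        (len & align_mask) == 0 &&                 /* whole words only */
        val == 0 &&                                /* zero fill only */
        len <= SKETCH_LOOP_LIMIT)                  /* below the crossover */
    {
        int32_t *p = (int32_t *) start;
        int32_t *stop = (int32_t *) ((char *) start + len);

        while (p < stop)
            *p++ = 0;
        return 1;
    }

    /* Everything else goes to the C library. */
    memset(start, val, len);
    return 0;
}
```

In this sketch, raising SKETCH_LOOP_LIMIT plays the role that redefining MEMSET_LOOP_LIMIT to BUFFER_SIZE plays in the test program: it forces the loop path for every buffer size being measured, so the timings compare the loop against memset() directly.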