Re: FunctionCallN improvement. - Mailing list pgsql-hackers
From | a_ogawa |
---|---|
Subject | Re: FunctionCallN improvement. |
Date | |
Msg-id | PIEMIKOOMKNIJLLLBCBBEEJCCEAA.a_ogawa@hi-ho.ne.jp |
In response to | Re: FunctionCallN improvement. (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: FunctionCallN improvement. |
List | pgsql-hackers |
Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > I agree; I think the macro is a nice improvement to readability.
>
> But a dead loss for performance, since it does a MemSet *and* some other
> operations. What's worse, it changes a word-aligned MemSet into a
> non-aligned one, knocking out all the optimizations therein.

Thanks for your advice. I changed the MemSet to a for loop in this macro.

I think FunctionCallInfoData is too large to initialize with MemSet.
MemSet is very fast in most cases. However, when only part of a large
structure has to be initialized, it might be faster to initialize those
few members directly.

I wrote a test program to measure the effect of this macro.

The test program was:
---------------------------------------------------------------------------
#include "postgres.h"
#include "fmgr.h"
#include <stdio.h>
#include <stdlib.h>		/* atoi */
#include <string.h>		/* strcmp */

/*
 * Initialize the minimum set of FunctionCallInfoData fields that must be
 * initialized.
 */
#define InitFunctionCallInfoData(Fcinfo, Flinfo, Nargs) \
	do { \
		int i_; \
		(Fcinfo)->flinfo = (Flinfo); \
		(Fcinfo)->context = NULL; \
		(Fcinfo)->resultinfo = NULL; \
		(Fcinfo)->isnull = false; \
		(Fcinfo)->nargs = (Nargs); \
		for (i_ = 0; i_ < (Nargs); i_++) \
			(Fcinfo)->argnull[i_] = false; \
	} while (0)

/*
 * dummyFunc prevents excessive optimization.
 * If it were not called from the loop, gcc might move the initialization
 * of FunctionCallInfoData outside of the loop.
 */
void dummyFunc(FunctionCallInfoData *fcinfo, int cnt)
{
	fcinfo->arg[0] = Int32GetDatum(cnt);
}

/* Baseline: zero the whole structure on every iteration. */
void TestMemSet(int cnt, int nargs)
{
	FunctionCallInfoData fcinfo;

	printf("test MemSet: %d\n", cnt);
	for (; cnt; cnt--) {
		MemSet(&fcinfo, 0, sizeof(fcinfo));
		dummyFunc(&fcinfo, cnt);
	}
}

/* Candidate: initialize only the required fields on every iteration. */
void TestMacro(int cnt, int nargs)
{
	FunctionCallInfoData fcinfo;

	printf("test Macro: %d\n", cnt);
	for (; cnt; cnt--) {
		InitFunctionCallInfoData(&fcinfo, NULL, nargs);
		dummyFunc(&fcinfo, cnt);
	}
}

int main(int argc, char **argv)
{
	int test_cnt;
	int nargs;

	if (argc != 4) {
		printf("usage: test_fmgr -memset|-macro test_cnt nargs\n");
		return 1;
	}
	test_cnt = atoi(argv[2]);
	nargs = atoi(argv[3]);

	if (strcmp(argv[1], "-memset") == 0)
		TestMemSet(test_cnt, nargs);
	if (strcmp(argv[1], "-macro") == 0)
		TestMacro(test_cnt, nargs);

	return 0;
}
---------------------------------------------------------------------------

It was compiled like so:
  gcc -O2 -o test_fmgr -I ${PGSRC}/src/include/ test_fmgr.c

Executed the test of MemSet:
  time ./test_fmgr -memset 10000000 9

Executed the test of the macro that uses the for loop:
  time ./test_fmgr -macro 10000000 9

Results:
(1) Linux kernel 2.4.9 (Pentium III 800MHz, gcc-3.4.1)
  MemSet          real 0m1.486s, user 0m1.480s, sys 0m0.000s
  Macro(nargs=9)  real 0m0.606s, user 0m0.600s, sys 0m0.000s
  Macro(nargs=3)  real 0m0.375s, user 0m0.370s, sys 0m0.000s
  Macro(nargs=2)  real 0m0.298s, user 0m0.290s, sys 0m0.000s

  (*) In the MemSet test, nargs has no effect.

(2) Solaris 8 (UltraSPARC III 750MHz, gcc-2.95.3)
  MemSet          real 2.0s, user 2.0s, sys 0.0s
  Macro(nargs=9)  real 0.7s, user 0.7s, sys 0.0s
  Macro(nargs=3)  real 0.3s, user 0.3s, sys 0.0s
  Macro(nargs=2)  real 0.2s, user 0.2s, sys 0.0s

The effect of this macro can be seen in applications that output a lot of
data, such as psql and pg_dump. These applications put a heavy load on
FunctionCall3.

This is a result for pg_dump.
Environment: Linux kernel 2.4.9, Pentium III 800MHz, PostgreSQL 8.0.1,
  gcc-3.4.1, compile option: -O2. My database has about 400,000 tuples.

Results (time pg_dump > dump.sql):
  Original code:
    real 0m5.369s, user 0m0.600s, sys 0m0.120s
  Using this macro in fmgr.c:
    real 0m5.061s, user 0m0.550s, sys 0m0.120s

I think this macro improves both readability and performance.

regards,

---
A.Ogawa ( a_ogawa@hi-ho.ne.jp )
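The difference between the two approaches comes down to how many bytes each one writes per call. The following is a minimal sketch of that comparison which builds without the PostgreSQL headers. Everything in it is an assumption made for illustration: `DemoCallInfo` is a hypothetical stand-in for FunctionCallInfoData (not the real definition), `DEMO_MAX_ARGS` is an assumed array size, and `init_full`, `init_partial`, `bytes_full`, and `bytes_partial` are hypothetical helper names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ARGS 32	/* assumption: stand-in for FUNC_MAX_ARGS */

/* Hypothetical stand-in for FunctionCallInfoData; not the real struct. */
typedef struct DemoCallInfo
{
	void	   *flinfo;
	void	   *context;
	void	   *resultinfo;
	bool		isnull;
	short		nargs;
	long		arg[DEMO_MAX_ARGS];
	bool		argnull[DEMO_MAX_ARGS];
} DemoCallInfo;

/* Approach 1: clear the whole structure, like MemSet(&fcinfo, 0, sizeof). */
static void
init_full(DemoCallInfo *ci)
{
	memset(ci, 0, sizeof(*ci));
}

/*
 * Approach 2: set only the header fields and the first nargs null flags,
 * like the macro in the message. arg[] is left untouched because the
 * caller fills in every argument it passes.
 */
static void
init_partial(DemoCallInfo *ci, void *flinfo, int nargs)
{
	int			i;

	ci->flinfo = flinfo;
	ci->context = NULL;
	ci->resultinfo = NULL;
	ci->isnull = false;
	ci->nargs = (short) nargs;
	for (i = 0; i < nargs; i++)
		ci->argnull[i] = false;
}

/* Bytes written by the full clear: the entire structure, padding included. */
static size_t
bytes_full(void)
{
	return sizeof(DemoCallInfo);
}

/* Bytes written by the partial init: five scalar fields plus nargs flags. */
static size_t
bytes_partial(int nargs)
{
	return 3 * sizeof(void *) + sizeof(bool) + sizeof(short)
		+ (size_t) nargs * sizeof(bool);
}
```

With this stand-in layout on an LP64 system, `bytes_partial(9)` comes to 36 bytes against 320 for `bytes_full()`. That gap is consistent with the direction of the timings above, though the real structure and the measured speedups depend on the platform and compiler.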