Re: [COMMITTERS] pgsql-server/ /configure /configure.in rc/incl ... - Mailing list pgsql-performance
From | Sean Chittenden |
---|---|
Subject | Re: [COMMITTERS] pgsql-server/ /configure /configure.in rc/incl ... |
Date | |
Msg-id | 20030307060412.GA19138@perrin.int.nxad.com |
In response to | Re: [COMMITTERS] pgsql-server/ /configure /configure.in rc/incl ... (Neil Conway <neilc@samurai.com>) |
Responses | Re: [COMMITTERS] pgsql-server/ /configure /configure.in rc/incl ... |
List | pgsql-performance |
> > I don't have my copy of Stevens handy (it's some 700mi away atm
> > otherwise I'd cite it), but if Tom or someone else has it handy, look
> > up the example re: the performance gain from read()'ing an mmap()'ed
> > file versus a non-mmap()'ed file.  The difference is non-trivial and
> > _WELL_ worth the time given the speed increase.
>
> Can anyone confirm this? If so, one easy step we could take in this
> direction would be adapting COPY FROM to use mmap().

Weeee!  Alright, so I got to have some fun writing out some simple
tests with mmap() and friends tonight.  Are the results interesting?
Absolutely!  Is this a simple benchmark?  Yup.  Do I think it
simulates PostgreSQL?  Eh, not particularly.  Does it demonstrate that
mmap() is a win and something worth implementing?  I sure hope so.  Is
this a test program to demonstrate the ideal use of mmap() in
PostgreSQL?  No.  Is it a place to start a factual discussion?  I hope
so.

I have here four tests that are conditionalized by cpp.

# The first one uses read() and write() but with the buffer size set
# to the same size as the file.
gcc -O3 -finline-functions -fkeep-inline-functions -funroll-loops -o test-mmap test-mmap.c
/usr/bin/time ./test-mmap > /dev/null
Beginning tests with file: services
Page size: 4096
File read size is the same as the file size
Number of iterations: 100000
Start time: 1047013002.412516
Time: 82.88178
Completed tests
       82.09 real         2.13 user        68.98 sys

# The second one uses read() and write() with the default buffer size:
# 65536
gcc -O3 -finline-functions -fkeep-inline-functions -funroll-loops -DDEFAULT_READSIZE=1 -o test-mmap test-mmap.c
/usr/bin/time ./test-mmap > /dev/null
Beginning tests with file: services
Page size: 4096
File read size is default read size: 65536
Number of iterations: 100000
Start time: 1047013085.16204
Time: 18.155511
Completed tests
       18.16 real         0.90 user        14.79 sys

# Please note this is significantly faster, but that's expected

# The third test uses mmap() + madvise() + write()
gcc -O3 -finline-functions -fkeep-inline-functions -funroll-loops -DDEFAULT_READSIZE=1 -DDO_MMAP=1 -o test-mmap test-mmap.c
/usr/bin/time ./test-mmap > /dev/null
Beginning tests with file: services
Page size: 4096
File read size is the same as the file size
Number of iterations: 100000
Start time: 1047013103.859818
Time: 8.4294203644
Completed tests
        7.24 real         0.41 user         5.92 sys

# Faster still, and twice as fast as the normal read() case

# The last test calls mmap() only once, when the file is opened, and
# calls msync(), munmap(), and close() only once, at exit.
gcc -O3 -finline-functions -fkeep-inline-functions -funroll-loops -DDEFAULT_READSIZE=1 -DDO_MMAP=1 -DDO_MMAP_ONCE=1 -o test-mmap test-mmap.c
/usr/bin/time ./test-mmap > /dev/null
Beginning tests with file: services
Page size: 4096
File read size is the same as the file size
Number of iterations: 100000
Start time: 1047013111.623712
Time: 1.174076
Completed tests
        1.18 real         0.09 user         0.92 sys

# Substantially faster

Obviously this isn't perfect, but reading and writing data is faster
(specifically moving pages through the VM/OS).  Doing partial writes
from mmap()'ed data should be faster, along with scanning through
mmap()'ed portions of - or completely mmap()'ed - files, because the
pages are already loaded in the VM.

PostgreSQL's LRU file descriptor cache could easily be adjusted to add
mmap()'ing of frequently accessed files (specifically, system catalogs
come to mind).  It's not hard to figure out how often particular files
are accessed and to either _avoid_ mmap()'ing a file that isn't
accessed often, or to mmap() files that _are_ accessed often.  mmap()
does have a cost, but I'd wager that mmap()'ing the same file a second
or third time from a different process would be more efficient.
However, the speedup from searching through an mmap()'ed file may make
it worthwhile to mmap() all files as long as the system stays under a
tunable resource limit (max_mmaped_bytes?).

If someone is so inclined or there's enough interest, I can reverse
this test case so that data is written to an mmap()'ed file, but the
same performance difference should hold true (assuming this isn't a
write to a tape drive ::grin::).

The program used to generate the above tests is available at:

http://people.freebsd.org/~seanc/mmap_test/

Please ask if you have questions.

-sc

-- 
Sean Chittenden
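
For readers who cannot reach the URL above, the two access patterns being compared can be sketched roughly as follows. This is not the test program from the message: the function names sum_with_read() and sum_with_mmap() and the READ_CHUNK constant are made up for illustration, and the sketch sums the file's bytes instead of write()'ing them to stdout so that the two paths can be checked against each other. It also leaves out the madvise() call that the third test mentions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_CHUNK 65536 /* illustrative; matches the 65536 default read size reported above */

/* Sum every byte of the file by read()'ing it through a fixed-size buffer.
 * Each read() copies data from the kernel into buf.
 * Returns 0 on success, -1 on error. */
int sum_with_read(const char *path, uint64_t *sum)
{
    unsigned char buf[READ_CHUNK];
    ssize_t n;
    int fd;

    assert(path != NULL && sum != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    *sum = 0;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            *sum += buf[i];
    }
    close(fd);
    return n < 0 ? -1 : 0;
}

/* Sum every byte of the file through a read-only shared mapping.
 * The pages are accessed in place, with no copy into a user buffer.
 * Returns 0 on success, -1 on error. */
int sum_with_mmap(const char *path, uint64_t *sum)
{
    struct stat st;
    unsigned char *p;
    size_t len;
    int fd;

    assert(path != NULL && sum != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    *sum = 0;
    len = (size_t)st.st_size;
    if (len == 0) {          /* mmap() rejects a zero-length mapping */
        close(fd);
        return 0;
    }
    p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);               /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;
    for (size_t i = 0; i < len; i++)
        *sum += p[i];
    munmap(p, len);
    return 0;
}
```

Calling both functions on the same file should yield the same sum. Timing each call in a loop would give a rough analogue of the second and third tests, although absolute numbers will depend heavily on the OS, the file size, and whether the file is already cached.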