Re: reindex/vacuum locking/performance? - Mailing list pgsql-performance
From | Bruce Momjian |
---|---|
Subject | Re: reindex/vacuum locking/performance? |
Date | |
Msg-id | 200310061856.h96IuxO13756@candle.pha.pa.us |
In response to | Re: reindex/vacuum locking/performance? (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-performance |
Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > On Sun, 2003-10-05 at 19:50, Neil Conway wrote:
> > I was hoping you'd reply to this, Tom -- you were referring to O_DIRECT,
> > right?
>
> Not necessarily --- as you point out, it's not clear that O_DIRECT would
> help us.  What would be way cool is something similar to what James
> Rogers was talking about: a way to tell the kernel not to promote this
> page all the way to the top of its LRU list.  I'm not sure that *any*
> Unixen have such an API, let alone one that's common across more than
> one platform :-(

Solaris has "free-behind", which prevents a large kernel sequential scan
from blowing out the cache.  I only read about it in the Mauro Solaris
Internals book, and it seems to be done automatically.  I guess most OS's
don't do this optimization because they usually don't read files larger
than their cache.

I see BSD/OS madvise() has:

        #define MADV_NORMAL     0  /* no further special treatment */
        #define MADV_RANDOM     1  /* expect random page references */
        #define MADV_SEQUENTIAL 2  /* expect sequential references */
        #define MADV_WILLNEED   3  /* will need these pages */
    --> #define MADV_DONTNEED   4  /* don't need these pages */
        #define MADV_SPACEAVAIL 5  /* insure that resources are reserved */

The marked one seems to have the control we need.  Of course, the kernel
madvise() code has:

        /* Not yet implemented */

Looks like NetBSD implements it, but it also unmaps the page from the
address space, which might be more than we want.  NetBSD also has:

        #define MADV_FREE       6  /* pages are empty, free them */

which frees the page.  I am unclear on its use.

FreeBSD has this comment:

    /*
     * vm_page_dontneed
     *
     *  Cache, deactivate, or do nothing as appropriate.  This routine
     *  is typically used by madvise() MADV_DONTNEED.
     *
     *  Generally speaking we want to move the page into the cache so
     *  it gets reused quickly.  However, this can result in a silly syndrome
     *  due to the page recycling too quickly.  Small objects will not be
     *  fully cached.  On the otherhand, if we move the page to the inactive
     *  queue we wind up with a problem whereby very large objects
     *  unnecessarily blow away our inactive and cache queues.
     *
     *  The solution is to move the pages based on a fixed weighting.  We
     *  either leave them alone, deactivate them, or move them to the cache,
     *  where moving them to the cache has the highest weighting.
     *  By forcing some pages into other queues we eventually force the
     *  system to balance the queues, potentially recovering other unrelated
     *  space from active.  The idea is to not force this to happen too
     *  often.
     */

The Linux comment is:

    /*
     * Application no longer needs these pages.  If the pages are dirty,
     * it's OK to just throw them away.  The app will be more careful about
     * data it wants to keep.  Be sure to free swap resources too.  The
     * zap_page_range call sets things up for refill_inactive to actually free
     * these pages later if no one else has touched them in the meantime,
     * although we could add these pages to a global reuse list for
     * refill_inactive to pick up before reclaiming other pages.
     *
     * NB: This interface discards data rather than pushes it out to swap,
     * as some implementations do.  This has performance implications for
     * applications like large transactional databases which want to discard
     * pages in anonymous maps after committing to backing store the data
     * that was kept in them.  There is no reason to write this data out to
     * the swap area if the application is discarding it.
     *
     * An interface that causes the system to free clean pages and flush
     * dirty pages is already available as msync(MS_INVALIDATE).
     */

It seems mmap is more for controlling the memory mapping of files rather
than controlling the cache itself.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
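
For illustration, the madvise() hints quoted in the message could be exercised roughly as follows. This is only a minimal sketch, assuming a POSIX-like system that provides mmap() and madvise() with MADV_SEQUENTIAL and MADV_DONTNEED; the helper name scan_and_release is hypothetical and is not PostgreSQL code. As the message points out, what the kernel actually does with the advice (demote, unmap, discard, or nothing at all) differs between platforms, so the hints are treated as best-effort and their return values are ignored.

```c
#define _DEFAULT_SOURCE   /* assumption: needed on glibc in strict modes to expose madvise() */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: sum every byte of a file through a read-only
 * mapping, then tell the kernel the pages are no longer needed.
 * Returns 0 on success and -1 on failure (including an empty file).
 */
int scan_and_release(const char *path, unsigned long *sum_out)
{
    assert(path != NULL && sum_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t) st.st_size;

    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping remains valid after close() */
    if (p == MAP_FAILED)
        return -1;

    /* Hint: pages will be referenced once, in order. Failure is not fatal. */
    (void) madvise(p, len, MADV_SEQUENTIAL);

    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];

    /* Hint: finished with these pages. The effect is platform-specific. */
    (void) madvise(p, len, MADV_DONTNEED);

    munmap(p, len);
    *sum_out = sum;
    return 0;
}
```

Note that this only advises about pages mapped into the calling process, which matches the observation above that the interface is oriented toward memory mappings rather than toward the kernel's file cache as such.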