Re: [GENERAL] Hardware optimising - Mailing list pgsql-general
From | Aaron J. Seigo |
---|---|
Subject | Re: [GENERAL] Hardware optimising |
Date | |
Msg-id | 37C5CB32.8E379739@gtv.ca Whole thread Raw |
In response to | Hardware optimising (Michael <grim@argh.demon.co.uk>) |
Responses |
Re: [GENERAL] Hardware optimising
|
List | pgsql-general |
hi... > 128 MB 100 MHz SDRAM > AMD K6-2/300 CPU > 10 GB 7200RPM 9.0ms IBM IDE HDD > > It will, over the next few months, as money becomes available, be upgraded to: > > 256 MB 100 MHz SDRAM > Dual Athlon 500 CPUs > 10 GB UltraII Wide SCSI drive > > The database will contain several million records and needs to be able to do > very fast selects from tables with a lot of rows, and do small updates and > inserts onto these tables at a good speed also. seems there are a lot of opinions on this one floating around, and not alot of explanations to go along with them. =) in my experience, it really depends on what you are doing. if you are going to be doing the same selects alot, then extra RAM will help a lot. just be sure to set the buffers when starting postmaster to be relatively high (i.e. a few thousand). also, since you are doing few inserts, i'd turn off f-sync for greater speed and less disk access (which with your current IDE will be expensive (time wise) at best). this will run the risk of losing inserts if the machine crashes, but if you are using a stable OS (i.e. not NT) then you'll probably be just fine if the inserts are few compared to selects. also, if you are going to be doing a lot of pre- and post-processing of the data (i.e. grabbing bits of data based on a algorythm (sp) or getting bits of data and massaging them about a lot (i.e. creating graphs, lay-out, doing analysis, etc)) then RAM will also see a boost as you will be able to do these in memory, allowing the database the disk more to itself... the SCSI drive will see an increase in speed to be sure! in fact, i'd suggest giving the database the drive all to itself for data... leave everything else on the IDE drive (OS, database engine, etc) and format the SCSI drive with large i-node blocks (i.e. 1MB) and just let the data reside on the SCSI disk. besides gaining the speed of the disk you'll also allow the rest of the system to stay the hell out of the way of that disk intensive database! =) you'll probably see a tremendous increase in speed doing it this way (large inodes, only database data) than if you just simply replace the IDE with the SCSI drive... of course, as time gos on, if you use a mirroring raid array by adding another disk, you'll see even more speed increase. other RAIDs, while preserving your data, will result in slow downs.. though it will still be faster than an IDE drive with everything on it. RAID 5 is cheaper (more out of your disk space) but will be a bit slower than a single disk system or a RAID 5... but RAID 5 is a nice way to go... however, if you do go RAID, DO NOT use software RAID. why? well... it negates some of the fail-safe power of the RAID (although if well set up you can render this moot) but more importantly it will drag significantly on your processors (~10% or so is common) as for the processor, this will see an increase, of course. note, however, that since PostgreSQL is _not_ multithreaded, that it will run only on one of the processors. (i'm about to assume you are using linux here... 'scuse me if i'm wrong) however, the good news is that you can encourage linux (through the scheduler) to run postgres on one of the processors and everything else on the other one. this should give the database its own processor more oft than not. things may still drift, etc... but it will be better this way.... the processor boost will be important, again, if you are doing lots of pre/post-processing of data. it will also see an improvement if you are offering other services (i.e. WWW) on the machine (which i'm guessing you are). this will require a kernel recompile and some muckin' about to get it all running as quickly/smoothly/efficiently as possible. this is another side of things to look at: RAM is quick and instant. power down, slap in some more ram, power up. fast down time. DISK upgrade will take more time. i.e. formatting; setting it in your FSTAB, etc; changing your start up scripts to tell postgres where the data is now; copying things to the new disk, etc... this will result in some fairly good down time. the installation can be done quickly (if well thought out, i.e. pre-format the drive, etc...) and the rest can be done while online. although you'll want to shut the database down while copying data files. the longer you wait on this one, the longer your down time will be (more data to copy, etc...) CPU upgrade will require downtime to install (not nearly as fast or easy as RAM).. then kernel recompiling.. then testing of the new kernel... then tweaking the system. probably resulting in even more down time than with the disk upgrade. i'm guessing that while the system is new, you'll probably be more agreeable to longer downtimes. so perhaps the disk upgrade would be better earlier on in that it will probably give you the best improvement while absorbing down time impact early in on the venture when it might not be noticed so much (i'm guessing here again as to the nature of your usage... assuming the demands on the system will start out smallish and grow as time gos on...) second, i'd do the RAM upgrade. cheap, fast, good improvements. third, i'd tackle the CPUs (tricky, not as cheap, fair amount of downtime...) but that's just me. and those who know me know that i'm often off to one side of the field. usually looking at the clouds, in fact. =) as a bit of last advice, SysAdmin magazine, numerous O'Reilly books, and many online FAQs and HOW-TOs give some really good advice on these sorts of issues. these aren't really database related as much as they are systems administration questions and are applicable to most high-demand services. get the books/mags, read incessently, keep up to date on what's what, etc, etc and your new service can only stand to benefit. Aaron Seigo somebody who types a lot during the course of a day.
pgsql-general by date: