Re: PG engine takeover or switch over - Mailing list pgsql-admin
From: Chander Ganesan
Subject: Re: PG engine takeover or switch over
Msg-id: 46C455DF.7080202@otg-nc.com
In response to: Re: PG engine takeover or switch over (Hannes Dorbath <light@theendofthetunnel.de>)
Responses: Re: PG engine takeover or switch over
List: pgsql-admin
Hannes Dorbath wrote:
> On 15.08.2007 21:30, Medi Montaseri wrote:
>> I am looking for some suggestions to optimize the following problem/solution.
>>
>> Given two nodes A and B (two computers) in an active-passive mode, where A is running the PG engine and B is simply standing by, with common storage (twin-tailed) (or, from pg_ctl's point of view, -D /common/data), I am looking for a faster solution during the takeover where A has crashed and B is to start the PG engine and run with it.
>>
>> My current solution is to start the PG engine, which should take little time to study the configuration files and /common/data and fork a few children. But I am still interested in optimizing this start-up cost.
>>
>> For example, would I gain anything by starting the PG engine on both A and B, but on B pointing it to /common/dummy, and during the takeover somehow telling it to now read from /common/data? For example, having two postmaster.conf files or PGDATA settings and then using pg_ctl reload.
>
> Starting up PostgreSQL should be very fast, given no recovery to be done and decent hardware.
>
> PostgreSQL does not fork a lot unless it is accepting new connections, and if reading a config file is slow on your system, something else is broken.
>
> In an active/passive setup you should be able to switch over in under 3 seconds. If there was a lot of load on the failed node, the recovery times on the new active node increase. The only thing you can do about that is getting faster disks.

Your startup time in the event of a failure will depend on the number of WAL files that need to be replayed in order to perform automatic recovery. For example, if you've set checkpoint_segments to some high number, PostgreSQL will need to replay those WAL files to ensure that transactions that were committed to disk are correctly represented in your "on disk" object data (tables, indexes, etc.).
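For illustration, the trade-off lives in a couple of postgresql.conf settings (checkpoint_segments is the 8.x-era knob; the values below are hypothetical and should be tuned for your workload):

```
# postgresql.conf -- sketch only, hypothetical values
checkpoint_segments = 3      # fewer WAL segments between checkpoints:
                             # shorter crash recovery on failover, but more
                             # frequent checkpoint I/O on the active node
checkpoint_timeout = 5min    # also force a checkpoint at least this often
```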
Note that in the case of a "graceful" shutdown, PostgreSQL doesn't need to replay WAL files (since it checkpoints prior to shutting down), and as such the startup time is pretty fast. You can decrease the amount of time it takes to recover by decreasing the number of segments per checkpoint; however, this may result in more frequent checkpointing and, as a result, a reduction in overall performance on your active node. Basically, you're balancing the "cost" of automatic crash recovery against the frequency of checkpointing.

An alternative (and perhaps faster) method of failover would be to run a cluster in "warm standby" mode (this wouldn't rely on a shared pgdata directory). In that case you would only have to wait for the last few WAL files to be replayed in order to recover. With a high checkpoint_segments setting this would allow you to recover quickly from failures, but it would introduce PITR overhead (copying WAL files as they become full). That's a pretty small cost, though, and one you may already be paying if you do PITR backups.

There is no way to have PostgreSQL "switch" data directories to speed up startup.

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC 27560
919-463-0999/866-229-3386
http://www.otg-nc.com
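P.S. A warm-standby setup of the sort described above might look roughly like the sketch below. Host names, paths, and the restore script are all hypothetical; on 8.2-era releases the waiting restore script is typically hand-rolled rather than a stock tool.

```
# On the primary (postgresql.conf) -- ship each completed WAL segment
# to the standby's archive directory:
archive_command = 'rsync %p standby:/wal_archive/%f'

# On the standby (recovery.conf) -- replay segments as they arrive,
# using a script that waits for each file to appear:
restore_command = '/usr/local/bin/wait_for_wal.sh /wal_archive %f %p'

# Failover: stop the primary, have the restore script notice a trigger
# file and return failure, at which point recovery ends and the standby
# comes up read-write after replaying only the last few WAL files.
```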