Re: Base Backup Streaming - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Base Backup Streaming |
Date | |
Msg-id | 4D20AB51.7020400@enterprisedb.com |
In response to | Base Backup Streaming (was: Sync Rep Design) (Dimitri Fontaine <dimitri@2ndQuadrant.fr>) |
Responses | Re: Base Backup Streaming |
List | pgsql-hackers |
On 02.01.2011 14:47, Dimitri Fontaine wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:
>> BTW, there's a bunch of replication related stuff that we should work to
>> close, that are IMHO more important than synchronous replication. Like
>> making the standby follow timeline changes, to make failovers smoother, and
>> the facility to stream a base-backup over the wire. I wish someone worked on
>> those...
>
> So, we've been talking about base backup streaming at conferences and we
> have a working prototype. We even have a needed piece of it in core
> now, that's the pg_read_binary_file() function. What we still miss is
> an overall design and some integration effort. Let's design first.

We even have a rudimentary patch to add the required backend support:

http://archives.postgresql.org/message-id/4C80D9B8.2020301@enterprisedb.com

That just needs to be polished into shape and documented.

> I propose the following new pg_ctl command to initiate the cloning:
>
> pg_ctl clone [-D datadir] [-s on|off] [-t filename] "primary_conninfo"
>
> As far as users are concerned, that would be the only novelty. Once that
> command is finished (successfully) they would edit postgresql.conf and
> start the service as usual. A basic recovery.conf file is created with
> the given options, standby_mode is driven by -s and defaults to off, and
> trigger_file defaults to being omitted and is given by -t. Of course
> the primary_conninfo given on the command line is what ends up in the
> recovery.conf file.
>
> That alone would allow for making base backups for recovery purposes and
> for preparing standbys.

+1. Or maybe it would be better to make it a separate binary, rather than
part of pg_ctl.

> To support this new tool, the simplest would be to just copy what
> I've been doing in the prototype, that is run a query to get the primary
> file listing (per tablespace, not done in the prototype) then get their
> bytea content over the wire. That means there's no further backend
> support code to write.

It would be so much nicer to have something more integrated, like the
patch I linked above. Running queries requires connecting to a real
database, which means that the user needs to have privileges to do that
and you need to know the name of a valid database. Ideally this would
all work through a replication connection. I think we should go with
that from day one.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
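
To make the quoted proposal concrete, here is a minimal sketch in Python of how a cloning tool might write the basic recovery.conf from the -s, -t and primary_conninfo options. It is an illustration only: the function names build_recovery_conf and quote_conf_value are hypothetical and do not come from any patch in this thread. The defaults follow the proposal (standby_mode off, trigger_file omitted unless given), and values are single-quoted with embedded quotes doubled.

```python
from typing import Optional


def quote_conf_value(value: str) -> str:
    """Single-quote a value, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_recovery_conf(
    primary_conninfo: str,
    standby_mode: bool = False,          # proposal: -s, defaults to off
    trigger_file: Optional[str] = None,  # proposal: -t, omitted by default
) -> str:
    """Return the text of a basic recovery.conf (hypothetical helper)."""
    lines = [
        f"standby_mode = {quote_conf_value('on' if standby_mode else 'off')}",
        f"primary_conninfo = {quote_conf_value(primary_conninfo)}",
    ]
    # Only emit trigger_file when the user asked for one.
    if trigger_file is not None:
        lines.append(f"trigger_file = {quote_conf_value(trigger_file)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed example values; the host name and path are placeholders.
    print(build_recovery_conf("host=primary port=5432",
                              standby_mode=True,
                              trigger_file="/tmp/trigger"))
```

The sketch covers only the file-generation step; how the data files themselves are transferred (queries versus a replication connection) is the open design question discussed above.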