Re: Streaming replication status - Mailing list pgsql-hackers
From | Heikki Linnakangas
---|---
Subject | Re: Streaming replication status
Date |
Msg-id | 4B482F6B.80900@enterprisedb.com
In response to | Re: Streaming replication status (Fujii Masao <masao.fujii@gmail.com>)
Responses | Re: Streaming replication status
List | pgsql-hackers
Fujii Masao wrote:
> On Sat, Jan 9, 2010 at 6:16 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> * If there's no WAL to send, walsender doesn't notice if the client has
>> closed connection already. This is the issue Fujii reported already.
>> We'll need to add a select() call to the walsender main loop to check if
>> the socket has been closed.
>
> We should reactivate pq_wait() and secure_poll()?

I don't think we need all that, a simple select() should be enough. Though
I must admit I'm not very familiar with select/poll().
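For illustration, a minimal sketch of the kind of zero-timeout select() check discussed here: noticing from the walsender side that the client has gone away while there is no WAL to send. This is not the actual walsender code; the helper name and call site are assumptions.

```c
#include <stdbool.h>
#include <sys/select.h>
#include <sys/socket.h>

/*
 * Return true if the peer has closed its end of the connection on 'sock'.
 * A zero-timeout select() tells us whether the socket is readable; if it
 * is readable but a peeked recv() returns 0, the client has disconnected.
 * Hypothetical helper, not PostgreSQL code.
 */
static bool
client_has_disconnected(int sock)
{
	fd_set		readfds;
	struct timeval timeout = {0, 0};	/* poll, don't block */
	char		buf;

	FD_ZERO(&readfds);
	FD_SET(sock, &readfds);

	if (select(sock + 1, &readfds, NULL, NULL, &timeout) > 0 &&
		FD_ISSET(sock, &readfds))
	{
		/* Peek so we don't consume any data the client actually sent. */
		if (recv(sock, &buf, 1, MSG_PEEK) == 0)
			return true;		/* orderly shutdown by the peer */
	}
	return false;
}
```

The walsender main loop could call something like this once per sleep cycle when it has nothing to send, and exit cleanly instead of lingering until the next WAL record arrives.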
>> * We still have a related issue, though: if the standby is configured to
>> archive to the same location as the master (as it always is on my laptop,
>> where I use the postgresql.conf of the master unmodified on the standby),
>> right after failover the standby server will try to archive all the old
>> WAL files that were streamed from the master; but they exist already in
>> the archive, as the master archived them already. I'm not sure if this is
>> a pilot error, or if we should do something in the server to tell apart
>> WAL segments streamed from the master and those generated in the standby
>> server after failover. Maybe we should immediately create a .done file
>> for every file received from the master?
>
> There is no guarantee that such a file has already been archived by the
> master. This is just an idea, but a new WAL record indicating the
> completion of the archiving would be useful for the standby to create the
> .done file. But this idea might kill the "archiving during recovery" idea
> discussed above.
>
> Personally, I'm OK with that issue because we can avoid it by tweaking
> archive_command. Could we revisit this issue together with the "archiving
> during recovery" discussion later?

Ok. The workaround is to configure the standby to archive to a different
location. If you need to restore from that, you'll need to stitch together
the logs from the old master and the new one.

>> * A standby that connects to the master, initiates streaming, and then
>> sits idle, stalls recycling of old WAL files in the master. That will
>> eventually lead to a full disk in the master. Do we need some kind of an
>> emergency valve on that?
>
> I think that we need a GUC parameter to specify the maximum number of log
> file segments held in the pg_xlog directory to send to the standby server.
> Replication to a standby which falls more than that GUC value behind is
> just terminated.
> http://archives.postgresql.org/pgsql-hackers/2009-12/msg01901.php

Oh yes, sounds good.

>> * Do we really need to split the sleep in walsender into
>> NAPTIME_PER_CYCLE increments?
>
> Yes. It's required for some platforms (probably HP-UX) on which signals
> cannot interrupt the sleep.

I'm thinking that wal_sender_delay is so small that maybe it's not worth
worrying about.

>> * Walreceiver should flush less aggressively than after each received
>> piece of WAL, as noted by the XXX comment:
>>
>> * XXX: Flushing after each received message is overly aggressive. Should
>> * implement some sort of lazy flushing. Perhaps check in the main loop
>> * if there's any more messages before blocking and waiting for one, and
>> * flush the WAL if there isn't, just before blocking.
>
> In this approach, if messages continuously arrive from the master, the
> fsync would be delayed until the WAL segment is switched. Likewise,
> recovery would also be delayed, which seems to be a problem.

That seems OK to me. If messages are really coming in that fast, fsyncing
the whole WAL segment at a time is probably most efficient. But if that
really is too much, you could still do extra flushes within XLogRecv()
every few megabytes, for example.
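To make that concrete, here is a rough sketch of the kind of lazy flushing being described: flush when no further message is already pending, and in any case after every few megabytes so recovery never lags too far behind a busy master. The function name, its caller, and the 4 MB threshold are illustrative assumptions, not part of the patch or of walreceiver's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLUSH_THRESHOLD (4 * 1024 * 1024)	/* illustrative: 4 MB */

static uint64_t bytes_since_flush = 0;

/*
 * Decide whether the WAL written so far should be fsync'd now.
 * 'more_pending' reports whether another message is already waiting to
 * be read; 'just_written' is the size of the chunk just written to disk.
 * Flush when the stream goes quiet, or after a few megabytes at most.
 */
static bool
time_to_flush(bool more_pending, size_t just_written)
{
	bytes_since_flush += just_written;

	if (!more_pending || bytes_since_flush >= FLUSH_THRESHOLD)
	{
		bytes_since_flush = 0;
		return true;
	}
	return false;
}
```

The receive loop would fsync whenever this returns true: the threshold bounds how far replay can lag during a sustained burst, while the quiet-stream check keeps latency low under light traffic.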
--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com