Re: pg_rewind failure by file deletion in source server - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: pg_rewind failure by file deletion in source server |
Date | |
Msg-id | CAHGQGwGTsV6-txUo3rd7GaLAHn9C0N2WBK9g9J5zJ11_JJV9Hg@mail.gmail.com |
In response to | Re: pg_rewind failure by file deletion in source server (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: pg_rewind failure by file deletion in source server |
List | pgsql-hackers |
On Tue, Jun 23, 2015 at 11:21 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 06/23/2015 05:03 PM, Fujii Masao wrote:
>> On Tue, Jun 23, 2015 at 9:19 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> On 06/23/2015 07:51 AM, Michael Paquier wrote:
>>>> So... Attached are a set of patches dedicated at fixing this issue:
>>>
>>> Thanks for working on this!
>>>
>>>> - 0001, add if_not_exists to pg_tablespace_location, returning NULL if
>>>> path does not exist
>>>> - 0002, same with pg_stat_file, returning NULL if file does not exist
>>>> - 0003, same with pg_read_*file. I added them to all the existing
>>>> functions for consistency.
>>>> - 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs
>>>> (thanks Robert for the naming!)
>>>> - 0005, as things get complex, a set of regression tests aimed to
>>>> covering those things. pg_tablespace_location is platform-dependent,
>>>> so there are no tests for it.
>>>> - 0006, the fix for pg_rewind, using what has been implemented before.
>>>
>>> With these patches, pg_read_file() will return NULL for any failure to
>>> open the file, which makes pg_rewind assume that the file doesn't exist
>>> in the source server, and will remove the file from the destination.
>>> That's dangerous, those functions should check specifically for ENOENT.
>>
>> I'm wondering if using pg_read_file() to copy the file from source server
>> is reasonable. ISTM that it has two problems as follows.
>>
>> 1. It cannot read very large file like 1GB file. So if such large file was
>> created in source server after failover, pg_rewind would not be able
>> to copy the file. No?
>
> pg_read_binary_file() handles large files just fine. It cannot return more
> than 1GB in one call, but you can call it several times and retrieve the
> file in chunks. That's what pg_rewind does, except for reading the control
> file, which is known to be small.

Yeah, you're right.
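To make the two points above concrete (retrieving a file in several bounded calls, and treating only ENOENT as "file does not exist"), here is a minimal Python sketch. It is not pg_rewind's code: `read_chunk` is a hypothetical local stand-in for one `pg_read_binary_file(path, offset, length)` call, and `fetch_file` and `CHUNK_SIZE` are names invented for illustration.

```python
import errno

CHUNK_SIZE = 1024 * 1024  # hypothetical chunk size, far below the 1GB per-call limit


def read_chunk(path, offset, length):
    """Hypothetical stand-in for one pg_read_binary_file(path, offset, length) call.

    Returns None only when the file does not exist (ENOENT). Any other
    failure to open or read the file is raised to the caller.
    """
    try:
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(length)
    except OSError as e:
        if e.errno == errno.ENOENT:
            return None
        raise


def fetch_file(path, chunk_size=CHUNK_SIZE):
    """Retrieve a whole file in chunks; None means it is missing in the source."""
    parts = []
    offset = 0
    while True:
        chunk = read_chunk(path, offset, chunk_size)
        if chunk is None:
            # The file never existed or was removed mid-transfer.
            return None
        parts.append(chunk)
        offset += len(chunk)
        if len(chunk) < chunk_size:
            # A short (or empty) read means the end of the file was reached.
            return b"".join(parts)
```

In this sketch the caller may remove the destination copy only when `fetch_file` returns None. A permission error or any other failure propagates as an exception, so it cannot be mistaken for a deleted file.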
I found that pg_rewind creates a temporary table to fetch the file in chunks.
This would prevent pg_rewind from using the *hot standby* server as a source
server at all (a standby is read-only, so a temporary table cannot be created
there). This is of course a limitation of pg_rewind, but we might want to
alleviate it in the future.

>> 2. Many users may not allow a remote client to connect to the
>> PostgreSQL server as a superuser for some security reasons. IOW,
>> there would be no entry in pg_hba.conf for such connection.
>> In this case, pg_rewind always fails because pg_read_file() needs
>> superuser privilege. No?
>>
>> I'm tempted to implement the replication command version of
>> pg_read_file(). That is, it reads and sends the data like BASE_BACKUP
>> replication command does...
>
> Yeah, that would definitely be nice. Peter suggested it back in January
> (http://www.postgresql.org/message-id/54AC4801.7050300@gmx.net). I think
> it's way too late to do that for 9.5, however. I'm particularly worried that
> if we design the required API in a rush, we're not going to get it right,
> and will have to change it again soon. That might be difficult in a minor
> release. Using pg_read_file() and friends is quite flexible, even though we
> just find out that they're not quite flexible enough right now (the ENOENT
> problem).

I agree that it's too late to do what I said...

But just using pg_read_file() cannot address the #2 problem that I pointed
out in my previous email. Also, requiring superuser privilege for pg_rewind
really conflicts with the motivation for adding the replication privilege.

So should we change pg_read_file() so that even a replication user can read
the file? Or should a replication-user version of pg_read_file() be
implemented?

Regards,

--
Fujii Masao