Re: Allow replication roles to use file access functions - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: Allow replication roles to use file access functions |
Date | |
Msg-id | CAB7nPqRudM6GScUj3_Kvxev+JOPTKtG_k1d316=mzMed_+cMeA@mail.gmail.com Whole thread Raw |
In response to | Re: Allow replication roles to use file access functions (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: Allow replication roles to use file access functions
|
List | pgsql-hackers |
On Thu, Sep 3, 2015 at 10:05 AM, Stephen Frost wrote: > * Michael Paquier (michael.paquier@gmail.com) wrote: >> On Thu, Sep 3, 2015 at 4:52 AM, Alvaro Herrera wrote: > The replication role already has read-only access to everything > (nearly?) in the PGDATA directory. The specific issue in this case, I > believe, is if the functions mentioned provide access beyond what the > replication role already has access to. It seems to me that this difference of files is symbolized by what is filtered out in a base backup, like postmaster.opts or postmaster.pid. The matter here is: are we going to have in the future files that we do not want to include in a base backup that are superuser-only? Today I cannot recall we have any of that, we may have in the future though. >> I don't entirely understand why pg_rewind should be invoking any of these >> functions to begin with. Isn't it working directly with the disk data, >> with both postmasters shut down? > > Certain information is needed from the new master to rewind the old > master. What I would hope is that we'd provide a way for *just* that > information to be made available to the replication role rather than > completely general access through the functions mentioned. The rewound node does not only need the diffs of blocks from a source, but also some more information like clog. There are actually three different approaches to solve that: 1) Use a differential backup to me, or the possibility to fetch a set of data block diffs from a source node using an LSN and then re-apply them on the target node. The major disadvantage of this approach is actually performance: we would need to scan the source node completely, so that's slow. And pg_rewind is fast because it only scans the WAL records of the node to be rewound from the last checkpoint before WAL forked. 2) Request data blocks from the source node using the replication protocol, that's what pg_read_binary_file actually does, we just don't have the logic on replication protocol side, though we could. 3) Create a new set of functions similar to the existing file access functions that are usable by the replication user, except that they refuse to return file entries that match the existing filters in basebackup.c. That is doable, with a routine in basebackup.c that decides if a given file string can be read or not. Base backups could use this routine as well. I just came up with 3), which looks rather appealing to me. >> > Another problem with it is that access to the filesystem is about halfway >> > to being superuser anyway, even if it's only read-only access. It would >> > certainly let you read things that one would not expect a replication >> > user to be able to read (like files unrelated to Postgres). >> >> Doesn't pg_read_file et al insist that the path is below the data >> directorY? > > I don't believe that's the case, actually.. I might be wrong though, > I'm not looking at the code right now. The use of PGDATA is enforced =# select pg_read_file('/etc/passwd'); ERROR: 42501: absolute path not allowed LOCATION: convert_and_check_filename, genfile.c:73 If PGDATA includes soft links it is possible to point to those places though, but that's not something anybody sane would do, still it can be done and I've seen worse: $ cd $PGDATA && ln -s /etc etc $ psql -c 'select pg_read_file('etc/passwd')' # Adult content, not an ERROR From this perspective allowing a replication user to have access to those functions is dangerous, a superuser being the equivalent of an all-mightly god on a server, still that's not the case of the replication user. > This is definitely a big part of > the question, but I'd like to ask- what, exactly, does pg_rewind > actually need access to? Is it only the WAL, or are heap and WAL files > needed? Not only, +clog, configuration files, etc. > Consider the discussion about delta backups, et al, using LSNs. Perhaps > the replication protocol should be extended to allow access to arbitrary > WAL, querying what WAL is available, and access to the heap files, > perhaps even specific pages in the heap files and relation forks, > instead of giving pg_rewind access to these extremely general > nearly-OS-user-level functions. The problem when using differential backups in this case is performance as mentioned above. We would need to scan the whole target cluster, which may take time, the current approach of pg_rewind only needs to scan WAL records to find the list of blocks modified, and directly requests them from the source. I would expect pg_rewind to be as quick as possible. > Further, for my two cents, I view any access to the system by a > replication role to a non-replication-protocol communication with the > server as being inappropriate. To that point, I'd actually remove the > ability for the replication roles to call pg_start/stop_backup as normal > SQL calls since that implies the replication user has connected to a > normal backend instead of talking the replication protocol. I'm not > sure if we can really put that cat back in the bag, but I'd be a lot > happier if we could. Allowing access to these functions outside of the > replication protocol doubles-down on that bad idea, in my view. That would break many existing applications, but I definitely see the rightful concern here: basically replication roles should just use the replication protocol and have no access to directly to SQL. -- Michael
pgsql-hackers by date: