Re: [HACKERS] Bug with pg_basebackup and 'shared' tablespace - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: [HACKERS] Bug with pg_basebackup and 'shared' tablespace |
Date | |
Msg-id | 20170929.141917.20719591.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: [HACKERS] Bug with pg_basebackup and 'shared' tablespace (Pierre Ducroquet <pierre.ducroquet@people-doc.com>) |
Responses |
Re: [HACKERS] Bug with pg_basebackup and 'shared' tablespace
|
List | pgsql-hackers |
Hi, At Tue, 19 Sep 2017 17:07:10 +0200, Pierre Ducroquet <pierre.ducroquet@people-doc.com> wrote in <1678633.OBaBNztJnJ@pierred-pdoc> > On Tuesday, September 19, 2017 12:52:37 PM CEST you wrote: > > On Thu, Sep 14, 2017 at 2:17 AM, Pierre Ducroquet <p.psql@pinaraf.info> > wrote: > > > All my apologies for the schockingly long time with no answer on this > > > topic. > > No problem. That's the concept called life I suppose. > > > > > I will do my best to help review some patches in the current CF. > > > > Thanks for the potential help! > > > > > Attached to this email is the new version of the patch, checked against > > > HEAD and REL_10_STABLE, with the small change required by Michael (affect > > > true/ false to the boolean instead of 1/0) applied. > > > > Thanks for the new version. > > > > So I have been pondering about this patch, and allowing multiple > > versions of Postgres to run in the same tablespace is mainly here for > > upgrade purposes, so I don't think that we should encourage such > > practices for performance reasons. There is as well > > --tablespace-mapping which is here to cover a similar need and we know > > that this solution works, at least people have spent time to make sure > > it does. > > > > Another problem that I have with this patch is that on HEAD, > > permission checks are done before receiving any data. I think that > > this is really saner to do any checks like that first, so as you can > > know if the tablespace is available for write access before writing > > any data to it. With this patch, you may finish by writing a bunch of > > data to a tablespace path, but then fail on another tablespace because > > of permission issues so the backup becomes useless after transferring > > and writing potentially hundreds of gigabytes. This is no fun to > > detect that potentially too late, and can impact the responsiveness of > > instances to get backups and restore from them. > > > > All in all, this patch gets a -1 from me in its current shape. If > > Horiguchi-san or yourself want to process further with this patch, of > > course feel free to do so, but I don't feel that we are actually > > trying to solve a new problem here. I am switching the patch to > > "waiting on author". Hmm. I recall that my opinion on the "problem" was that we should just prohibit sharing rather than allow it. However, if there's who wants it and the behavior is not so bizarre, I don't reject the direction. As long as I saw the patch, it just delays digging of the top directory until the CATVER directory becomes knwon without additional checks. The behavior is identical to what the current version does on tablespace directories so the pointed problems don't seem to be new ones caused by this patch and such situation seems a bit malicious. > Hi > > The point of this patch has never been to 'promote' sharing tablespaces > between PostgreSQL versions. This is not a documented feature, and it would be > imho a bad idea to promote it because of bugs like this one. That *was* a bug, but could be a not-a-bug after we define the behavior reasonably, and may be should be documented. > But that bug is a problem for people who are migrating between postgresql > releases one database at a time on the same server (maybe not a best practice, > but a real use case nevertheless). That's how I met this bug, and that's why I > wanted to patch it. I know there is a workaround, but to me it seems counter- > intuitive that with no warning I can create a postgresql cluster that can not > be restored without additional pg_basebackup settings. > If it is indeed forbidden to share a tablespace between postgresql releases, > why don't we enforce it ? Or at least show a warning when CREATE TABLESPACE > encounter a non-empty folder ? > > Regarding the permission check, I don't really see how this is a real problem > with the patch. I have to check on master, but it seems to me that the > permission check can still be done before starting writing data on disc. We > could just delay the 'empty' folder check, and keep checking the folder > permissions. > > And I will do the pgindent run, thanks for the nitpick :) So I'll wait for the new version, but I have a comment on the patch. It would practically work but I don't like the fact that the patch relies on the specific directory/file ordering in the tar stream. This is not about the CATVER directory but lower directories. Being more strict, it actually makes excessive calls to verify_dir_is_em..() for more lower directories contrarily to expectations. I think that we can take more more robust way to detect the CATVER directory. Just checking if it is a top-level directory will work. That is, making the following change. - if (firstfile && !basetablespace) + /* copybuf must contain at least one '/' here */ + if (!basetablespace && strchr(copybuf, '/')[1] == 0) This condition exactly hits only CATVER directories not being disturbed by file ordering of the tar stream. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: