Re: The danger of deleting backup_label - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: The danger of deleting backup_label |
Date | |
Msg-id | CA+TgmoZJ+WcTsPxzvFbTbV4M7wiyjftL0tyfZOmimeQ0x4V2=g@mail.gmail.com Whole thread Raw |
In response to | Re: The danger of deleting backup_label (David Steele <david@pgmasters.net>) |
Responses |
Re: The danger of deleting backup_label
|
List | pgsql-hackers |
On Tue, Oct 17, 2023 at 4:17 PM David Steele <david@pgmasters.net> wrote: > Given that the above can't be back patched, I'm thinking we don't need > backup_label at all going forward. We just write the values we need for > recovery into pg_control and return *that* from pg_backup_stop() and > tell the user to store it with their backup. We already have "These > files are vital to the backup working and must be written byte for byte > without modification, which may require opening the file in binary > mode." in the documentation so dealing with pg_control should not be a > problem. pg_control also has a CRC so we will know if it gets munged. Yeah, I was thinking about this kind of idea, too. I think it might be a good idea, although I'm not completely certain about that, either. On the positive side, you can't remove backup_label in error if backup_label is not a thing. You certainly can't remove the control file. You can, however, use the original control file instead of the one that you were supposed to use. However, that is not really any different from failing to write the backup_label into the backup directory, which you can already do today. Also, it adds very little net complexity to the low-level backup procedure. Instead of needing to write the backup_label into the backup directory, you write the control file -- but that's instead, not in addition. So overall it seems like the complexity is similar to what we have today but one possible mistake is eliminated. Also on the positive side, I suspect we could remove a decent amount of code for dealing with backup_label files. We wouldn't have to read them any more (and the code to do that is pretty rough-and-ready) and we wouldn't have to do different things based on whether the backup_label exists or not. The logic in xlog*.c is extremely complicated, and everything that we can do to reduce the number of cases that need to be considered is not just good, but great. But there are also some negatives. First, anything that is stored in the backup_label but not the control file has to (a) move into the control file, (b) be stored someplace else, or (c) be eliminated as a concept. We're likely to get complaints about (a), especially if the data in question is anything big. Any proposal to do (b) risks undermining the whole theory under which this is a good proposal, namely that removing backup_label gives us one less thing to worry about. So that brings us to (c). Personally, I would lose very little sleep if the LABEL field died and never came back, and I wouldn't miss START TIME and STOP TIME either, but somebody else might well feel differently. I don't think it's trivial to get rid of BACKUP METHOD, as there unfortunately seems to be code that depends on knowing the difference between BACKUP FROM: streamed and BACKUP FROM: pg_rewind. I suspect that BACKUP FROM: primary/standby might have the same issue, but I'm not sure. STOP TIMELINE could be a problem too. I think that if somebody could do some rejiggering to eliminate some of the differences between the cases here, that could be really good general cleanup irrespective of what we decide about this proposal, and moving some things in to pg_control is probably reasonable too. For instance, it would seem crazy to me to argue that storing the backup end location in the control file is OK, but storing the backup end TLI there would not be OK. But the point here is that there's probably a good deal of careful thinking that would need to be done here about exactly where all of the stuff that currently exists in the backup_label file but not in pg_control needs to end up. Second, right now, the stuff that we return at the end of a backup is all text data. With this proposal, it becomes binary data. I entirely realize that people should only be doing these kinds of backups using automated tools that that those automated tools should be perfectly capable of handling binary data without garbling anything. But that's about as realistic as supposing that people won't instantly remove the backup_label file the moment it seems like it will solve some problem, even when the directions clearly state that this should only be done in some other situation that is not the one the user is facing. It just occurred to me that one thing we could do to improve the user experience here is offer some kind of command-line utility to assist with taking a low-level backup. This could be done even if we don't proceed with this proposal e.g. pg_llbackup -d $CONNTR --backup-label=PATH --tablespace-map=PATH --copy-data-directory=SHELLCOMMAND I don't know for sure how much that would help, but I wonder if it might actually help quite a bit, because right now people do things like use psql in a shell script to try to juggle a database connection and then in some other part of the shell script do the data copying. But it is quite easy to screw up the error handling or the psql session lifetime or something like that, and this would maybe provide a nicer interface. Details likely need a good deal of kibitizing. There might be other problems, too. This is just what occurs to me off the top of my head. But I think it's an interesting angle to explore further. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: