Thread: pg_dump bugs reported as pg_upgrade bugs
FYI, you might wonder why so many bugs reported on pg_upgrade eventually
are bugs in pg_dump. Well, of course, partly it is because pg_upgrade
relies on pg_dump, but a bigger issue is that pg_upgrade will fail if
pg_dump or its restoration generates _any_ errors. My guess is that many
people are using pg_dump and restore and just ignoring errors or fixing
them later, while this is not possible when using pg_upgrade.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws. They make you human, rather than perfect,
  which you will never be.

Bruce Momjian <bruce@momjian.us> writes:
> FYI, you might wonder why so many bugs reported on pg_upgrade eventually
> are bugs in pg_dump. Well, of course, partly it is because pg_upgrade
> relies on pg_dump, but a bigger issue is that pg_upgrade will fail if
> pg_dump or its restoration generates _any_ errors. My guess is that many
> people are using pg_dump and restore and just ignoring errors or fixing
> them later, while this is not possible when using pg_upgrade.

pg_dump scripts are *designed* to be tolerant of errors, mainly so
that you can restore into a situation that's not exactly like where
you dumped from, with the possible need to resolve errors or decide
that they're not problems. So your depiction of what happens in
dump/restore is not showing a problem; it's about using those tools
as they were intended to be used.

Indeed, there's a bit of disconnect there with pg_upgrade, which would
like to present a zero-user-involvement, nothing-to-see-here facade.

regards, tom lane

On Wed, Nov 30, 2022 at 12:22:57AM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > FYI, you might wonder why so many bugs reported on pg_upgrade eventually
> > are bugs in pg_dump. Well, of course, partly it is because pg_upgrade
> > relies on pg_dump, but a bigger issue is that pg_upgrade will fail if
> > pg_dump or its restoration generates _any_ errors. My guess is that many
> > people are using pg_dump and restore and just ignoring errors or fixing
> > them later, while this is not possible when using pg_upgrade.
>
> pg_dump scripts are *designed* to be tolerant of errors, mainly so
> that you can restore into a situation that's not exactly like where
> you dumped from, with the possible need to resolve errors or decide
> that they're not problems. So your depiction of what happens in
> dump/restore is not showing a problem; it's about using those tools
> as they were intended to be used.
>
> Indeed, there's a bit of disconnect there with pg_upgrade, which would
> like to present a zero-user-involvement, nothing-to-see-here facade.

Agreed, a disconnect, plus if it is a table or index restore that fails,
pg_upgrade would fail later because there would be no system catalogs to
move the data into.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws. They make you human, rather than perfect,
  which you will never be.

On Wed, Nov 30, 2022 at 12:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> pg_dump scripts are *designed* to be tolerant of errors, mainly so
> that you can restore into a situation that's not exactly like where
> you dumped from, with the possible need to resolve errors or decide
> that they're not problems. So your depiction of what happens in
> dump/restore is not showing a problem; it's about using those tools
> as they were intended to be used.
>
> Indeed, there's a bit of disconnect there with pg_upgrade, which would
> like to present a zero-user-involvement, nothing-to-see-here facade.

Yes. I think it's good that the pg_dump scripts are designed to be
tolerant of errors, but I also think that we've got to clearly
envisage that error-tolerance as the backup plan. Fifteen years ago,
it may have been acceptable to imagine that every dump-and-restore was
going to be performed by a human being who could make an intelligent
judgement about whether the errors that occurred were concerning or
not, but today, at the scale that PostgreSQL is being used, that's not
realistic. Unattended operation is common, and the number of instances
vastly outstrips the number of people who are truly knowledgeable
about the internals. The goalposts have moved because the project is
successful and widely adopted.

All of this is true even apart from pg_upgrade, but the existence of
pg_upgrade and the fact that pg_upgrade is the only way to perform a
quick major version upgrade exacerbates the problem quite a bit.

I don't know what consequences this has concretely, really. I have no
specific change to propose. I just think that we need to wrench
ourselves out of a mind-set where we imagine that some errors are OK
because the DBA will know how to fix things up. The DBA is a script.
If there's a knowledgeable person at all they have 10,000 instances to
look after and don't have time to fiddle with each one. The aspects of
PostgreSQL that tend to require manual fiddling (HA, backups,
upgrades, autovacuum) are huge barriers to wider adoption and
large-scale deployment in a way that probably just wasn't true when
the project wasn't as successful as it now is.

--
Robert Haas
EDB: http://www.enterprisedb.com