Re: backup manifests - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: backup manifests
Msg-id: 20200330010740.4remdjgnftyuiz2v@alap3.anarazel.de
In response to: Re: backup manifests (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Hi,

On 2020-03-29 20:42:35 -0400, Robert Haas wrote:
> > What do you think of having the verification process also call pg_waldump to
> > validate the WAL CRCs (shown upthread)? That looked helpful and simple.
>
> I don't love calls to external binaries, but I think the thing that
> really bothers me is that pg_waldump is practically bound to terminate
> with an error, because the last WAL segment will end with a partial
> record.

I don't think that's the case here. You should know the last required
record, which should allow you to specify the precise end for pg_waldump.
If it errors out before reading to that point, we'd be in trouble.

> For the same reason, I think there's really no such thing as
> validating a single WAL file. I suppose you'd need to know the exact
> start and end locations for a minimal WAL replay and check that all
> records between those LSNs appear OK, ignoring any apparent problems
> after the minimum ending point, or at least ignoring any problems due
> to an incomplete record in the last file. We don't have a tool for
> that currently, and I don't think I can write one this week. Or at
> least, not a good one.

pg_waldump -s / -e?

> > > + parse->pathname = palloc(raw_length + 1);
> >
> > I don't see this freed anywhere; is it? (It's useful to make peak memory
> > consumption not grow in proportion to the number of files backed up.)
>
> We need the hash table to remain populated for the whole run time of
> the tool, because we're essentially doing a full join of the actual
> directory contents against the manifest contents. That's a bit
> unfortunate but it doesn't seem simple to improve. I think the only
> people who are really going to suffer are people who have an enormous
> pile of empty or nearly-empty relations. People who have large
> databases for the normal reason - i.e. a reasonable number of tables
> that hold a lot of data - will have manifests of very manageable size.
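For illustration, a bounded pg_waldump run over exactly the WAL range the backup requires might look like the sketch below. The LSNs and directory are placeholders (in practice they would come from the backup's start/stop locations); pg_waldump exits non-zero if any record in the range fails its CRC.

```shell
#!/bin/sh
# Hypothetical start/end LSNs for the minimal replay range.
START_LSN="0/2000028"
END_LSN="0/3000000"
WALDIR="pg_wal"

# Decode records from START_LSN up to END_LSN only; a partial record
# after END_LSN is never read, so it cannot cause a spurious error.
validate_wal_range() {
    pg_waldump --start "$1" --end "$2" -p "$3" >/dev/null
}

if command -v pg_waldump >/dev/null 2>&1; then
    if validate_wal_range "$START_LSN" "$END_LSN" "$WALDIR"; then
        echo "WAL range OK"
    else
        echo "WAL validation failed" >&2
        exit 1
    fi
else
    echo "pg_waldump not installed; skipping"
fi
```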
Given that that's a pre-existing issue - at a significantly larger scale, imo - e.g. for pg_dump (even in the --schema-only case), and that there are tons of backend-side issues with lots of relations too, I think that's fine.

You could of course implement something merge-join-like, with the sorted input provided by a disk-based sort. But that's a lot of work (good luck making tuplesort work in the frontend...), so I'd not go there unless there's a lot of evidence this is a serious practical issue.

If we find this uses too much memory, I think we'd be better off condensing pathnames into either fewer allocations, or a RelFileNode as part of the struct (with a fallback to a string for other types of files). But I'd also not go there for now.

Greetings,

Andres Freund