Re: backup manifests - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: backup manifests
Msg-id: 20200330010740.4remdjgnftyuiz2v@alap3.anarazel.de
In response to: Re: backup manifests (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Hi,

On 2020-03-29 20:42:35 -0400, Robert Haas wrote:
> > What do you think of having the verification process also call pg_waldump to
> > validate the WAL CRCs (shown upthread)? That looked helpful and simple.
>
> I don't love calls to external binaries, but I think the thing that
> really bothers me is that pg_waldump is practically bound to terminate
> with an error, because the last WAL segment will end with a partial
> record.

I don't think that's the case here. You should know the last required
record, which should allow you to specify the precise end for pg_waldump.
If it errors out before reading to that point, we'd be in trouble.

> For the same reason, I think there's really no such thing as
> validating a single WAL file. I suppose you'd need to know the exact
> start and end locations for a minimal WAL replay and check that all
> records between those LSNs appear OK, ignoring any apparent problems
> after the minimum ending point, or at least ignoring any problems due
> to an incomplete record in the last file. We don't have a tool for
> that currently, and I don't think I can write one this week. Or at
> least, not a good one.

pg_waldump -s / -e?

> > > + parse->pathname = palloc(raw_length + 1);
> >
> > I don't see this freed anywhere; is it? (It's useful to make peak memory
> > consumption not grow in proportion to the number of files backed up.)
>
> We need the hash table to remain populated for the whole run time of
> the tool, because we're essentially doing a full join of the actual
> directory contents against the manifest contents. That's a bit
> unfortunate but it doesn't seem simple to improve. I think the only
> people who are really going to suffer are people who have an enormous
> pile of empty or nearly-empty relations. People who have large
> databases for the normal reason - i.e. a reasonable number of tables
> that hold a lot of data - will have manifests of very manageable size.
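For illustration, a bounded pg_waldump run over exactly the WAL range the backup requires might look like the sketch below. The LSNs and directory are placeholders (in practice they would come from the backup's start/stop locations); pg_waldump exits non-zero if any record in the range fails its CRC.

```shell
#!/bin/sh
# Hypothetical start/end LSNs for the minimal replay range.
START_LSN="0/2000028"
END_LSN="0/3000000"
WALDIR="pg_wal"

# Decode records from START_LSN up to END_LSN only; a partial record
# after END_LSN is never read, so it cannot cause a spurious error.
validate_wal_range() {
    pg_waldump --start "$1" --end "$2" -p "$3" >/dev/null
}

if command -v pg_waldump >/dev/null 2>&1; then
    if validate_wal_range "$START_LSN" "$END_LSN" "$WALDIR"; then
        echo "WAL range OK"
    else
        echo "WAL validation failed" >&2
        exit 1
    fi
else
    echo "pg_waldump not installed; skipping"
fi
```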
Given that that's a pre-existing issue - at a significantly larger scale, imo - e.g. for pg_dump (even in the --schema-only case), and that there are tons of backend-side issues with lots of relations too, I think that's fine.

You could of course implement something merge-join-like, with the sorted input provided by a disk-based sort. But that's a lot of work (good luck making tuplesort work in the frontend...), so I'd not go there unless there's a lot of evidence this is a serious practical issue.

If we find this uses too much memory, I think we'd be better off condensing pathnames into either fewer allocations, or a RelFileNode as part of the struct (with a fallback to a string for other types of files). But I'd also not go there for now.

Greetings,

Andres Freund