pg_dump and premature optimizations for objects not to be dumped - Mailing list pgsql-hackers
From | Tom Lane
---|---
Subject | pg_dump and premature optimizations for objects not to be dumped
Date |
Msg-id | 19767.1452279786@sss.pgh.pa.us
Responses | Re: pg_dump and premature optimizations for objects not to be dumped
List | pgsql-hackers
I looked into Karsten Hilbert's report here
http://www.postgresql.org/message-id/20160108114529.GB22446@hermes.hilbert.loc
of being unable to run pg_upgrade on a database containing the pg_trgm extension.
After some investigation, the problem is explained like this:

1. Karsten had installed pg_trgm into pg_catalog rather than some user schema.

2. When getTypes() processes the gtrgm type created by pg_trgm, it decides the
type won't be dumped (cf. selectDumpableType, which quite properly rejects types
in pg_catalog). This causes it not to generate a ShellTypeInfo subsidiary object.

3. Later, getExtensionMembership decides that we *should* dump gtrgm and its
associated I/O routines, since we're in binary-upgrade mode.

4. Later still, repairTypeFuncLoop breaks the dependency loops between gtrgm and
its I/O routines, but it doesn't mark the shell type as dumpable because there
isn't one.

5. So we end up with no shell type being dumped, which does not work in
binary-upgrade mode, because we can't create the shell type when we see the
first I/O function; we don't know what OID to give it.

The core of the problem, therefore, is the choice in getTypes() to not create a
ShellTypeInfo if it thinks the type is not to be dumped. That was probably fine
when it was written, but now that we can change our minds about object
dumpability later, it's clearly broken. It's really just unnecessary
optimization anyway, since if we create the object and don't use it, we've not
wasted much but cycles. I've verified that a patch like this:

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 56c0528..6acbf4a 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** getTypes(Archive *fout, int *numTypes)
*** 3664,3671 ****
           * should copy the base type's catId, but then it might capture the
           * pg_depend entries for the type, which we don't want.
           */
!         if (tyinfo[i].dobj.dump && (tyinfo[i].typtype == TYPTYPE_BASE ||
!             tyinfo[i].typtype == TYPTYPE_RANGE))
          {
              stinfo = (ShellTypeInfo *) pg_malloc(sizeof(ShellTypeInfo));
              stinfo->dobj.objType = DO_SHELL_TYPE;
--- 3664,3671 ----
           * should copy the base type's catId, but then it might capture the
           * pg_depend entries for the type, which we don't want.
           */
!         if (tyinfo[i].typtype == TYPTYPE_BASE ||
!             tyinfo[i].typtype == TYPTYPE_RANGE)
          {
              stinfo = (ShellTypeInfo *) pg_malloc(sizeof(ShellTypeInfo));
              stinfo->dobj.objType = DO_SHELL_TYPE;

fixes Karsten's symptom without obvious other problems.

However ... there are a whole bunch of *other* places where pg_dump skips work
on objects that are not to be dumped, and some of them would add quite a bit
more cost than this one. An example is just above this code, where we skip
getDomainConstraints() for domains we think aren't to be dumped. A fix for that
wouldn't be a one-liner either, because if we do pull in the domain's
constraints unconditionally, we'd have to do something to cause the domain's own
dumpability flag (and updates thereof) to get propagated to them. Even if we fix
all these issues today, it's not hard to imagine new bugs of similar ilk
sneaking in. Exacerbating this is that some of those checks are fine, because
they occur after getExtensionMembership(), at which point the dump-or-not
decisions won't change. But in most places in pg_dump, it's not instantly
obvious whether you're looking at a flag that is still subject to change or not.
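[To make step 5 concrete, here is a rough sketch of the statement sequence a
dump has to contain for a base type whose I/O functions reference it. This is
hypothetical, simplified DDL, not the real pg_trgm script:]

    -- Sketch only (not the actual pg_trgm definitions).
    CREATE TYPE gtrgm;                            -- explicit shell type comes first

    CREATE FUNCTION gtrgm_in(cstring) RETURNS gtrgm
        AS '$libdir/pg_trgm' LANGUAGE C STRICT;   -- references the shell

    CREATE FUNCTION gtrgm_out(gtrgm) RETURNS cstring
        AS '$libdir/pg_trgm' LANGUAGE C STRICT;

    CREATE TYPE gtrgm (                           -- now fill in the full definition
        INTERNALLENGTH = variable,
        INPUT  = gtrgm_in,
        OUTPUT = gtrgm_out
    );

[Without the explicit shell CREATE TYPE, the first CREATE FUNCTION would create
the shell implicitly, which is fine in a normal restore; in binary-upgrade mode,
though, that implicitly created shell can't be given the preserved OID, which is
the failure described in step 5.]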
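[As background for steps 3 and 4, and for the reordering idea below: extension
membership is recorded in pg_depend as deptype = 'e' entries, so the data
getExtensionMembership() works from looks roughly like this. This is a sketch of
the catalog lookup, not the exact SQL pg_dump issues:]

    -- Sketch only: extension membership lives in pg_depend as deptype = 'e' rows.
    SELECT d.classid, d.objid, e.extname
    FROM pg_depend d
         JOIN pg_extension e ON e.oid = d.refobjid
    WHERE d.refclassid = 'pg_extension'::regclass
      AND d.deptype = 'e';

[Objects that show up here are exactly the ones whose dump/no-dump decision can
still be flipped after the initial selectDumpable checks.]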
The thing that seems possibly most robust to me is to pull in the extension
membership data *first* and incorporate it into the initial
selectDumpableObject tests, thus getting us back to the pre-9.1 state of affairs
where the initial settings of the object dump flags could be trusted. This might
be expensive, though; and it would certainly add a good chunk of work to the
race-condition window where we've taken pg_dump's transaction snapshot but not
yet acquired lock on any of the tables.

Thoughts? Anybody have another idea about what to do about this?

			regards, tom lane