Re: pg_dump and dependencies and --section ... it's a mess - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: pg_dump and dependencies and --section ... it's a mess |
Date | |
Msg-id | 4FE39DB3.1010005@dunslane.net Whole thread Raw |
In response to | pg_dump and dependencies and --section ... it's a mess (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: pg_dump and dependencies and --section ... it's a mess
|
List | pgsql-hackers |
On 06/21/2012 02:13 PM, Tom Lane wrote: > Don't know if everybody on this list has been paying attention to the > pgsql-bugs thread about bug #6699. The shortest example of the problem > is > > create table t1 (f1 int primary key, f2 text); > create view v1 as select f1, f2 from t1 group by f1; > > The view's query is legal only because of the primary key on f1, > else the reference to f2 would be considered inadequately grouped. > So when we create the view we mark it as dependent on that primary key > (or more accurately, the view's _RETURN rule is so marked). This all > works fine so far as the backend is concerned: you can't drop the PK > without dropping or redefining the view. But it gives pg_dump > indigestion. If you do "pg_dump -Fc | pg_restore -l -v" you see > (relevant entries only): > > 5; 2615 2200 SCHEMA - public postgres PRE_DATA > 168; 1259 41968 TABLE public t1 postgres PRE_DATA > ; depends on: 5 > 1921; 2606 41975 CONSTRAINT public t1_pkey postgres POST_DATA > ; depends on: 168 168 > 169; 1259 41976 VIEW public v1 postgres PRE_DATA > ; depends on: 1919 5 > 1922; 0 41968 TABLE DATA public t1 postgres DATA > ; depends on: 168 > > Now, there are two not-nice things about this. First, the view is shown > as depending on "object 1919", which is pg_dump's internal DumpId for > the view's _RETURN rule --- but that's nowhere to be seen in the dump, > so the fact that it's dependent on the t1_pkey constraint is not > visible in the dump. This puts parallel pg_restore at risk of doing > the wrong thing. Second, because view definitions are preferentially > dumped in the PRE_DATA section, the t1_pkey constraint has been hoisted > up to come before the table data. That means that in an ordinary serial > restore, the index will be created before the table's data is loaded, > which is undesirable for efficiency reasons. > > It gets worse though. I've labeled the above archive items with their > "SECTION" codes (which for some reason pg_restore -l -v doesn't print) > so that you can see that we've got a POST_DATA item that must come > before a PRE_DATA item. This wreaks havoc with the entire concept > of splitting dump files into three sections. When I first realized > that, I was thinking that we would have to revert the --section flag > we added to pg_dump/pg_restore in 9.2, for lack of any way to guarantee > that the items can actually be restored if they are split according > to sections. > > I think that we can fix it though. There are provisions in pg_dump > already for splitting a view apart from its _RETURN rule --- and if the > rule comes out as a separate object, it's POST_DATA. So at least in > principle, we could fix things by dumping this scenario this way: > > SCHEMA public PRE_DATA > TABLE t1 PRE_DATA > TABLE v1 PRE_DATA (at this point it's a table not a view) > TABLE DATA t1 DATA > CONSTRAINT t1_pkey POST_DATA > RULE for v1 POST_DATA (adding the rule turns v1 into a view) > > The problem is how to persuade pg_dump to do that; as the code stands, > it will only break a view apart from its rule if that's necessary to > break a circular dependency loop, and there is none in this example. > > Another point worth thinking about is that if --section is to be trusted > at all, we need some way of guaranteeing that the dependency-sorting > code won't happily create other similar cases. There is nothing in > there right now that would think anything is wrong with an ordering > that breaks the section division. > > I believe the right fix for both of these issues is to add knowledge of > the section concept to the topological sort logic, so that an ordering > that puts POST_DATA before DATA or PRE_DATA after DATA is considered to > be a dependency-ordering violation. One way to do that is to add dummy > "fencepost" objects to the sort, representing the start and end of the > DATA section. However, these objects would need explicit dependency > links to every other DumpableObject, so that doesn't sound very good > from a performance standpoint. What I'm going to go look at is whether > we can mark DumpableObjects with their SECTION codes at creation time > (rather than adding that information at ArchiveEntry() time) and then > have the topo sort logic take that marking into account in addition to > the explicit dependency links. > > Assuming we can make that work, how far should it be back-patched? > The --section issue is new in 9.2, but this would also take care of > the restore problems for views with constraint dependencies, which > are allowed as of 9.1, so I'm thinking we'd better put it into 9.1. > > The other problem is the bogus dependency IDs that pg_dump emits by > virtue of not caring whether the dependency links to an object it's > actually dumped. I posted a patch for that in the pgsql-bugs thread > but pointed out that it would not work before 9.2 without additional > surgery. If we fix the view vs constraint issue by adding section > knowledge to the sort, I think we can maybe get away with not fixing > the dependency IDs in back branches. parallel pg_restore is definitely > at risk without the fix, but in the absence of any other known problem > cases I'm inclined to not change the back branches more than we have to. > > Thoughts? > > Wow, talk about subtle. When I first read this my face turned a bit red, but after reading it a couple more times I don't feel quite so guilty, especially since the constraint dependency didn't exist at the time we did parallel pg_restore. If I'm understanding you correctly, fixing the bogus dependency thing is more an insurance policy than fixing a case (other than the constraint dependency) that is known to be broken. If that is correct, and given that we've had 3 years of parallel pg_restore and not heard of such issues, it doesn't seem unreasonable not to backpatch. (There's another bug to do with parallel pg_restore and clustering that Andrew Hammond raised back in January, that I want to fix when I get some time.) cheers andrew
pgsql-hackers by date: