Re: Improving test coverage of extensions with pg_dump - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: Improving test coverage of extensions with pg_dump |
Date | |
Msg-id | CAB7nPqRz9UcQJjUM_gXOGMt1dkqHp09GB=N_brLPdjMUg-_SGw@mail.gmail.com Whole thread Raw |
In response to | Re: Improving test coverage of extensions with pg_dump (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Improving test coverage of extensions with pg_dump
|
List | pgsql-hackers |
On Sat, Sep 26, 2015 at 10:22 PM, Andres Freund <andres@anarazel.de> wrote: > On 2015-09-26 21:54:46 +0900, Michael Paquier wrote: >> On Wed, Sep 23, 2015 at 1:04 AM, Alvaro Herrera <alvherre@2ndquadrant.com> >> wrote: >> > We discussed this in some other thread, not long ago. I looked briefly >> > in the archives but couldn't find it. I think the conclusion was >> > something along the lines of "hmm, tough". > >> That's hours, even days of fun ahead for sure. > > To me that's a somewhat serious bug, and something that we probably need > to address at some point. Well, addressing it at the root of the problem is... Let's say... Really tough. I have put my head on this stuff for a couple of hours and tried to draw up a couple of ways to fix this. First, even if pg_dump relies heavily on the assumption that attributes need to be ordered by attnum, I thought that it would have been possible to use attinhcount to order the columns in the same way when taking the dump from a source db and a target (where dump of source has been restored to). This would work if there is no more than 1 level of child relations, but with grand-child relations the order gets messed up again. Locating a fix on the backend side would make things for pg_dump easier, an idea would be to simply reorder the attribute numbers when a column is added to a parent table. For example with something like that: CREATE TABLE test_parent (c1 integer, c2 integer); CREATE TABLE test_child_1 (c3 integer) inherits (test_parent); CREATE TABLE test_child_2 (c4 integer) inherits (test_child_1); ALTER TABLE test_parent ADD COLUMN c5 integer; We get the following attribute ordering: =# SELECT c.relname, a.attname, a.attnum FROM pg_attribute a JOIN pg_class c ON (a.attrelid = c.oid) WHERE a.attrelid IN ('test_parent'::regclass, 'test_child_1'::regclass, 'test_child_2'::regclass) AND a.attnum > 0 ORDER BY c.relname, a.attnum; relname | attname | attnum --------------+---------+-------- test_child_1 | c1 | 1 test_child_1 | c2 | 2 test_child_1 | c3 | 3 test_child_1 | c5 | 4 test_child_2 | c1 | 1 test_child_2 | c2 | 2 test_child_2 | c3 | 3 test_child_2 | c4 | 4 test_child_2 | c5 | 5 test_parent | c1 | 1 test_parent | c2 | 2 test_parent | c5 | 3 (12 rows) Now, what we could do on a child relation when a new attribute in its parent relation is to increment the attnum of the existing columns with attinhcount = 0 by 1, and insert the new column at the end of the existing ones where attinhcount > 0, to give something like that: relname | attname | attnum --------------+---------+-------- test_child_1 | c1 | 1 test_child_1 | c2 | 2 test_child_1 | c5 | 3 test_child_1 | c3 | 4 test_child_2 | c1 | 1 test_child_2 | c2 | 2 test_child_2 | c3 | 3 test_child_2 | c5 | 4 test_child_2 | c4 | 5 test_parent | c1 | 1 test_parent | c2 | 2 test_parent | c5 | 3 (12 rows) Looking at tablecmds.c, this could be intuitively done as a part of ATExecAddColumn by scanning the attributes of the child relation from the end. But it is of course not that simple, a lot of code paths rely on the attnum of the current attributes. One is CookedConstraint, but that's a can of worms for back branches. Another bandaid solution, that would be less invasive for a backpatch is to reproduce the sequence of DDL commands within the dump itself: we would need to dump attinhcount in getTableAttrs and use it to guess what are the columns on the child relations that have been added on a parent relation after the children have been created. This would not solve the case of data-only dumps containing INSERT queries that have no column names on databases restored from past schema dumps though. Perhaps you did not look at the last patch I sent on this thread, but I changed it so as a schedule is used with a call to pg_regress. That's a more scalable approach as you were concerned about as we can plug in more easily new modules and new regression tests without modifying the perl script used to kick the tests, though it does not run the main regression test suite and it actually cannot run it, except with a modified schedule or with a filter of the child-parent tables. Patch is attached for consideration, which looks like a good base for future support, feel free to disagree :) Thanks, -- Michael
Attachment
pgsql-hackers by date: