Thread: Re: [ADMIN] Schema comparisons

Re: [ADMIN] Schema comparisons

From

Tom Lane

Date:

28 February 2004, 11:39:43

Mark Lubratt <mark.lubratt@indeq.com> writes:
> On Feb 27, 2004, at 10:28 PM, Tom Lane wrote:
>> Mark Lubratt <mark.lubratt@indeq.com> writes:
>>> I've been trying to be careful, but I've gotten out of synch with
>>> whether or not I've applied the changes I've made to the development
>>> system to the production system.  Is there a utility that will compare
>>> the tables, functions, trigger, views, etc. between two systems and
>>> flag the schema elements that aren't in synch between the two?
>> 
>> Have you tried diffing pg_dump output?  It's not the greatest tool but
>> it's helpful.

> Yes, I did.  It was quite cumbersome.  Especially since the OIDs and 
> TOC entry numbers didn't matchup; and, since those didn't always match, 
> the order of objects wasn't quite the same either.  So, diff was 
> throwing a lot of false positives at me.

Yeah.  CVS-tip pg_dump doesn't show OIDs by default, to make it easier
to use for purposes like this.  The ordering issue is the bigger problem
though.  I presume that the object creation history is different in the
two databases and so pg_dump's habit of sorting by OID isn't helpful.

It occurs to me that this could be solved now that we have
dependency-driven ordering in pg_dump.  The ordering algorithm is
presently* Order by object type, and by OID within types;* Move objects as needed to honor dependencies.
Ordering by OID should no longer be needed for correctness, because
the second phase will take care of any dependency problems.  We
could instead make the initial sort be by object name (within types).
This should ensure that the schema output is identical for logically
equivalent databases, even if their history is different.

(When dumping from a pre-7.3 database, we'd have to stick to the OID
algorithm for lack of dependency info, but of course that case is
getting less interesting as time wears on.)

Comments?  Anyone see a reason not to do this?
        regards, tom lane

Re: [ADMIN] Schema comparisons

From

Michael Brusser

Date:

28 February 2004, 13:57:59

Interestingly I tried to address the same problem few days ago.
I used pg_dump, grep, etc - in the end I got what I needed, but
it was a cumbersome ordeal.

I think ideally it would be great to have a utility that would
give me a clean diff. between the schemas.

Perhaps pg_dump could have a new arg to produce the output
most suitable for this utility.

Mike.


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
> Sent: Saturday, February 28, 2004 10:40 AM
> To: Mark Lubratt
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [ADMIN] Schema comparisons
>
>
> Mark Lubratt <mark.lubratt@indeq.com> writes:
> > On Feb 27, 2004, at 10:28 PM, Tom Lane wrote:
> >> Mark Lubratt <mark.lubratt@indeq.com> writes:
> >>> I've been trying to be careful, but I've gotten out of synch with
> >>> whether or not I've applied the changes I've made to the development
> >>> system to the production system.  Is there a utility that will compare
> >>> the tables, functions, trigger, views, etc. between two systems and
> >>> flag the schema elements that aren't in synch between the two?
> >>
> >> Have you tried diffing pg_dump output?  It's not the greatest tool but
> >> it's helpful.
>
> > Yes, I did.  It was quite cumbersome.  Especially since the OIDs and
> > TOC entry numbers didn't matchup; and, since those didn't always match,
> > the order of objects wasn't quite the same either.  So, diff was
> > throwing a lot of false positives at me.
>
> Yeah.  CVS-tip pg_dump doesn't show OIDs by default, to make it easier
> to use for purposes like this.  The ordering issue is the bigger problem
> though.  I presume that the object creation history is different in the
> two databases and so pg_dump's habit of sorting by OID isn't helpful.
>
> It occurs to me that this could be solved now that we have
> dependency-driven ordering in pg_dump.  The ordering algorithm is
> presently
>     * Order by object type, and by OID within types;
>     * Move objects as needed to honor dependencies.
> Ordering by OID should no longer be needed for correctness, because
> the second phase will take care of any dependency problems.  We
> could instead make the initial sort be by object name (within types).
> This should ensure that the schema output is identical for logically
> equivalent databases, even if their history is different.
>
> (When dumping from a pre-7.3 database, we'd have to stick to the OID
> algorithm for lack of dependency info, but of course that case is
> getting less interesting as time wears on.)
>
> Comments?  Anyone see a reason not to do this?
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

Re: [ADMIN] Schema comparisons

From

Richard Huxton

Date:

28 February 2004, 14:00:09

On Saturday 28 February 2004 15:39, Tom Lane wrote:
> Mark Lubratt <mark.lubratt@indeq.com> writes:
> > On Feb 27, 2004, at 10:28 PM, Tom Lane wrote:
> >> Mark Lubratt <mark.lubratt@indeq.com> writes:
> >>> I've been trying to be careful, but I've gotten out of synch with
> >>> whether or not I've applied the changes I've made to the development
> >>> system to the production system.  Is there a utility that will compare
> >>> the tables, functions, trigger, views, etc. between two systems and
> >>> flag the schema elements that aren't in synch between the two?
> >>
> >> Have you tried diffing pg_dump output?  It's not the greatest tool but
> >> it's helpful.

> Comments?  Anyone see a reason not to do this?

It would help me out too - I have similar problems to Mark with keeping 
various copies in sync.

I've been looking at storing $REVISION$ in comments for each object, so my 
install scripts can halt if there is a problem. Not wanting to use my only 
comment slot for this I was thinking about an extension to the COMMENT ON 
statement:
COMMENT ON TABLE foo IS 'This is where I stroe my foos.';
COMMENT ON TABLE foo SECTION 'default' IS 'I meant store my foos.';
COMMENT ON TABLE foo SECTION 'revision' IS '1.19';
COMMENT ON TABLE foo SECTION 'bar' IS 'baz';

From first inspections, it seems to be a matter of adding a column to a 
base-table and changing some queries/use a view+base-table. I thought it 
might be of use to the pgadmin crew etc, but haven't got to the point of 
writing up my notes and seeing if there is interest.

Is there any point in thinking this through further, or is it me not thinking 
clearly?

--  Richard Huxton Archonet Ltd

Re: [ADMIN] Schema comparisons

From

Tom Lane

Date:

28 February 2004, 22:01:42

Richard Huxton <dev@archonet.com> writes:
> I've been looking at storing $REVISION$ in comments for each object, so my 
> install scripts can halt if there is a problem. Not wanting to use my only 
> comment slot for this I was thinking about an extension to the COMMENT ON 
> statement:
> COMMENT ON TABLE foo IS 'This is where I stroe my foos.';
> COMMENT ON TABLE foo SECTION 'default' IS 'I meant store my foos.';
> COMMENT ON TABLE foo SECTION 'revision' IS '1.19';
> COMMENT ON TABLE foo SECTION 'bar' IS 'baz';

This seems a little, um, specialized.  Why don't you just keep the info
in a user-defined table?
        regards, tom lane

Re: [ADMIN] Schema comparisons

From

"Alex J. Avriette"

Date:

28 February 2004, 22:19:06

On Sat, Feb 28, 2004 at 10:39:40AM -0500, Tom Lane wrote:

> >> Have you tried diffing pg_dump output?  It's not the greatest tool but
> >> it's helpful.
> 
> > Yes, I did.  It was quite cumbersome.  Especially since the OIDs and 
> > TOC entry numbers didn't matchup; and, since those didn't always match, 
> > the order of objects wasn't quite the same either.  So, diff was 
> > throwing a lot of false positives at me.
> 
> Yeah.  CVS-tip pg_dump doesn't show OIDs by default, to make it easier
> to use for purposes like this.  The ordering issue is the bigger problem
> though.  I presume that the object creation history is different in the
> two databases and so pg_dump's habit of sorting by OID isn't helpful.

I recently had to figure out what was different between the "live" schema
and the schema in cvs at work. This was a really painful process, and it
occurred to me that it wouldn't be terribly hard to write a perl program
to do it (I wound up using vim and diff). Is there interest in such a tool?
I could probably have one written within a day or two.

Alex

--
alex@posixnap.net
Alex J. Avriette, Solaris Systems Masseur
http://envy.posixnap.net/~alex/articles/nro-wahhabi.html

Re: [ADMIN] Schema comparisons

From

Tim Larson

Date:

28 February 2004, 23:19:29

On Sat, Feb 28, 2004 at 09:23:48PM -0500, Alex J. Avriette wrote:
> On Sat, Feb 28, 2004 at 10:39:40AM -0500, Tom Lane wrote:
> 
> > >> Have you tried diffing pg_dump output?  It's not the greatest tool but
> > >> it's helpful.
> > 
> > > Yes, I did.  It was quite cumbersome.  Especially since the OIDs and 
> > > TOC entry numbers didn't matchup; and, since those didn't always match, 
> > > the order of objects wasn't quite the same either.  So, diff was 
> > > throwing a lot of false positives at me.
> > 
> > Yeah.  CVS-tip pg_dump doesn't show OIDs by default, to make it easier
> > to use for purposes like this.  The ordering issue is the bigger problem
> > though.  I presume that the object creation history is different in the
> > two databases and so pg_dump's habit of sorting by OID isn't helpful.
> 
> I recently had to figure out what was different between the "live" schema
> and the schema in cvs at work. This was a really painful process, and it
> occurred to me that it wouldn't be terribly hard to write a perl program
> to do it (I wound up using vim and diff). Is there interest in such a tool?
> I could probably have one written within a day or two.

I sometimes supplement vim/diff with xxdiff, meld, and winmerge.

Hope this helps someone,
--Tim Larson

Re: [ADMIN] Schema comparisons

From

Richard Huxton

Date:

29 February 2004, 10:08:13

On Sunday 29 February 2004 02:01, Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
> > I've been looking at storing $REVISION$ in comments for each object, so
> > my install scripts can halt if there is a problem. Not wanting to use my
> > only comment slot for this I was thinking about an extension to the
> > COMMENT ON statement:
> > COMMENT ON TABLE foo IS 'This is where I stroe my foos.';
> > COMMENT ON TABLE foo SECTION 'default' IS 'I meant store my foos.';
> > COMMENT ON TABLE foo SECTION 'revision' IS '1.19';
> > COMMENT ON TABLE foo SECTION 'bar' IS 'baz';
>
> This seems a little, um, specialized.  Why don't you just keep the info
> in a user-defined table?

For the same reasons you don't store existing comments in a user-defined 
table:
1. It's convenient to have a standard (across providers) place for them.
2. It's meta-data, not data.
3. It gets dumped along with my table.

If it's just a case of "looks like a waste of time" then I might well waste my 
time and do it. On the other hand, if it's a case of "unnecessary 
complication - don't want it in the code" then I'll not bother.
--  Richard Huxton Archonet Ltd

Re: [ADMIN] Schema comparisons

From

Robert Treat

Date:

29 February 2004, 12:59:46

On Saturday 28 February 2004 21:23, Alex J. Avriette wrote:
> On Sat, Feb 28, 2004 at 10:39:40AM -0500, Tom Lane wrote:
> > >> Have you tried diffing pg_dump output?  It's not the greatest tool but
> > >> it's helpful.
> > >
> > > Yes, I did.  It was quite cumbersome.  Especially since the OIDs and
> > > TOC entry numbers didn't matchup; and, since those didn't always match,
> > > the order of objects wasn't quite the same either.  So, diff was
> > > throwing a lot of false positives at me.
> >
> > Yeah.  CVS-tip pg_dump doesn't show OIDs by default, to make it easier
> > to use for purposes like this.  The ordering issue is the bigger problem
> > though.  I presume that the object creation history is different in the
> > two databases and so pg_dump's habit of sorting by OID isn't helpful.
>
> I recently had to figure out what was different between the "live" schema
> and the schema in cvs at work. This was a really painful process, and it
> occurred to me that it wouldn't be terribly hard to write a perl program
> to do it (I wound up using vim and diff). Is there interest in such a tool?
> I could probably have one written within a day or two.
>

I've gone the vim-diff route in the past myself, but a nice command line tool 
to do it written in perl could certianly be nice. If nothing else you could 
toss it up on gborg.  Incidentally I think there is already a tool that does 
this on sourceforge, but it uses tcl and requires a running webserver, so 
it's a little overbearing for most peoples needs imho.

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: [ADMIN] Schema comparisons

From

Christopher Kings-Lynne

Date:

02 March 2004, 23:40:48

> I recently had to figure out what was different between the "live" schema
> and the schema in cvs at work. This was a really painful process, and it
> occurred to me that it wouldn't be terribly hard to write a perl program
> to do it (I wound up using vim and diff). Is there interest in such a tool?
> I could probably have one written within a day or two.

Someone wrote a utility called 'pgdiff' that generated the SQL commands 
necessary to transform on db in to another IIRC.

Chris

Re: [ADMIN] Schema comparisons

From

Richard Huxton

Date:

03 March 2004, 04:14:55

On Wednesday 03 March 2004 03:44, Christopher Kings-Lynne wrote:
> > I recently had to figure out what was different between the "live" schema
> > and the schema in cvs at work. This was a really painful process, and it
> > occurred to me that it wouldn't be terribly hard to write a perl program
> > to do it (I wound up using vim and diff). Is there interest in such a
> > tool? I could probably have one written within a day or two.
>
> Someone wrote a utility called 'pgdiff' that generated the SQL commands
> necessary to transform on db in to another IIRC.

I think it was started, but didn't reach completion (if we're thinking about 
the same thing).

--  Richard Huxton Archonet Ltd