Thread: Enhancement request for pg_dump
It can be done of course, but as you can see in my examples the statements in pg_dump generated scripts are grouped together by the objects.
It is easier to analyze the differences when all these differences for an object are clustered together, and aren't dispersed in the diff file.
It also will break the multi-line statements.
It is also possible to write a more complex Perl script, or sed/awk, but it would be so much easier to do it directly in pg_dump.
Sergei
It is easier to analyze the differences when all these differences for an object are clustered together, and aren't dispersed in the diff file.
It also will break the multi-line statements.
It is also possible to write a more complex Perl script, or sed/awk, but it would be so much easier to do it directly in pg_dump.
Sergei
> Currently as in PG 9.4, 9.5 the order of the statements in the script > produced by pg_dump is uncertain even for the same versions of the databases > and pg_dump. > One database may script grants like > > REVOKE ALL ON TABLE contracttype FROM PUBLIC; > REVOKE ALL ON TABLE contracttype FROM madmin; > GRANT ALL ON TABLE contracttype TO madmin; > GRANT SELECT ON TABLE contracttype TO mro; > GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE contracttype TO musers; > > and the other may change the order of grants like > > REVOKE ALL ON TABLE contracttype FROM PUBLIC; > REVOKE ALL ON TABLE contracttype FROM madmin; > GRANT ALL ON TABLE contracttype TO madmin; > GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE contracttype TO musers; > GRANT SELECT ON TABLE contracttype TO mro; > > It complicates the usage of pg_dump to compare the structures of the two > similar databases like DEV and PROD, two development branches etc. > If the order of the statements generated by pg_dump would be guaranteed then > it will be very easy to compare the structures and > security rights of the two databases using only pg_dump and a diff/merge > tool. Currently we encounter a lot of false differences. > A sorted order of the DDL and DCL statements in a dump can be implemented as > a flag to pg_dump or even better as a default behavior. Since the actual order of statements inside the text mode dump file does not matter (no restore is being attempted) -- rather only that the order is predictable -- would it not suffice to run the two dumps through a generic text sort program ? pg_dump -D DEV ... | sort > broken-but-sorted-dump-1.txtpg_dump -D PROD ... | sort > broken-but-sorted-dump-2.txtdiff ... broken-but-sorted-dump-1.txt broken-but-sorted-dump-2.txt Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
I don't see how these questions are related to the proposed pg_dump improvement. I suggest to improve pg_dump so it can be used instead of the third party tools like DBSteward and SQLWorkbench/J etc. to compare two different databases or existing dumps, and to identify the differences. The use cases will be exactly the same as for the third party tools. The positive difference will be that pg_dump is a very reliable, always available and supports all the latest PostgreSQL features. Do you imply that there shouldn't be any reasons to compare different databases to find the differences between them? Sergei > > On Apr 17, 2016, at 12:41 PM, Sergei Agalakov <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote: > > > > I know about DBSteward. I don't like to bring PHP infrastructure only to be able to compare two dumps, > > and to deal with potential bugs in the third party tools. The pg_dump in other hand is always here, and is always trusted. > > SQLWorkbench/J also can compare two schemas, and requires only Java. Again, I trust pg_dump more. > >http://www.sql-workbench.net/ > > > > May be pg_dump was never INTENDED to generate the dump files with the determined order of the statements, > > but it CAN do it with the minor changes, and be more useful to administrators. Why rely on the third party tools > > for the tasks that can be done with the native, trusted tools? > > > > Sergei > Does it matter if they differ if you cannot recreate the correct one exactly from source-controllled DDL? Or know howthey are supposed to differ if this is a migration point?
On Sun, 17 Apr 2016 14:10:50 -0600 Sergei Agalakov <Sergei.Agalakov@getmyle.com> wrote: > I don't see how these questions are related to the proposed pg_dump > improvement. > I suggest to improve pg_dump so it can be used instead of the third > party tools like DBSteward and SQLWorkbench/J etc. > to compare two different databases or existing dumps, and to identify > the differences. The use cases will be exactly > the same as for the third party tools. The positive difference will be > that pg_dump is a very reliable, always available and supports all the > latest PostgreSQL features. > Do you imply that there shouldn't be any reasons to compare different > databases to find the differences between them? Nobody has weighed in on this, but I have a theory ... I (personally) worry that adding features like you suggest to pg_dump would interfere with its ability to perform complete dump of a large database in a _rapid_ manner. Using pg_dump as a backup tool has an inherent desire for the tool to be as fast and low-impact on the operation of the database as possible. Features that would force pg_dump to care about ordering that isn't necessary to its core functionality of providing a reliable backup are liable to slow it down. They might also overcomplicate it, making it more difficult to maintain reliably. When you consider that possibility, and the fact that pg_dump isn't _supposed_ to be a tool to help you with schema maintenance, it's easy to see why someone would look for different approach to the problem. And I feel that's what all the answers have attempted to do: suggest ways to get what you want without asking them to be implemented in a tool that isn't really the right place for them anyway. While your arguments toward making this change are valid, I'm not sure that they are compelling enough to justify adding a feature where it doesn't really belong. Another side to this, is that your request suggests that your development process is suboptimal. Of course, I can't be 100% sure since you haven't explained your process ... but my experience is that people who feel the need to automagically sync prod and dev databases have a suboptimal development process. Thus, the suggestions are also biased toward helping you improve your process instead of adjusting a tool to better support a suboptimal process. Of course, if the people actually doing the work on the code disagree with me, then they'll make the change. I'm just expressing an opinion. > Sergei > > > > On Apr 17, 2016, at 12:41 PM, Sergei Agalakov <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote: > > > > > > I know about DBSteward. I don't like to bring PHP infrastructure only to be able to compare two dumps, > > > and to deal with potential bugs in the third party tools. The pg_dump in other hand is always here, and is always trusted. > > > SQLWorkbench/J also can compare two schemas, and requires only Java. Again, I trust pg_dump more. > > >http://www.sql-workbench.net/ > > > > > > May be pg_dump was never INTENDED to generate the dump files with the determined order of the statements, > > > but it CAN do it with the minor changes, and be more useful to administrators. Why rely on the third party tools > > > for the tasks that can be done with the native, trusted tools? > > > > > > Sergei > > Does it matter if they differ if you cannot recreate the correct one exactly from source-controllled DDL? Or know howthey are supposed to differ if this is a migration point? > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Bill Moran
On 04/17/2016 01:10 PM, Sergei Agalakov wrote: > I don't see how these questions are related to the proposed pg_dump > improvement. > I suggest to improve pg_dump so it can be used instead of the third > party tools like DBSteward and SQLWorkbench/J etc. > to compare two different databases or existing dumps, and to identify > the differences. The use cases will be exactly > the same as for the third party tools. The positive difference will be > that pg_dump is a very reliable, always available and supports all the > latest PostgreSQL features. > Do you imply that there shouldn't be any reasons to compare different > databases to find the differences between them? I don't think that is what is being said, more the right tool for the right job. pg_dump --> pg_restore/psql are for capturing some or all of the information(including possibly data) in a database at a point in time so that information can be recreated at another point in time. While by necessity that includes capturing schema data (or not in the case of -a) it is not really a schema comparison tool. While that is in the realm of doable it means developer time to replicate something that other tools do. Given the ever lengthening list of requested features in general, this request might have difficulty reaching a sufficient level of priority, esp. in light of the presence of existing tools. > > Sergei > >> > On Apr 17, 2016, at 12:41 PM, Sergei Agalakov >> <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote: >> > >> > I know about DBSteward. I don't like to bring PHP infrastructure >> only to be able to compare two dumps, >> > and to deal with potential bugs in the third party tools. The >> pg_dump in other hand is always here, and is always trusted. >> > SQLWorkbench/J also can compare two schemas, and requires only Java. >> Again, I trust pg_dump more. >> >http://www.sql-workbench.net/ >> > >> > May be pg_dump was never INTENDED to generate the dump files with >> the determined order of the statements, >> > but it CAN do it with the minor changes, and be more useful to >> administrators. Why rely on the third party tools >> > for the tasks that can be done with the native, trusted tools? >> > >> > Sergei >> Does it matter if they differ if you cannot recreate the correct one >> exactly from source-controllled DDL? Or know how they are supposed to >> differ if this is a migration point? > > -- Adrian Klaver adrian.klaver@aklaver.com
On 04/17/2016 01:10 PM, Sergei Agalakov wrote: > I don't see how these questions are related to the proposed pg_dump > improvement. > I suggest to improve pg_dump so it can be used instead of the third > party tools like DBSteward and SQLWorkbench/J etc. > to compare two different databases or existing dumps, and to identify > the differences. The use cases will be exactly > the same as for the third party tools. The positive difference will be > that pg_dump is a very reliable, always available and supports all the > latest PostgreSQL features. > Do you imply that there shouldn't be any reasons to compare different > databases to find the differences between them? To follow up my previous post and to illustrate some of the difficulties, from your original post: "One database may script grants like REVOKE ALL ON TABLE contracttype FROM PUBLIC; REVOKE ALL ON TABLE contracttype FROM madmin; GRANT ALL ON TABLE contracttype TO madmin; GRANT SELECT ON TABLE contracttype TO mro; GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE contracttype TO musers; and the other may change the order of grants like REVOKE ALL ON TABLE contracttype FROM PUBLIC; REVOKE ALL ON TABLE contracttype FROM madmin; GRANT ALL ON TABLE contracttype TO madmin; GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE contracttype TO musers; GRANT SELECT ON TABLE contracttype TO mro; " From the perspective of the database both of the above lead to the same end result, so order is not important. Of course a diff is going to see it differently. The solution is then to impose an order, but how would that be determined? For instance what about: GRANT SELECT ON TABLE contracttype TO mro; GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE contracttype TO musers; vs GRANT SELECT ON TABLE contracttype TO mro; GRANT INSERT, SELECT, DELETE,UPDATE ON TABLE contracttype TO musers; Again diff will see them as not the same, but functionally they are the same. So who decides order and how far do you reach down into the statements? As Bill wrote the issue is after the fact version control versus before the fact version control. Trying to match things up after various people have been turned loose at will on different instances of databases is much more difficult then having them go through a structured version control system first. > > Sergei > >> > On Apr 17, 2016, at 12:41 PM, Sergei Agalakov >> <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote: >> > >> > I know about DBSteward. I don't like to bring PHP infrastructure >> only to be able to compare two dumps, >> > and to deal with potential bugs in the third party tools. The >> pg_dump in other hand is always here, and is always trusted. >> > SQLWorkbench/J also can compare two schemas, and requires only Java. >> Again, I trust pg_dump more. >> >http://www.sql-workbench.net/ >> > >> > May be pg_dump was never INTENDED to generate the dump files with >> the determined order of the statements, >> > but it CAN do it with the minor changes, and be more useful to >> administrators. Why rely on the third party tools >> > for the tasks that can be done with the native, trusted tools? >> > >> > Sergei >> Does it matter if they differ if you cannot recreate the correct one >> exactly from source-controllled DDL? Or know how they are supposed to >> differ if this is a migration point? > > -- Adrian Klaver adrian.klaver@aklaver.com
On 04/17/2016 01:58 PM, Adrian Klaver wrote: > On 04/17/2016 01:10 PM, Sergei Agalakov wrote: >> I don't see how these questions are related to the proposed pg_dump >> improvement. >> I suggest to improve pg_dump so it can be used instead of the third >> party tools like DBSteward and SQLWorkbench/J etc. >> to compare two different databases or existing dumps, and to identify >> the differences. The use cases will be exactly >> the same as for the third party tools. The positive difference will be >> that pg_dump is a very reliable, always available and supports all the >> latest PostgreSQL features. >> Do you imply that there shouldn't be any reasons to compare different >> databases to find the differences between them? > > To follow up my previous post and to illustrate some of the > difficulties, from your original post: > > "One database may script grants like > > REVOKE ALL ON TABLE contracttype FROM PUBLIC; > REVOKE ALL ON TABLE contracttype FROM madmin; > GRANT ALL ON TABLE contracttype TO madmin; > GRANT SELECT ON TABLE contracttype TO mro; > GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE contracttype TO musers; > > and the other may change the order of grants like > > REVOKE ALL ON TABLE contracttype FROM PUBLIC; > REVOKE ALL ON TABLE contracttype FROM madmin; > GRANT ALL ON TABLE contracttype TO madmin; > GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE contracttype TO musers; > GRANT SELECT ON TABLE contracttype TO mro; > " > > From the perspective of the database both of the above lead to the same > end result, so order is not important. Of course a diff is going to see > it differently. The solution is then to impose an order, but how would > that be determined? For instance what about: > > GRANT SELECT ON TABLE contracttype TO mro; > GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE contracttype TO musers; > > vs > > GRANT SELECT ON TABLE contracttype TO mro; > GRANT INSERT, SELECT, DELETE,UPDATE ON TABLE contracttype TO musers; > > Again diff will see them as not the same, but functionally they are the > same. So who decides order and how far do you reach down into the > statements? Bad example. Some testing shows Postgres will reorder the GRANTS as: SELECT,INSERT,DELETE,UPDATE from whatever order they where entered as. > > > As Bill wrote the issue is after the fact version control versus before > the fact version control. Trying to match things up after various people > have been turned loose at will on different instances of databases is > much more difficult then having them go through a structured version > control system first. > > >> >> Sergei >> >>> > On Apr 17, 2016, at 12:41 PM, Sergei Agalakov >>> <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote: >>> > >>> > I know about DBSteward. I don't like to bring PHP infrastructure >>> only to be able to compare two dumps, >>> > and to deal with potential bugs in the third party tools. The >>> pg_dump in other hand is always here, and is always trusted. >>> > SQLWorkbench/J also can compare two schemas, and requires only Java. >>> Again, I trust pg_dump more. >>> >http://www.sql-workbench.net/ >>> > >>> > May be pg_dump was never INTENDED to generate the dump files with >>> the determined order of the statements, >>> > but it CAN do it with the minor changes, and be more useful to >>> administrators. Why rely on the third party tools >>> > for the tasks that can be done with the native, trusted tools? >>> > >>> > Sergei >>> Does it matter if they differ if you cannot recreate the correct one >>> exactly from source-controllled DDL? Or know how they are supposed to >>> differ if this is a migration point? >> >> > > -- Adrian Klaver adrian.klaver@aklaver.com
I hardly can see that a sorting of the grants by users will create a measurable impact on the pg_dump performance in a real database. One can imaging a database with tens of thousands of objects and tens of thousands of users and almost no data, but it would be quite unusual. Anyway, if a sorting behavior is initiated by a command line parameter, and isn't a default behavior of pg_dump then this argument doesn't work. After all pg_dump isn't the tool to do _just_ reliable backup. It can be used for migration, it can be used for schema cloning, to initiate a standby... There are many flags for pg_dump that are absolutely unnecessary for full database backup. So they do "... might also overcomplicate it, making it more difficult to maintain reliably" but they do exists, and serve a purpose. I don't understand why people have started to create the theories about our development process? Had I requested a tool to magically synchronize DEV and PROD? No, I asked about a tool to _find_ the unexpected differences between databases. If you never encountered a situation when in the dozens of environments the databases has diverged because somebody has done something manually - good for you, you are lucky guy then. I did. Sergei > On Sun, 17 Apr 2016 14:10:50 -0600 > Sergei Agalakov <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote: > > > I don't see how these questions are related to the proposed pg_dump > > improvement. > > I suggest to improve pg_dump so it can be used instead of the third > > party tools like DBSteward and SQLWorkbench/J etc. > > to compare two different databases or existing dumps, and to identify > > the differences. The use cases will be exactly > > the same as for the third party tools. The positive difference will be > > that pg_dump is a very reliable, always available and supports all the > > latest PostgreSQL features. > > Do you imply that there shouldn't be any reasons to compare different > > databases to find the differences between them? > > Nobody has weighed in on this, but I have a theory ... > > I (personally) worry that adding features like you suggest to pg_dump > would interfere with its ability to perform complete dump of a large > database in a _rapid_ manner. Using pg_dump as a backup tool has an > inherent desire for the tool to be as fast and low-impact on the > operation of the database as possible. > > Features that would force pg_dump to care about ordering that isn't > necessary to its core functionality of providing a reliable backup > are liable to slow it down. They might also overcomplicate it, making > it more difficult to maintain reliably. > > When you consider that possibility, and the fact that pg_dump isn't > _supposed_ to be a tool to help you with schema maintenance, it's easy > to see why someone would look for different approach to the problem. > > And I feel that's what all the answers have attempted to do: suggest > ways to get what you want without asking them to be implemented in a > tool that isn't really the right place for them anyway. While your > arguments toward making this change are valid, I'm not sure that > they are compelling enough to justify adding a feature where it > doesn't really belong. > > Another side to this, is that your request suggests that your > development process is suboptimal. Of course, I can't be 100% sure > since you haven't explained your process ... but my experience is > that people who feel the need to automagically sync prod and dev > databases have a suboptimal development process. Thus, the suggestions > are also biased toward helping you improve your process instead of > adjusting a tool to better support a suboptimal process. > > Of course, if the people actually doing the work on the code disagree > with me, then they'll make the change. I'm just expressing an opinion. > > > Sergei > > > > > > On Apr 17, 2016, at 12:41 PM, Sergei Agalakov > <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote: > > > > > > > > I know about DBSteward. I don't like to bring PHP infrastructure > only to be able to compare two dumps, > > > > and to deal with potential bugs in the third party tools. The > pg_dump in other hand is always here, and is always trusted. > > > > SQLWorkbench/J also can compare two schemas, and requires only > Java. Again, I trust pg_dump more. > > > >http://www.sql-workbench.net/ > > > > > > > > May be pg_dump was never INTENDED to generate the dump files > with the determined order of the statements, > > > > but it CAN do it with the minor changes, and be more useful to > administrators. Why rely on the third party tools > > > > for the tasks that can be done with the native, trusted tools? > > > > > > > > Sergei > > > Does it matter if they differ if you cannot recreate the correct > one exactly from source-controllled DDL? Or know how they are > supposed to differ if this is a migration point? > > > > > > -- > > Sent via pgsql-general mailing list > (pgsql-general(at)postgresql(dot)org) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-general > > > -- > Bill Moran
fyi, if you have a feature request or enhancement, then the proper place for that is here -> https://postgresql.uservoice.com/forums/21853-general
On Sun, Apr 17, 2016 at 8:26 PM, Sergei Agalakov <Sergei.Agalakov@getmyle.com> wrote:
I hardly can see that a sorting of the grants by users will create a measurable impact on the pg_dump performance in a real database.
One can imaging a database with tens of thousands of objects and tens of thousands of users and almost no data, but it would be quite unusual.
Anyway, if a sorting behavior is initiated by a command line parameter, and isn't a default behavior of pg_dump then this argument doesn't work.
After all pg_dump isn't the tool to do _just_ reliable backup. It can be used for migration, it can be used for schema cloning, to initiate a standby...
There are many flags for pg_dump that are absolutely unnecessary for full database backup. So they do
"... might also overcomplicate it, making it more difficult to maintain reliably" but they do exists, and serve a purpose.
I don't understand why people have started to create the theories about our development process? Had I requested a tool to magically synchronize
DEV and PROD? No, I asked about a tool to _find_ the unexpected differences between databases. If you never encountered a situation when in the
dozens of environments the databases has diverged because somebody has done something manually - good for you, you are lucky guy then.
I did.
SergeiOn Sun, 17 Apr 2016 14:10:50 -0600> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)Sergei Agalakov <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote:
> I don't see how these questions are related to the proposed pg_dump
> improvement.
> I suggest to improve pg_dump so it can be used instead of the third
> party tools like DBSteward and SQLWorkbench/J etc.
> to compare two different databases or existing dumps, and to identify
> the differences. The use cases will be exactly
> the same as for the third party tools. The positive difference will be
> that pg_dump is a very reliable, always available and supports all the
> latest PostgreSQL features.
> Do you imply that there shouldn't be any reasons to compare different
> databases to find the differences between them?
Nobody has weighed in on this, but I have a theory ...
I (personally) worry that adding features like you suggest to pg_dump
would interfere with its ability to perform complete dump of a large
database in a _rapid_ manner. Using pg_dump as a backup tool has an
inherent desire for the tool to be as fast and low-impact on the
operation of the database as possible.
Features that would force pg_dump to care about ordering that isn't
necessary to its core functionality of providing a reliable backup
are liable to slow it down. They might also overcomplicate it, making
it more difficult to maintain reliably.
When you consider that possibility, and the fact that pg_dump isn't
_supposed_ to be a tool to help you with schema maintenance, it's easy
to see why someone would look for different approach to the problem.
And I feel that's what all the answers have attempted to do: suggest
ways to get what you want without asking them to be implemented in a
tool that isn't really the right place for them anyway. While your
arguments toward making this change are valid, I'm not sure that
they are compelling enough to justify adding a feature where it
doesn't really belong.
Another side to this, is that your request suggests that your
development process is suboptimal. Of course, I can't be 100% sure
since you haven't explained your process ... but my experience is
that people who feel the need to automagically sync prod and dev
databases have a suboptimal development process. Thus, the suggestions
are also biased toward helping you improve your process instead of
adjusting a tool to better support a suboptimal process.
Of course, if the people actually doing the work on the code disagree
with me, then they'll make the change. I'm just expressing an opinion.
> Sergei
>
> > > On Apr 17, 2016, at 12:41 PM, Sergei Agalakov <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote:
> > >
> > > I know about DBSteward. I don't like to bring PHP infrastructure only to be able to compare two dumps,
> > > and to deal with potential bugs in the third party tools. The pg_dump in other hand is always here, and is always trusted.
> > > SQLWorkbench/J also can compare two schemas, and requires only Java. Again, I trust pg_dump more.
> > >http://www.sql-workbench.net/
> > >
> > > May be pg_dump was never INTENDED to generate the dump files with the determined order of the statements,
> > > but it CAN do it with the minor changes, and be more useful to administrators. Why rely on the third party tools
> > > for the tasks that can be done with the native, trusted tools?
> > >
> > > Sergei
> > Does it matter if they differ if you cannot recreate the correct one exactly from source-controllled DDL? Or know how they are supposed to differ if this is a migration point?
>
>
> --
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
--
Bill Moran
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Nobody asks for pg_dump to be a schema comparison tool. As you tell yourself it is a most reliable schema capturing tool. All I am asking is that if pg_dump is executed on two databases with the identical schemas and security it should be able to produce the identical SQL dumps of these schemas and security. As you have mentioned in other e-mail pg_dump actually rewrites some statements for consistency. It just doesn't do it consistently everywhere. I can't say anything about priorities of development for pg_dump. The proposed change seems to be a low hanging fruit, it isn't difficult to add ORDER BY in the appropriate places. The other question is if this is a useful enhancement. The existence of the third party tools doesn't seem to be very relevant here. Should be stopped the development of pgAdmin or psql because exist the third party tools with the similar functionality? :-) Sergei > On 04/17/2016 01:10 PM, Sergei Agalakov wrote: > > I don't see how these questions are related to the proposed pg_dump > > improvement. > > I suggest to improve pg_dump so it can be used instead of the third > > party tools like DBSteward and SQLWorkbench/J etc. > > to compare two different databases or existing dumps, and to identify > > the differences. The use cases will be exactly > > the same as for the third party tools. The positive difference will be > > that pg_dump is a very reliable, always available and supports all the > > latest PostgreSQL features. > > Do you imply that there shouldn't be any reasons to compare different > > databases to find the differences between them? > > I don't think that is what is being said, more the right tool for the > right job. pg_dump --> pg_restore/psql are for capturing some or all of > the information(including possibly data) in a database at a point in > time so that information can be recreated at another point in time. > While by necessity that includes capturing schema data (or not in the > case of -a) it is not really a schema comparison tool. While that is in > the realm of doable it means developer time to replicate something that > other tools do. Given the ever lengthening list of requested features in > general, this request might have difficulty reaching a sufficient level > of priority, esp. in light of the presence of existing tools. > > > > > Sergei > > > >> > On Apr 17, 2016, at 12:41 PM, Sergei Agalakov > >> <Sergei(dot)Agalakov(at)getmyle(dot)com> wrote: > >> > > >> > I know about DBSteward. I don't like to bring PHP infrastructure > >> only to be able to compare two dumps, > >> > and to deal with potential bugs in the third party tools. The > >> pg_dump in other hand is always here, and is always trusted. > >> > SQLWorkbench/J also can compare two schemas, and requires only Java. > >> Again, I trust pg_dump more. > >> >http://www.sql-workbench.net/ > >> > > >> > May be pg_dump was never INTENDED to generate the dump files with > >> the determined order of the statements, > >> > but it CAN do it with the minor changes, and be more useful to > >> administrators. Why rely on the third party tools > >> > for the tasks that can be done with the native, trusted tools? > >> > > >> > Sergei > >> Does it matter if they differ if you cannot recreate the correct one > >> exactly from source-controllled DDL? Or know how they are supposed to > >> differ if this is a migration point? > > > > > > > -- > Adrian Klaver > adrian(dot)klaver(at)aklaver(dot)com
Thank you, I know this place. I just wanted to check that my request will have the peoples support. So far it doesn't. It looks like that or people never need to compare two PG databases to find the differences in the schemas or security, or happy to use the third party tools to do it, and don't want any native support. If I see any support from other people for this idea then I shall go to https://postgresql.uservoice.com/forums/21853-general, but looking on, say, "Partitions in Oracle style" that are marked as have been started in 2010 (sure, INHERITANCE is so much Oracle style partitions!) I don't see it to be very useful. Sergei > fyi, if you have a feature request or enhancement, then the proper > place for that is here -> > https://postgresql.uservoice.com/forums/21853-general
On 18 April 2016 at 13:10, Sergei Agalakov <Sergei.Agalakov@getmyle.com> wrote: > Thank you, I know this place. > I just wanted to check that my request will have the peoples support. > So far it doesn't. It looks like that or people never need to compare two PG > databases to find the differences in the schemas or security, > or happy to use the third party tools to do it, and don't want any native > support. If I see any support from other people for this idea then I shall > go to https://postgresql.uservoice.com/forums/21853-general, but looking on, > say, "Partitions in Oracle style" that are marked as have been started in > 2010 > (sure, INHERITANCE is so much Oracle style partitions!) I don't see it to be > very useful. I can't particularly vouch for that site, as I've personally never seen it before, but I'd like to say that you'll probably get along better if you appeared to have a more optimistic view. If you bothered to consider the "parallel query option" item listed on that site, and compared that to the current status of 9.6, you might feel differently. EDB and others have put lots of work in to parallel query for 9.6. If your intentions here are to gather support for your cause then I highly recommend not appearing negative. Keep in mind that you've not paid some company for a license for PostgreSQL and the people reading your emails here are most likely not at your beckon call, and are not here to fulfill all your PostgreSQL wishes. To me your proposal does seem quite half thought through. Do you really suppose we just sort the GRANT output and call it done. pg_dump now has stable output? I think that would barely scratch the surface. What about COPY output, we'd have to sort that too, and that could be rather expensive. Now, you could say that we'd just limit this to schema-only related stuff, and that might be ok, but you'll need to ensure that everything is addressed and that your now matching output didn't just occur because all of the planets happened to line up on the day you ran pg_dump. You might propose that we could get around the performance hit of generating a stable output by having an optional flag to enable this. That would appear to sound ok at my first thought. If C is your thing then you could open up pg_dump.c and have a look around, if not then remaining positive and constructive, and doing your best not to upset people who's C *is* their thing is probably a good tactical move here.
On 04/17/2016 06:10 PM, Sergei Agalakov wrote: > Thank you, I know this place. > I just wanted to check that my request will have the peoples support. > So far it doesn't. It looks like that or people never need to compare > two PG databases to find the differences in the schemas or security, > or happy to use the third party tools to do it, and don't want any > native support. If I see any support from other people for this idea > then I shall > go to https://postgresql.uservoice.com/forums/21853-general, but looking > on, say, "Partitions in Oracle style" that are marked as have been > started in 2010 > (sure, INHERITANCE is so much Oracle style partitions!) I don't see it > to be very useful. Honestly I did not know that site existed. If you want some traction on this I would suggest the traditional way. If you think it is a bug: http://www.postgresql.org/support/submitbug/ If you think it should be a new feature then make your case on --hackers: http://www.postgresql.org/list/pgsql-hackers/ FYI, the place I look for requested features is the Todo list: https://wiki.postgresql.org/wiki/Todo > > Sergei > >> fyi, if you have a feature request or enhancement, then the proper >> place for that is here -> >> https://postgresql.uservoice.com/forums/21853-general > > -- Adrian Klaver adrian.klaver@aklaver.com
On 04/17/2016 05:50 PM, Sergei Agalakov wrote: > Nobody asks for pg_dump to be a schema comparison tool. As you tell > yourself > it is a most reliable schema capturing tool. All I am asking is that if > pg_dump is executed > on two databases with the identical schemas and security it should be > able to produce > the identical SQL dumps of these schemas and security. As you have > mentioned in other e-mail > pg_dump actually rewrites some statements for consistency. It just > doesn't do it consistently everywhere. And there in lies the rub. Making that happen, I suspect, is going to be a lot of work. The goal of the tool is not to produce output that is diff friendly but that produces working schema when transferred to another database. I understand what you want and why I just think it is not as easy as you want to believe. See my other post for ways to try to make this happen. > > I can't say anything about priorities of development for pg_dump. The > proposed change seems to be > a low hanging fruit, it isn't difficult to add ORDER BY in the > appropriate places. The other question is if > this is a useful enhancement. The existence of the third party tools > doesn't seem to be very relevant here. > Should be stopped the development of pgAdmin or psql because exist the > third party tools with the similar functionality? > :-) FYI, pgAdmin is a third party tool, currently being completely rewritten: http://pgsnake.blogspot.com/2016/04/pgadmin-4-elephant-nears-finish-line.html > > Sergei > -- Adrian Klaver adrian.klaver@aklaver.com
Hi, Le 18/04/2016 02:26, Sergei Agalakov a écrit : > If you never encountered a situation when in the dozens of > environments the databases has diverged because somebody has > done something manually - good for you, you are lucky guy then. I'm definitely not a lucky guy at all! :-) And this is happening to me *right now*... My case is a little bit more complicated, but I'll come back to this later on, probably in another thread, in order to avoid confusion. Last time I had to do this kind of exercise, a few years ago, I was in a remote place without Internet access, so I could not get any information or ask any help. I was kind of surprised/frustrated by the (apparent) lack of order of the pg_dump output. So I manually wrote scripts to export the tables and views' definitions separately, one by one, (using pg_dump, of course) and stack them in the order I wished into a large file. That was quite suboptimal, but it worked as expected, and I was able to diff and patch correctly. And today, I thought: "time has passed, I'm sure that pg_dump must magically have an option to get the output in some kind of order, by now"... 'man pg_dump' didn't help. And as I can read this discussion (I haven't finished yet, obviously), this is not the case. À+ Pierre -- ____________________________________________________________________________ Pierre Chevalier PChGEI: Pierre Chevalier Géologue Et Informaticien Partenaire DALIBO Mesté Duran 32100 Condom Tél+fax : 09 75 27 45 62 06 37 80 33 64 Émail : pierrechevaliergeolCHEZfree.fr icq# : 10432285 jabber: pierre.chevalier1967@jabber.fr http://pierremariechevalier.free.fr/pierre_chevalier_geologue ____________________________________________________________________________
Le 18/04/2016 03:10, Sergei Agalakov a écrit : > I just wanted to check that my request will have the peoples support. > So far it doesn't. Well, you can count on my support, for sure! > It looks like that or people never need to compare two PG databases > to find the differences in the schemas or security, or happy to use > the third party tools to do it, and don't want any native support. I definitely share your opinion. But I understand that pg_dump wasn't originally designed for this purpose, although such a feature would seem so natural, at first glance. So, another idea would be to implement another utility, something very similar to pg_dump (probably sharing most of its code with it, or calling pg_dump like my scripts did), but giving an ordered output. Implementing such a tool outside of postgres would bring a bit more chaos. So, the tool should be preferably bundled with postgres, to avoid the need to get a third-party tool. Hm. When I think about it twice, it sounds like overkilling. Opinions? > If I see any support from other people for this idea then I shall > go to https://postgresql.uservoice.com/forums/21853-general, I'd say +1, but the idea should be a bit more matured first, maybe? À+ Pierre -- ____________________________________________________________________________ Pierre Chevalier PChGEI: Pierre Chevalier Géologue Et Informaticien Partenaire DALIBO Mesté Duran 32100 Condom Tél+fax : 09 75 27 45 62 06 37 80 33 64 Émail : pierrechevaliergeolCHEZfree.fr icq# : 10432285 jabber: pierre.chevalier1967@jabber.fr http://pierremariechevalier.free.fr/pierre_chevalier_geologue ____________________________________________________________________________
On 04/22/2016 09:44 AM, Pierre Chevalier Géologue wrote: > Hi, > Le 18/04/2016 02:26, Sergei Agalakov a écrit : > >> If you never encountered a situation when in the dozens of >> environments the databases has diverged because somebody has >> done something manually - good for you, you are lucky guy then. > > I'm definitely not a lucky guy at all! :-) > And this is happening to me *right now*... My case is a little bit more > complicated, but I'll come back to this later on, probably in another > thread, in order to avoid confusion. > > Last time I had to do this kind of exercise, a few years ago, I was in a > remote place without Internet access, so I could not get any information > or ask any help. I was kind of surprised/frustrated by the (apparent) > lack of order of the pg_dump output. So I manually wrote scripts to > export the tables and views' definitions separately, one by one, (using > pg_dump, of course) and stack them in the order I wished into a large > file. That was quite suboptimal, but it worked as expected, and I was > able to diff and patch correctly. You realize there is pg_restore -l and pg_restore -L : http://www.postgresql.org/docs/9.5/interactive/app-pgrestore.html -l --list List the contents of the archive. The output of this operation can be used as input to the -L option. Note that if filteringswitches such as -n or -t are used with -l, they will restrict the items listed. -L list-file --use-list=list-file Restore only those archive elements that are listed in list-file, and restore them in the order they appear in the file.Note that if filtering switches such as -n or -t are used with -L, they will further restrict the items restored. list-file is normally created by editing the output of a previous -l operation. Lines can be moved or removed, and canalso be commented out by placing a semicolon (;) at the start of the line. See below for examples. It does not solve all problems but it does do some ordering and is amenable to be imported as space separated file for further ordering : aklaver@panda:~> pg_restore -l test.out ; ; Archive created at Fri Apr 22 10:07:50 2016 ; dbname: test ; TOC Entries: 67 ; Compression: -1 ; Dump Version: 1.12-0 ; Format: CUSTOM ; Integer: 4 bytes ; Offset: 8 bytes ; Dumped from database version: 9.4.6 ; Dumped by pg_dump version: 9.4.6 ; ; ; Selected TOC Entries: ; 2702; 1262 983301 DATABASE - test postgres 9; 2615 1298825 SCHEMA - MASTER_USER postgres 8; 2615 2200 SCHEMA - public postgres 2703; 0 0 COMMENT - SCHEMA public postgres 2704; 0 0 ACL - public postgres 2; 3079 12456 EXTENSION - plpgsql 2705; 0 0 COMMENT - EXTENSION plpgsql 1; 3079 1730602 EXTENSION - plpythonu 2706; 0 0 COMMENT - EXTENSION plpythonu 191; 1255 1057054 FUNCTION public fn_plpgsqltestmulti(character varying) aklaver 210; 1255 1730594 FUNCTION public measurement_insert_trigger() postgres 205; 1255 1065246 FUNCTION public myfunc(refcursor, refcursor) aklaver 206; 1255 1065247 FUNCTION public myfunc(refcursor, refcursor, integer) aklaver 209; 1255 1065248 FUNCTION public myfunc(refcursor, refcursor, character varying) aklaver 208; 1255 1730587 FUNCTION public pg_stat_allusers() postgres 211; 1255 1730607 FUNCTION public test() postgres 207; 1255 1730585 FUNCTION public trigger_test() aklaver 198; 1255 1299304 FUNCTION public user_update() postgres 180; 1259 1298826 TABLE MASTER_USER test_tbl postgres 175; 1259 1016073 TABLE public a aklaver 178; 1259 1057055 TABLE public cash_journal aklaver 2707; 0 0 COMMENT public COLUMN cash_journal.click aklaver 2708; 0 0 COMMENT public COLUMN cash_journal.cash_journal_id aklaver 2709; 0 0 COMMENT public COLUMN cash_journal.fairian_id aklaver 2710; 0 0 COMMENT public COLUMN cash_journal.debit aklaver 2711; 0 0 COMMENT public COLUMN cash_journal.credit aklaver 2712; 0 0 COMMENT public COLUMN cash_journal.balance aklaver 2713; 0 0 COMMENT public COLUMN cash_journal.description aklaver 182; 1259 1299634 TABLE public company postgres 2714; 0 0 ACL public company postgres 183; 1259 1727447 TABLE public final hplc_admin 189; 1259 1730617 TABLE public gold1604_test aklaver 190; 1259 1730623 TABLE public goldmast_test aklaver 188; 1259 1730609 TABLE public jsonb_test aklaver 179; 1259 1081630 TABLE public max_test aklaver 186; 1259 1730588 TABLE public measurement postgres 187; 1259 1730591 TABLE public measurement_y2016m03 postgres 176; 1259 1016078 TABLE public tbl_a aklaver 177; 1259 1016082 TABLE public tbl_b aklaver 185; 1259 1730581 TABLE public trigger_test aklaver 184; 1259 1727466 TABLE public user_tbl hplc_admin 181; 1259 1299615 TABLE public xid_test aklaver 2687; 0 1298826 TABLE DATA MASTER_USER test_tbl postgres 2682; 0 1016073 TABLE DATA public a aklaver 2685; 0 1057055 TABLE DATA public cash_journal aklaver 2689; 0 1299634 TABLE DATA public company postgres 2690; 0 1727447 TABLE DATA public final hplc_admin 2696; 0 1730617 TABLE DATA public gold1604_test aklaver 2697; 0 1730623 TABLE DATA public goldmast_test aklaver 2695; 0 1730609 TABLE DATA public jsonb_test aklaver 2686; 0 1081630 TABLE DATA public max_test aklaver 2693; 0 1730588 TABLE DATA public measurement postgres 2694; 0 1730591 TABLE DATA public measurement_y2016m03 postgres 2683; 0 1016078 TABLE DATA public tbl_a aklaver 2684; 0 1016082 TABLE DATA public tbl_b aklaver 2692; 0 1730581 TABLE DATA public trigger_test aklaver 2691; 0 1727466 TABLE DATA public user_tbl hplc_admin 2688; 0 1299615 TABLE DATA public xid_test aklaver 2563; 2606 1016077 CONSTRAINT public a_pkey aklaver 2565; 2606 1299641 CONSTRAINT public company_pkey postgres 2567; 2606 1727454 CONSTRAINT public final_pkey hplc_admin 2569; 2606 1727473 CONSTRAINT public user_tbl_pkey hplc_admin 2572; 2620 1730595 TRIGGER public insert_measurement_trigger postgres 2571; 2620 1730601 TRIGGER public test_trigger aklaver 2570; 2606 1727474 FK CONSTRAINT public g_s_fk hplc_admin > > And today, I thought: "time has passed, I'm sure that pg_dump must > magically have an option to get the output in some kind of order, by > now"... 'man pg_dump' didn't help. And as I can read this discussion (I > haven't finished yet, obviously), this is not the case. > > À+ > Pierre -- Adrian Klaver adrian.klaver@aklaver.com
Le 22/04/2016 19:11, Adrian Klaver a écrit : >> Last time I had to do this kind of exercise, a few years ago, I was in a >> remote place without Internet access, so I could not get any information >> or ask any help. I was kind of surprised/frustrated by the (apparent) >> lack of order of the pg_dump output. So I manually wrote scripts to >> export the tables and views' definitions separately, one by one, (using >> pg_dump, of course) and stack them in the order I wished into a large >> file. That was quite suboptimal, but it worked as expected, and I was >> able to diff and patch correctly. > > You realize there is pg_restore -l and pg_restore -L : > > http://www.postgresql.org/docs/9.5/interactive/app-pgrestore.html Yes, thanks for the advice. Now I remember that I had used it also: I just found these notes in my numeric attic: # pierre@autan: ~ < 2013_08_17__17_00_23 > pg_restore -l database_2013_08_14_20h34.pg_dump -n pierre > tt_schema_pierre # pierre@autan: ~ < 2013_08_17__17_00_23 > vi tt_schema_pierre # pierre@autan: ~ < 2013_08_17__17_00_23 > cat tt_schema_pierre DROP VIEW IF EXISTS pierre.dh_collars; DROP VIEW IF EXISTS pierre.dh_litho; DROP VIEW IF EXISTS pierre.dh_sampling_grades; DROP VIEW IF EXISTS pierre.dh_sampling; DROP VIEW IF EXISTS pierre.topo_points; DROP VIEW IF EXISTS pierre.baselines; DROP VIEW IF EXISTS pierre.dh_devia; ... I just cannot remember why it did not fulfill my needs, so that I rather opted to pg_dump's. ... > It does not solve all problems but it does do some ordering and is amenable to be > imported as space separated file for further ordering : > > aklaver@panda:~> pg_restore -l test.out > ; > ; Archive created at Fri Apr 22 10:07:50 2016 > ; dbname: test > ; TOC Entries: 67 > ; Compression: -1 > ; Dump Version: 1.12-0 > ; Format: CUSTOM > ; Integer: 4 bytes > ; Offset: 8 bytes > ; Dumped from database version: 9.4.6 > ; Dumped by pg_dump version: 9.4.6 > ; > ; > ; Selected TOC Entries: > ; > 2702; 1262 983301 DATABASE - test postgres > 9; 2615 1298825 SCHEMA - MASTER_USER postgres > 8; 2615 2200 SCHEMA - public postgres > 2703; 0 0 COMMENT - SCHEMA public postgres > 2704; 0 0 ACL - public postgres > 2; 3079 12456 EXTENSION - plpgsql > 2705; 0 0 COMMENT - EXTENSION plpgsql ... Thanks, I'll dig a bit more in pg_restore, for my current issues. Pierre -- ____________________________________________________________________________ Pierre Chevalier PChGEI: Pierre Chevalier Géologue Et Informaticien Partenaire DALIBO Mesté Duran 32100 Condom Tél+fax : 09 75 27 45 62 06 37 80 33 64 Émail : pierrechevaliergeolCHEZfree.fr icq# : 10432285 jabber: pierre.chevalier1967@jabber.fr http://pierremariechevalier.free.fr/pierre_chevalier_geologue ____________________________________________________________________________
On 04/22/2016 11:16 AM, Pierre Chevalier Géologue wrote: > Le 22/04/2016 19:11, Adrian Klaver a écrit : >>> Last time I had to do this kind of exercise, a few years ago, I was in a >>> remote place without Internet access, so I could not get any information >>> or ask any help. I was kind of surprised/frustrated by the (apparent) >>> lack of order of the pg_dump output. So I manually wrote scripts to >>> export the tables and views' definitions separately, one by one, (using >>> pg_dump, of course) and stack them in the order I wished into a large >>> file. That was quite suboptimal, but it worked as expected, and I was >>> able to diff and patch correctly. >> >> You realize there is pg_restore -l and pg_restore -L : >> >> http://www.postgresql.org/docs/9.5/interactive/app-pgrestore.html > > Yes, thanks for the advice. Now I remember that I had used it also: I > just found these notes in my numeric attic: > > # pierre@autan: ~ < 2013_08_17__17_00_23 > > pg_restore -l database_2013_08_14_20h34.pg_dump -n pierre > > tt_schema_pierre Yea, you can also use the filtering switches to create a filtered TOC, so: pg_restore -l -s test.out > test_s.toc Now the TOC has only a summary line of what is being done, but it is easy enough to feed it back to pg_restore and have it restore to a plain text file instead of a database: pg_restore -L test_s.toc -f test_s.sql > > # pierre@autan: ~ < 2013_08_17__17_00_23 > > vi tt_schema_pierre > > # pierre@autan: ~ < 2013_08_17__17_00_23 > > cat tt_schema_pierre > DROP VIEW IF EXISTS pierre.dh_collars; > DROP VIEW IF EXISTS pierre.dh_litho; > DROP VIEW IF EXISTS pierre.dh_sampling_grades; > DROP VIEW IF EXISTS pierre.dh_sampling; > DROP VIEW IF EXISTS pierre.topo_points; > DROP VIEW IF EXISTS pierre.baselines; > DROP VIEW IF EXISTS pierre.dh_devia; > ... > > > I just cannot remember why it did not fulfill my needs, so that I rather > opted to pg_dump's. > > > ... >> It does not solve all problems but it does do some ordering and is >> amenable to be >> imported as space separated file for further ordering : >> >> aklaver@panda:~> pg_restore -l test.out >> ; >> ; Archive created at Fri Apr 22 10:07:50 2016 >> ; dbname: test >> ; TOC Entries: 67 >> ; Compression: -1 >> ; Dump Version: 1.12-0 >> ; Format: CUSTOM >> ; Integer: 4 bytes >> ; Offset: 8 bytes >> ; Dumped from database version: 9.4.6 >> ; Dumped by pg_dump version: 9.4.6 >> ; >> ; >> ; Selected TOC Entries: >> ; >> 2702; 1262 983301 DATABASE - test postgres >> 9; 2615 1298825 SCHEMA - MASTER_USER postgres >> 8; 2615 2200 SCHEMA - public postgres >> 2703; 0 0 COMMENT - SCHEMA public postgres >> 2704; 0 0 ACL - public postgres >> 2; 3079 12456 EXTENSION - plpgsql >> 2705; 0 0 COMMENT - EXTENSION plpgsql > ... > > Thanks, I'll dig a bit more in pg_restore, for my current issues. > > Pierre -- Adrian Klaver adrian.klaver@aklaver.com