Thread: RE: Fastest way to duplicate a quite large database
From: Adrian Klaver
Sent: Tuesday, April 12, 2016 12:04
To: Edson Richter; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Fastest way to duplicate a quite large database
On 04/12/2016 07:51 AM, Edson Richter wrote:
> Same machine, same cluster - just different database name.
Hmm, running tests on the same cluster that runs the production
database would seem to be a performance hit against the production
database, and potentially dangerous should the tests trip a bug that
crashes the server.
>
> Best regards,
>
> Edson Carlos Ericksson Richter
>
> On 12/04/2016 11:46, John R Pierce wrote:
>> On 4/12/2016 7:25 AM, Edson Richter wrote:
>>>
>>> I have a database "Customer" with about 60Gb of data.
>>> I know I can back up and restore, but this seems too slow.
>>>
>>> Is there any other option to duplicate this database as
>>> "CustomerTest" as fast as possible (even faster than backup/restore)
>>> - better if in one operation (something like "copy database A to B")?
>>> I would like to run this every day, overnight, with minimal impact,
>>> to prepare a test environment based on production data.
>>
>>
>> copy to the same machine, or copy to a different test server?
>> different answers.
>>
>>
>>
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Hi Adrian,
Thanks for your insight. This is not a "test system" in the sense that I am testing the database server code.
It is more of a "pre-production evaluation": the stage where the customer says "yes" or "no" to publishing a new version of our system into production.
Also, the server has plenty of RAM and processor cores, so I don't foresee any trouble here.
The risk is lower than that of running a heavy reporting system against the database server.
The point is that customers want to test the new version of our system as close as possible to the production environment.
Thanks,
Edson
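For the same-cluster case, the one-operation "copy database A to B" asked about above does exist: CREATE DATABASE can clone an existing database through its TEMPLATE clause, which does a file-level copy and is usually much faster than dump/restore. A minimal sketch, assuming the databases were created with lower-case names (quote the identifiers otherwise) and that no other sessions are connected to the source while the copy runs:

  -- run as a superuser or as the owner of "customer";
  -- fails if any other session is connected to the source database
  DROP DATABASE IF EXISTS customertest;
  CREATE DATABASE customertest TEMPLATE customer;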
Another problem I've found: I used "pg_dump" and "pg_restore" to create the new CustomerTest database in my cluster. Immediately, replication started copying the 60Gb of data to the slave, causing big trouble.
Does marking it as a "template" avoid replication of that copied database?
How can I mark a database as "do not replicate"?
Thanks,
Edson
On 04/13/2016 06:58 AM, Edson Richter wrote:
> Another problem I've found: I used "pg_dump" and "pg_restore" to
> create the new CustomerTest database in my cluster. Immediately,
> replication started copying the 60Gb of data to the slave, causing
> big trouble.
> Does marking it as a "template" avoid replication of that copied
> database?
> How can I mark a database as "do not replicate"?

With the Postgres built-in binary replication you can't; it replicates
the entire cluster. There are third-party solutions that offer that
choice:

http://www.postgresql.org/docs/9.5/interactive/different-replication-solutions.html

Table 25-1. High Availability, Load Balancing, and Replication Feature
Matrix

It has been mentioned before: running a non-production database on the
same cluster as the production database is generally not a good idea.
Per previous suggestions, I would host your CustomerTest database on
another instance/cluster of Postgres listening on a different port.
Then all your customers have to do is create a connection that points
at the new port.

> Thanks,
>
> Edson

--
Adrian Klaver
adrian.klaver@aklaver.com
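A rough sketch of the separate-instance setup Adrian describes, assuming a Linux host with the Postgres binaries on the PATH; the data directory and port 5433 are placeholder choices:

  # create a second, independent cluster for the staging copy
  initdb -D /var/lib/pgsql/staging_data
  # give it its own port so it cannot collide with production
  echo "port = 5433" >> /var/lib/pgsql/staging_data/postgresql.conf
  pg_ctl -D /var/lib/pgsql/staging_data -l staging.log start
  # load the production data into the staging instance
  createdb -p 5433 CustomerTest
  pg_dump Customer | psql -p 5433 CustomerTest

Because the staging instance is a separate cluster, nothing loaded into it is seen by the production cluster's replication.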
On 13/04/2016 11:18, Adrian Klaver wrote:
> On 04/13/2016 06:58 AM, Edson Richter wrote:
>> Another problem I've found: I used "pg_dump" and "pg_restore" to
>> create the new CustomerTest database in my cluster. Immediately,
>> replication started copying the 60Gb of data to the slave, causing
>> big trouble.
>> Does marking it as a "template" avoid replication of that copied
>> database?
>> How can I mark a database as "do not replicate"?
>
> With the Postgres built-in binary replication you can't; it replicates
> the entire cluster. There are third-party solutions that offer that
> choice:
>
> http://www.postgresql.org/docs/9.5/interactive/different-replication-solutions.html
>
> Table 25-1. High Availability, Load Balancing, and Replication
> Feature Matrix

Thanks, I'll look at that.

> It has been mentioned before: running a non-production database on
> the same cluster as the production database is generally not a good
> idea. Per previous suggestions, I would host your CustomerTest
> database on another instance/cluster of Postgres listening on a
> different port. Then all your customers have to do is create a
> connection that points at the new port.

Thanks for the concern.
This "CustomerTest" database is a staging environment, for customer
approval before upgrading the production system.
I bet the users will only open the system and say it is OK. As busy as
people are these days, I doubt they will validate something that is
already validated by our development team.
But our contractor requires it, and we provide.
Since we have "express delivery of new versions" (almost 2 per week),
we would like to automate the staging environment.

Thanks,

Edson
On 04/13/2016 07:46 AM, Edson Richter wrote:
> On 13/04/2016 11:18, Adrian Klaver wrote:
>> On 04/13/2016 06:58 AM, Edson Richter wrote:
>>> Another problem I've found: I used "pg_dump" and "pg_restore" to
>>> create the new CustomerTest database in my cluster. Immediately,
>>> replication started copying the 60Gb of data to the slave, causing
>>> big trouble.
>>> Does marking it as a "template" avoid replication of that copied
>>> database?
>>> How can I mark a database as "do not replicate"?
>>
>> With the Postgres built-in binary replication you can't; it
>> replicates the entire cluster. There are third-party solutions that
>> offer that choice:
>>
>> http://www.postgresql.org/docs/9.5/interactive/different-replication-solutions.html
>>
>> Table 25-1. High Availability, Load Balancing, and Replication
>> Feature Matrix
>
> Thanks, I'll look at that.
>
>> It has been mentioned before: running a non-production database on
>> the same cluster as the production database is generally not a good
>> idea. Per previous suggestions, I would host your CustomerTest
>> database on another instance/cluster of Postgres listening on a
>> different port. Then all your customers have to do is create a
>> connection that points at the new port.
>
> Thanks for the concern.
> This "CustomerTest" database is a staging environment, for customer
> approval before upgrading the production system.
> I bet the users will only open the system and say it is OK. As busy
> as people are these days, I doubt they will validate something that is

Not necessarily a bet I would count on :)

> already validated by our development team.
> But our contractor requires it, and we provide.
> Since we have "express delivery of new versions" (almost 2 per week),
> we would like to automate the staging environment.

My guess is that setting up a different cluster and using pg_basebackup
(as John suggested) will be a lot easier to automate than creating a
differential replication setup. Anyway, I have beaten this drum long
enough.

> Thanks,
>
> Edson

--
Adrian Klaver
adrian.klaver@aklaver.com
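pg_basebackup clones the entire cluster at the file level, so the staging instance starts out as a byte-for-byte copy of production, which makes the refresh easy to script. A sketch, assuming production already accepts replication connections (wal_level and pg_hba.conf configured for it); the host, user, path, and port are placeholders:

  # clone the production cluster into a fresh data directory
  pg_basebackup -h prod_host -U repl_user \
      -D /var/lib/pgsql/staging_data -X stream --progress
  # give the clone its own port, then run it as an independent server
  echo "port = 5433" >> /var/lib/pgsql/staging_data/postgresql.conf
  pg_ctl -D /var/lib/pgsql/staging_data start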
On 04/13/2016 08:46 AM, Edson Richter wrote:
> On 13/04/2016 11:18, Adrian Klaver wrote:
>> On 04/13/2016 06:58 AM, Edson Richter wrote:
>>> Another problem I've found: I used "pg_dump" and "pg_restore" to
>>> create the new CustomerTest database in my cluster. Immediately,
>>> replication started copying the 60Gb of data to the slave, causing
>>> big trouble.
>>> Does marking it as a "template" avoid replication of that copied
>>> database?
>>> How can I mark a database as "do not replicate"?
>>
>> With the Postgres built-in binary replication you can't; it
>> replicates the entire cluster. There are third-party solutions that
>> offer that choice:
>>
>> http://www.postgresql.org/docs/9.5/interactive/different-replication-solutions.html
>>
>> Table 25-1. High Availability, Load Balancing, and Replication
>> Feature Matrix
>
> Thanks, I'll look at that.
>
>> It has been mentioned before: running a non-production database on
>> the same cluster as the production database is generally not a good
>> idea. Per previous suggestions, I would host your CustomerTest
>> database on another instance/cluster of Postgres listening on a
>> different port. Then all your customers have to do is create a
>> connection that points at the new port.
>
> Thanks for the concern.
> This "CustomerTest" database is a staging environment, for customer
> approval before upgrading the production system.
> I bet the users will only open the system and say it is OK. As busy
> as people are these days, I doubt they will validate something that
> is already validated by our development team.
> But our contractor requires it, and we provide.
> Since we have "express delivery of new versions" (almost 2 per week),
> we would like to automate the staging environment.
>
> Thanks,
>
> Edson

Have you tried this?

To copy a database named prod to a database named prod_test on the same
cluster (same server):

  createdb prod_test
  pg_dump --clean prod | psql prod_test

To copy from the prod cluster to another cluster/server (i.e. the test
server):

  createdb -h [test_server_ip] prod_test
  pg_dump --clean prod | psql -h [test_server_ip] prod_test
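Because pg_dump takes a consistent snapshot without blocking writers, this refresh can run unattended overnight, as asked for at the start of the thread. A hypothetical cron entry (the schedule and host are placeholders):

  # refresh the staging copy at 02:00 every night
  0 2 * * * pg_dump --clean prod | psql -h [test_server_ip] prod_test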