Thread: Replication Recommendation

Replication Recommendation

From
Lee Hachadoorian
Date:
There are a wide variety of Postgres replication solutions, and I
would like advice on which one would be appropriate to my use case.

* Small (~half dozen) distributed workforce using a file sharing
service, but without access to direct network connection over the
internet
* Database is updated infrequently, when new government agency data
releases replace old data
* Because database is updated infrequently, workforce can come
together for LAN-based replication as needed
* Entire database is on the order of a few GB

Given this, I am considering the super lowtech "replication" solution
of updating "master" and doing a full database drop and restore on the
"slaves". But I would like to know which of the other (real)
replication solutions might work for this use case.

Regards,
--Lee

--
Lee Hachadoorian
Assistant Professor of Instruction, Geography and Urban Studies
Assistant Director, Professional Science Master's in GIS
Temple University


Re: Replication Recommendation

From
Adrian Klaver
Date:
On 09/12/2016 12:46 PM, Lee Hachadoorian wrote:
> There are a wide variety of Postgres replication solutions, and I
> would like advice on which one would be appropriate to my use case.
>
> * Small (~half dozen) distributed workforce using a file sharing
> service, but without access to direct network connection over the
> internet
> * Database is updated infrequently, when new government agency data
> releases replace old data
> * Because database is updated infrequently, workforce can come
> together for LAN-based replication as needed
> * Entire database is on the order of a few GB
>
> Given this, I am considering the super lowtech "replication" solution
> of updating "master" and doing a full database drop and restore on the
> "slaves". But I would like to know which of the other (real)
> replication solutions might work for this use case.

If I follow correctly the layout is?:

        Main database <--- Govt. data
                    |
                         |
                        \ /

           File share
                         |
                         |
                        \ /

DB       DB        DB       DB       DB        DB

User 1   User 2    User 3   User 4   User 5    User 6



For your simple scenario you might want to look at:

https://www.postgresql.org/docs/9.5/static/app-pgbasebackup.html


>
> Regards,
> --Lee
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Replication Recommendation

From
Lee Hachadoorian
Date:
On Mon, Sep 12, 2016 at 5:12 PM, Adrian Klaver
<adrian.klaver@aklaver.com> wrote:
> On 09/12/2016 12:46 PM, Lee Hachadoorian wrote:
>>
>> There are a wide variety of Postgres replication solutions, and I
>> would like advice on which one would be appropriate to my use case.
>>
>> * Small (~half dozen) distributed workforce using a file sharing
>> service, but without access to direct network connection over the
>> internet
>> * Database is updated infrequently, when new government agency data
>> releases replace old data
>> * Because database is updated infrequently, workforce can come
>> together for LAN-based replication as needed
>> * Entire database is on the order of a few GB
>>
>> Given this, I am considering the super lowtech "replication" solution
>> of updating "master" and doing a full database drop and restore on the
>> "slaves". But I would like to know which of the other (real)
>> replication solutions might work for this use case.
>
>
> If I follow correctly the layout is?:
>
>                 Main database <--- Govt. data
>                         |
>                         |
>                        \ /
>
>                    File share
>                         |
>                         |
>                        \ /
>
> DB       DB        DB       DB       DB        DB
>
> User 1   User 2    User 3   User 4   User 5    User 6
>
>
>
> For your simple scenario you might want to look at:
>
> https://www.postgresql.org/docs/9.5/static/app-pgbasebackup.html
>

That diagram is what I am proposing.

pg_basebackup looks interesting. My initial impression is that the
main gain would be for a multiple database cluster. Are there other
advantages to using this in preference to a full DB dump and restore
if all of our data will be in a single database?

Best,
--Lee


Re: Replication Recommendation

From
Adrian Klaver
Date:
On 09/12/2016 02:35 PM, Lee Hachadoorian wrote:
> On Mon, Sep 12, 2016 at 5:12 PM, Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
>> On 09/12/2016 12:46 PM, Lee Hachadoorian wrote:
>>>
>>> There are a wide variety of Postgres replication solutions, and I
>>> would like advice on which one would be appropriate to my use case.
>>>
>>> * Small (~half dozen) distributed workforce using a file sharing
>>> service, but without access to direct network connection over the
>>> internet
>>> * Database is updated infrequently, when new government agency data
>>> releases replace old data
>>> * Because database is updated infrequently, workforce can come
>>> together for LAN-based replication as needed
>>> * Entire database is on the order of a few GB
>>>
>>> Given this, I am considering the super lowtech "replication" solution
>>> of updating "master" and doing a full database drop and restore on the
>>> "slaves". But I would like to know which of the other (real)
>>> replication solutions might work for this use case.
>>
>>
>> If I follow correctly the layout is?:
>>
>>                 Main database <--- Govt. data
>>                         |
>>                         |
>>                        \ /
>>
>>                    File share
>>                         |
>>                         |
>>                        \ /
>>
>> DB       DB        DB       DB       DB        DB
>>
>> User 1   User 2    User 3   User 4   User 5    User 6
>>
>>
>>
>> For your simple scenario you might want to look at:
>>
>> https://www.postgresql.org/docs/9.5/static/app-pgbasebackup.html
>>
>
> That diagram is what I am proposing.
>
> pg_basebackup looks interesting. My initial impression is that the
> main gain would be for a multiple database cluster. Are there other
> advantages to using this in preference to a full DB dump and restore
> if all of our data will be in a single database?

Not sure.

pg_basebackup can:

"There is no guarantee that all WAL files required for the backup are
archived at the end of backup. If you are planning to use the backup for
an archive recovery and want to ensure that all required files are
available at that moment, you need to include them into the backup by
using -x option."

At that point you have a complete $DATADIR. So on your user machines it
then becomes a matter of stopping the server clearing out the old
$DATADIR and dropping the new one in place and starting the server.
Whether that is faster then having pg_restore connect to a database and
then process the dump file is something you will have to test.

>
> Best,
> --Lee
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Replication Recommendation

From
Vick Khera
Date:
On Mon, Sep 12, 2016 at 3:46 PM, Lee Hachadoorian
<Lee.Hachadoorian+L@gmail.com> wrote:
> * Because database is updated infrequently, workforce can come
> together for LAN-based replication as needed
> * Entire database is on the order of a few GB

Just update one copy, then send pg_dump's to the others for stomping
over the old one.