Thread: [ADMIN] replication issue

[ADMIN] replication issue

From

"Keefer, Brad (ITS)"

Date:

04 October 2017, 18:28:20

So, I am migrating from a single Postgres DB node to a clustered setup.

I want to use BDR and run a multi-master cluster, so I have that configured between my 3 DB nodes.

I restored the DB backup from the single DB node on one of the BDR-connected nodes, and saw that it replicated the schema and tables ok, but I’m finding some weird behavior once I begin to use the cluster.

I have a user who is trying to log into the application that is using this Postgres DB as a backend, but he is unable to. Looking at the logs, it appears that when Postgres is trying to place his id in a table called auth_user, it is complaining that the key already exists. Looking at a table called auth_user_id_seq, I see that the information is different on each of the BDR nodes, and that makes me wonder why, since it should match up with what I restored.

This is what it looks like:

node01:

awx=# SELECT * FROM auth_user_id_seq;

sequence_name | last_value | start_value | increment_by | max_value

------------------+------------+-------------+--------------+-------------------

--+-----------+-------------+---------+-----------+-----------+--------

auth_user_id_seq | 50 | 1 | 1 | 922337203685477580

7 | 1 | 1 | 0 | f | t |

(1 row)

awx=#

node02:

awx=# SELECT * FROM auth_user_id_seq;

sequence_name | last_value | start_value | increment_by | max_value

------------------+------------+-------------+--------------+-------------------

--+-----------+-------------+---------+-----------+-----------+--------

auth_user_id_seq | 17 | 1 | 1 | 922337203685477580

7 | 1 | 1 | 30 | f | t |

(1 row)

awx=#

node03:

awx=# SELECT * FROM auth_user_id_seq;

sequence_name | last_value | start_value | increment_by | max_value

------------------+------------+-------------+--------------+-------------------

--+-----------+-------------+---------+-----------+-----------+--------

auth_user_id_seq | 4 | 1 | 1 | 922337203685477580

7 | 1 | 1 | 31 | f | t |

(1 row)

awx=#

I have to think that I missed a step somewhere during the restore process, or configuring the replication, that would lead to this… My expectation would be that those three entries would be the same.

We are also having other duplicate-key-related issues in the app, like creating a new template.

Any help you could offer would be greatly appreciated.

Thanks.

Brad Keefer

Red Hat Consultant, ITS, RHCSA

Office of Information Technology Services

W A Harriman State Campus, Bldg 8, Floor 7

(919)210-7045 | Brad.Keefer@its.ny.gov

www.its.ny.gov

Re: [ADMIN] replication issue

From

Alvaro Aguayo Garcia-Rada

Date:

04 October 2017, 19:00:26

Hi.
Just restoring your non-BDR backup into a BDR cluster will not avoid future problems with sequences.
At first glance, the problem you are having is that all your sequences are local sequences(the only sequence type
withoutBDR). That works fine on an standalone, non-BDR node, but it could not work on a BDR cluster.
 
If you create a sequence on a BDR cluster, that sequence will be created on all nodes. However, if you call nextval for
thatsequence on your 3 nodes, you will probably get the same values on each one. That's why BDR have a different type
ofsequences, "global sequences".
 
BDR Global Sequences are split into chunks, and each group of chunks is assigned to each server in the cluster; nextval
getthe next value on a chunk assigned to the server you call it, guaranteeing that you will get different values on
eachserver. It also guarantees that a second call to nextval(in the same server) will produce a value higher than the
previouscall. However, it does not guarantee time-sequence correlation across servers. More info about global
sequences:http://bdr-project.org/docs/stable/global-sequences.html
 
To solve your problem, you must use global sequences. There's no conversion process, so you must drop your current
sequencesand recreate them as global sequences. Something like this:
 

SELECT nextval('seq_nnnn'); -- Will give you a value, let's say "x"
DROP SEQUENCE seq_nnnn;
CREATE SEQUENCE seq_nnnn START x USING bdr; -- Replace "x" with 

Remember to assign USAGE permission as needed.

That will do the trick.

Saludos,

Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.

Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103  | RPC: (+51) 954183248
Website: www.ocs.pe

----- Original Message -----
From: "Keefer, Brad (ITS)" <Brad.Keefer@its.ny.gov>
To: "pgsql-admin" <pgsql-admin@postgresql.org>
Sent: Wednesday, 4 October, 2017 10:28:20
Subject: [ADMIN] replication issue

So, I am migrating from a single Postgres DB node to a clustered setup.

I want to use BDR and run a multi-master cluster, so I have that configured between my 3 DB nodes.

I restored the DB backup from the single DB node on one of the BDR-connected nodes, and saw that it replicated the
schemaand tables ok, but I'm finding some weird behavior once I begin to use the cluster.
 

I have a user who is trying to log into the application that is using this Postgres DB as a backend, but he is unable
to.Looking at the logs, it appears that when Postgres is trying to place his id in a table called auth_user, it is
complainingthat the key already exists. Looking at a table called auth_user_id_seq, I see that the information is
differenton each of the BDR nodes, and that makes me wonder why, since it should match up with what I restored.
 

This is what it looks like:

node01:
awx=# SELECT * FROM auth_user_id_seq; sequence_name   | last_value | start_value | increment_by |      max_value |
min_value| cache_value | log_cnt | is_cycled | is_called | amdata
 
------------------+------------+-------------+--------------+-------------------
--+-----------+-------------+---------+-----------+-----------+--------
auth_user_id_seq |         50 |           1 |            1 | 922337203685477580
7 |         1 |           1 |       0 | f         | t         |
(1 row)

awx=#


node02:
awx=# SELECT * FROM auth_user_id_seq; sequence_name   | last_value | start_value | increment_by |      max_value |
min_value| cache_value | log_cnt | is_cycled | is_called | amdata
 
------------------+------------+-------------+--------------+-------------------
--+-----------+-------------+---------+-----------+-----------+--------
auth_user_id_seq |         17 |           1 |            1 | 922337203685477580
7 |         1 |           1 |      30 | f         | t         |
(1 row)

awx=#


node03:
awx=# SELECT * FROM auth_user_id_seq; sequence_name   | last_value | start_value | increment_by |      max_value |
min_value| cache_value | log_cnt | is_cycled | is_called | amdata
 
------------------+------------+-------------+--------------+-------------------
--+-----------+-------------+---------+-----------+-----------+--------
auth_user_id_seq |          4 |           1 |            1 | 922337203685477580
7 |         1 |           1 |      31 | f         | t         |
(1 row)

awx=#

I have to think that I missed a step somewhere during the restore process, or configuring the replication, that would
leadto this... My expectation would be that those three entries would be the same.
 

We are also having other duplicate-key-related issues in the app, like creating a new template.

Any help you could offer would be greatly appreciated.

Thanks.

Brad Keefer
Red Hat Consultant, ITS, RHCSA

Office of Information Technology Services
W A Harriman State Campus, Bldg 8, Floor 7
(919)210-7045 | Brad.Keefer@its.ny.gov<mailto:Brad.Keefer@its.ny.gov>
www.its.ny.gov<http://www.its.ny.gov/>


-- 
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin