Thread: Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?
Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?
From
Koen De Groote
Date:
If this question is more suitable for another mailing list, please let me know.
I've set up the following table on both publisher and subscriber, both are pg16:
CREATE TABLE dummy_table (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
age INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
age INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Added to publication, refreshed subscription.
Add some data on the publisher side:
INSERT INTO dummy_table (name, email, age)
VALUES
('John Doe', 'john.doe@example.com', 25),
('Jane Smith', 'jane.smith@example.com', 30),
('Michael Johnson', 'michael.j@example.com', 45),
('Emily Davis', 'emily.d@example.com', 27),
('Robert Brown', 'robert.brown@example.com', 40);
VALUES
('John Doe', 'john.doe@example.com', 25),
('Jane Smith', 'jane.smith@example.com', 30),
('Michael Johnson', 'michael.j@example.com', 45),
('Emily Davis', 'emily.d@example.com', 27),
('Robert Brown', 'robert.brown@example.com', 40);
The data can be seen on the subscriber. So far, so good.
I then execute the following patch on the publisher: https://gist.github.com/KoenDG/d1c06d8c740c64e4e5884d0c64b81f11
It is a single transaction that does the following:
1/ Insert data, 1000 items
2/ Drop a column
3/ Alter a column name
4/ Add 2 columns, nullable
5/ Add a column and give it a unique constraint
6/ Update values for a column with NULL values, added in step 4.
7/ Set the column updated in step 6 to be NOT NULL
8/ Create a unique index with the columns from step 3 and 6
9/ Insert a column with a default value
10/ Insert data for this schema, another 1000 items.
The subscription disabled, this is to be expected, there are new columns names, the schema needs to be updated on the subscriber side.
However, it seems I'm stuck.
I can't enable the subscription. This is to be expected, it will try to resume and run into the same issues.
Ok, I update the schema and enable again. It runs into an error for the inserts of step 1. These set values for columns dropped in step 2.
I revert to the old schema and enable again. It runs into an error again, this time for values that don't exist yet at step 1.
I tried dropping the table at the subscriber side, recreating the correct schema, but this runs into the same error.
I remove the table from the publication and retry. Same error. Even with the table no longer in the publication, and the table on the subscriber side dropped and re-created, I'm still getting the exact same errors of "logical replication target relation "public.dummy_table" is missing replicated columns: "contact_email", "status", "phone_number", "username""
The only solution I've found is to drop the table from the publication, and then drop the entire subscription and set it back up again, with the correct schema.
Am I making a mistake? Or does putting all these commands in a single transaction ruin my chances?
Clarification much appreciated.
Regards,
Koen De Groote
Re: Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?
From
Muhammad Usman Khan
Date:
Hi,
When you execute schema-altering operations on the publisher, these changes are not automatically replicated to the subscriber which causes the following error
logical replication target relation "public.dummy_table" is missing replicated columns: "contact_email", "status", "phone_number", "username"
Before making schema changes, temporarily disable the subscription to prevent replication errors.
logical replication target relation "public.dummy_table" is missing replicated columns: "contact_email", "status", "phone_number", "username"
Before making schema changes, temporarily disable the subscription to prevent replication errors.
ALTER SUBSCRIPTION your_subscription_name DISABLE;
Manually apply the same schema modifications to the subscriber database to ensure alignment.
Once the schema changes are applied to both databases, re-enable the subscription:
ALTER SUBSCRIPTION your_subscription_name ENABLE;
On Thu, 17 Oct 2024 at 02:59, Koen De Groote <kdg.dev@gmail.com> wrote:
If this question is more suitable for another mailing list, please let me know.I've set up the following table on both publisher and subscriber, both are pg16:CREATE TABLE dummy_table (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
age INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);Added to publication, refreshed subscription.Add some data on the publisher side:INSERT INTO dummy_table (name, email, age)
VALUES
('John Doe', 'john.doe@example.com', 25),
('Jane Smith', 'jane.smith@example.com', 30),
('Michael Johnson', 'michael.j@example.com', 45),
('Emily Davis', 'emily.d@example.com', 27),
('Robert Brown', 'robert.brown@example.com', 40);The data can be seen on the subscriber. So far, so good.I then execute the following patch on the publisher: https://gist.github.com/KoenDG/d1c06d8c740c64e4e5884d0c64b81f11It is a single transaction that does the following:1/ Insert data, 1000 items2/ Drop a column3/ Alter a column name4/ Add 2 columns, nullable5/ Add a column and give it a unique constraint6/ Update values for a column with NULL values, added in step 4.7/ Set the column updated in step 6 to be NOT NULL8/ Create a unique index with the columns from step 3 and 69/ Insert a column with a default value10/ Insert data for this schema, another 1000 items.The subscription disabled, this is to be expected, there are new columns names, the schema needs to be updated on the subscriber side.However, it seems I'm stuck.I can't enable the subscription. This is to be expected, it will try to resume and run into the same issues.Ok, I update the schema and enable again. It runs into an error for the inserts of step 1. These set values for columns dropped in step 2.I revert to the old schema and enable again. It runs into an error again, this time for values that don't exist yet at step 1.I tried dropping the table at the subscriber side, recreating the correct schema, but this runs into the same error.I remove the table from the publication and retry. Same error. Even with the table no longer in the publication, and the table on the subscriber side dropped and re-created, I'm still getting the exact same errors of "logical replication target relation "public.dummy_table" is missing replicated columns: "contact_email", "status", "phone_number", "username""The only solution I've found is to drop the table from the publication, and then drop the entire subscription and set it back up again, with the correct schema.Am I making a mistake? Or does putting all these commands in a single transaction ruin my chances?Clarification much appreciated.Regards,Koen De Groote
Re: Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?
From
Koen De Groote
Date:
Hello Muhammad,
The problem with my scenario is the changes are written as a single transaction, with a BEGIN and COMMIT. In that transaction, there are first inserts, then a schema change, and then inserts on the new schema.
Doing as you said does not work. The subscriber will first complain it cannot do the last inserts. If I fix the schema, it will complain it cannot do the first inserts.
I thought I would be able to drop the table from the publication, and then do the subscription again, but that fails, as the subscription is disabled and cannot be enabled again, even if I remove the table from publication.
On Thu, Oct 17, 2024 at 5:49 AM Muhammad Usman Khan <usman.k@bitnine.net> wrote:
Hi,When you execute schema-altering operations on the publisher, these changes are not automatically replicated to the subscriber which causes the following error
logical replication target relation "public.dummy_table" is missing replicated columns: "contact_email", "status", "phone_number", "username"
Before making schema changes, temporarily disable the subscription to prevent replication errors.ALTER SUBSCRIPTION your_subscription_name DISABLE;Manually apply the same schema modifications to the subscriber database to ensure alignment.Once the schema changes are applied to both databases, re-enable the subscription:ALTER SUBSCRIPTION your_subscription_name ENABLE;On Thu, 17 Oct 2024 at 02:59, Koen De Groote <kdg.dev@gmail.com> wrote:If this question is more suitable for another mailing list, please let me know.I've set up the following table on both publisher and subscriber, both are pg16:CREATE TABLE dummy_table (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
age INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);Added to publication, refreshed subscription.Add some data on the publisher side:INSERT INTO dummy_table (name, email, age)
VALUES
('John Doe', 'john.doe@example.com', 25),
('Jane Smith', 'jane.smith@example.com', 30),
('Michael Johnson', 'michael.j@example.com', 45),
('Emily Davis', 'emily.d@example.com', 27),
('Robert Brown', 'robert.brown@example.com', 40);The data can be seen on the subscriber. So far, so good.I then execute the following patch on the publisher: https://gist.github.com/KoenDG/d1c06d8c740c64e4e5884d0c64b81f11It is a single transaction that does the following:1/ Insert data, 1000 items2/ Drop a column3/ Alter a column name4/ Add 2 columns, nullable5/ Add a column and give it a unique constraint6/ Update values for a column with NULL values, added in step 4.7/ Set the column updated in step 6 to be NOT NULL8/ Create a unique index with the columns from step 3 and 69/ Insert a column with a default value10/ Insert data for this schema, another 1000 items.The subscription disabled, this is to be expected, there are new columns names, the schema needs to be updated on the subscriber side.However, it seems I'm stuck.I can't enable the subscription. This is to be expected, it will try to resume and run into the same issues.Ok, I update the schema and enable again. It runs into an error for the inserts of step 1. These set values for columns dropped in step 2.I revert to the old schema and enable again. It runs into an error again, this time for values that don't exist yet at step 1.I tried dropping the table at the subscriber side, recreating the correct schema, but this runs into the same error.I remove the table from the publication and retry. Same error. Even with the table no longer in the publication, and the table on the subscriber side dropped and re-created, I'm still getting the exact same errors of "logical replication target relation "public.dummy_table" is missing replicated columns: "contact_email", "status", "phone_number", "username""The only solution I've found is to drop the table from the publication, and then drop the entire subscription and set it back up again, with the correct schema.Am I making a mistake? Or does putting all these commands in a single transaction ruin my chances?Clarification much appreciated.Regards,Koen De Groote
Re: Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?
From
Michał Kłeczek
Date:
> On 17 Oct 2024, at 11:07, Koen De Groote <kdg.dev@gmail.com> wrote: > > Hello Muhammad, > > The problem with my scenario is the changes are written as a single transaction, with a BEGIN and COMMIT. In that transaction,there are first inserts, then a schema change, and then inserts on the new schema. I guess until logical replication of DDL is available you’re out of luck. The best you can do is to have a separate table for recording and replaying schema changes. Create triggers that perform actual DDL operations based on DML in this table. Publish this table on the publisher in the same publication as the tables affected by the DDL. On the subscriber side it is the same - just make the trigger is marked as ENABLE REPLICA TRIGGER or ENABLE ALWAYS TRIGGER. Kind regards, Michał
Re: Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?
From
Koen De Groote
Date:
Hello Michał,
Thanks for the reply. I suspected as much, I was just wondering if there was an easy fix that didn't involve dropping the entire subscription and having to re-do all the table because of that. Guess my only option is to remove the affected tables from the publisher before the patch, refresh subscription, do the patch, recreate the tables on the subscriber and do the sync for only those tables.
I will look in to your suggestion.
Regards,
Koen De Groote
On Thu, Oct 17, 2024 at 11:17 AM Michał Kłeczek <michal@kleczek.org> wrote:
> On 17 Oct 2024, at 11:07, Koen De Groote <kdg.dev@gmail.com> wrote:
>
> Hello Muhammad,
>
> The problem with my scenario is the changes are written as a single transaction, with a BEGIN and COMMIT. In that transaction, there are first inserts, then a schema change, and then inserts on the new schema.
I guess until logical replication of DDL is available you’re out of luck.
The best you can do is to have a separate table for recording and replaying schema changes.
Create triggers that perform actual DDL operations based on DML in this table.
Publish this table on the publisher in the same publication as the tables affected by the DDL.
On the subscriber side it is the same - just make the trigger is marked as ENABLE REPLICA TRIGGER or ENABLE ALWAYS TRIGGER.
Kind regards,
Michał