Re: Proposal: Conflict log history table for Logical Replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Proposal: Conflict log history table for Logical Replication
Date
Msg-id CAFiTN-s9WWLOhW1TO27NtJwGf0bh2+MWyp3NEkZFeN_S5_p_rA@mail.gmail.com
Whole thread Raw
In response to Re: Proposal: Conflict log history table for Logical Replication  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks for the feedback Bharath

> On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > I was looking into another thread where we provide an error table for
> > COPY [1], it requires the user to pre-create the error table. And
> > inside the COPY command we will validate the table, validation in that
> > context is a one-time process checking for: (1) table existence, (2)
> > ability to acquire a sufficient lock, (3) INSERT privileges, and (4)
> > matching column names and data types. This approach avoids concerns
> > about the user's DROP or ALTER permissions.
> >
> > Our requirement for the logical replication conflict log table
> > differs, as we must validate the target table upon every conflict
> > insertion, not just at subscription creation. A more robust
> > alternative is to perform validation and acquire a lock on the
> > conflict table whenever the subscription worker starts. This prevents
> > modifications (like ALTER or DROP) while the worker is active. When
> > the worker gets restarted, we can re-validate the table and
> > automatically disable the conflict logging feature if validation
> > fails.  And this can be enabled by ALTER SUBSCRIPTION by setting the
> > option again.
>
> Having to worry about ALTER/DROP and adding code to protect seems like
> an overkill.

IMHO eventually if we can control that I feel this is a good goal to
have.  So that we can avoid failure during conflict insertion.  We may
argue its user's responsibility to not alter the table and we can just
check the validity during create/alter subscription.

> > And if we want in first version we can expect user to create the table
> > as per the expected schema and supply it, this will avoid the need of
> > handling how to avoid it from publishing as it will be user's
> > responsibility and then in top up patches we can also allow to create
> > the table internally if tables doesn't exist and then we can find out
> > solution to avoid it from being publish when ALL TABLES are published.
>
> This looks much more simple to start with.

Right.

PFA, attached WIP patches, 0001 allow user created tables to provide
as input for conflict history tables and we will validate the table
during create/alter subscription.  0002 add an option to internally
create the table if it does not exist.

TODO:
- Still patches are WIP and need more work testing for different failure cases
- Need to explore an option to create a built-in type (I will start a
separate thread for the same)
- Need to add test cases
- Need to explore options to avoid getting published, but maybe we
only need to avoid this when we internally create the table?

Here is some basic test I tried:

psql -d postgres -c "CREATE TABLE test(a int, b int, primary key(a));"
psql -d postgres -p 5433 -c "CREATE SCHEMA myschema"
psql -d postgres -p 5433 -c "CREATE TABLE test(a int, b int, primary key(a));"
psql -d postgres -p 5433 -c "GRANT INSERT, UPDATE, SELECT, DELETE ON
test TO dk "
psql -d postgres -c "CREATE PUBLICATION pub FOR ALL TABLES ;"

psql -d postgres -p 5433 -c "CREATE SUBSCRIPTION sub CONNECTION
'dbname=postgres port=5432' PUBLICATION pub
WITH(conflict_log_table=myschema.conflict_log_history)";
psql -d postgres -p 5432 -c "INSERT INTO test VALUES(1,2);"
psql -d postgres -p 5433 -c "UPDATE test SET b=10 WHERE a=1;"
psql -d postgres -p 5432 -c "UPDATE test SET b=20 WHERE a=1;"

postgres[1202034]=# select * from myschema.conflict_log_history ;
-[ RECORD 1 ]-----+------------------------------
relid             | 16385
local_xid         | 763
remote_xid        | 757
local_lsn         | 0/00000000
remote_commit_lsn | 0/0174AB30
local_commit_ts   | 2025-09-14 06:45:00.828874+00
remote_commit_ts  | 2025-09-14 06:45:05.845614+00
table_schema      | public
table_name        | test
conflict_type     | update_origin_differs
local_origin      |
remote_origin     | pg_16396
key_tuple         | {"a":1,"b":20}
local_tuple       | {"a":1,"b":10}
remote_tuple      | {"a":1,"b":20}


--
Regards,
Dilip Kumar
Google

Attachment

pgsql-hackers by date:

Previous
From: jian he
Date:
Subject: remove unnecessary include in src/backend/commands/policy.c
Next
From: Florian Weimer
Date:
Subject: Ignoring symlinks when recovering time zone identifiers in initdb