Home > mailing lists

Re: row filtering for logical replication - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: row filtering for logical replication
Date	December 25, 2021 04:20:44
Msg-id	CAA4eK1KMmWhRUw-reLxnBw0s40mU8H-oYhapGz_WLD-mb3a7ig@mail.gmail.com Whole thread Raw
In response to	Re: row filtering for logical replication (Peter Smith <smithpb2250@gmail.com>)
Responses	Re: row filtering for logical replication
List	pgsql-hackers

Tree view

On Fri, Dec 24, 2021 at 11:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> The current PG docs text for CREATE PUBLICATION (in the v54-0001
> patch) has a part that says
>
> +   A nullable column in the <literal>WHERE</literal> clause could cause the
> +   expression to evaluate to false; avoid using columns without not-null
> +   constraints in the <literal>WHERE</literal> clause.
>
> I felt that the caution to "avoid using" nullable columns is too
> strongly worded. AFAIK nullable columns will work perfectly fine so
> long as you take due care of them in the WHERE clause. In fact, it
> might be very useful sometimes to filter on nullable columns.
>
> Here is a small test example:
>
> // publisher
> test_pub=# create table t1 (id int primary key, msg text null);
> test_pub=# create publication p1 for table t1 where (msg != 'three');
> // subscriber
> test_sub=# create table t1 (id int primary key, msg text null);
> test_sub=# CREATE SUBSCRIPTION sub1 CONNECTION 'host=localhost
> dbname=test_pub application_name=sub1' PUBLICATION p1;
>
> // insert some data
> test_pub=# insert into t1 values (1, 'one'), (2, 'two'), (3, 'three'),
> (4, null), (5, 'five');
> test_pub=# select * from t1;
>  id |  msg
> ----+-------
>   1 | one
>   2 | two
>   3 | three
>   4 |
>   5 | five
> (5 rows)
>
> // data at sub
> test_sub=# select * from t1;
>  id | msg
> ----+------
>   1 | one
>   2 | two
>   5 | five
> (3 rows)
>
> Notice the row 4 with the NULL is also not replicated. But, perhaps we
> were expecting it to be replicated (because NULL is not 'three'). To
> do this, simply rewrite the WHERE clause to properly account for
> nulls.
>
> // truncate both sides
> test_pub=# truncate table t1;
> test_sub=# truncate table t1;
>
> // alter the WHERE clause
> test_pub=# alter publication p1 set table t1 where (msg is null or msg
> != 'three');
>
> // insert data at pub
> test_pub=# insert into t1 values (1, 'one'), (2, 'two'), (3, 'three'),
> (4, null), (5, 'five');
> INSERT 0 5
> test_pub=# select * from t1;
>  id |  msg
> ----+-------
>   1 | one
>   2 | two
>   3 | three
>   4 |
>   5 | five
> (5 rows)
>
> // data at sub (not it includes the row 4)
> test_sub=# select * from t1;
>  id | msg
> ----+------
>   1 | one
>   2 | two
>   4 |
>   5 | five
> (4 rows)
>
> ~~
>
> So, IMO the PG docs wording for this part should be relaxed a bit.
>
> e.g.
> BEFORE:
> +   A nullable column in the <literal>WHERE</literal> clause could cause the
> +   expression to evaluate to false; avoid using columns without not-null
> +   constraints in the <literal>WHERE</literal> clause.
> AFTER:
> +   A nullable column in the <literal>WHERE</literal> clause could cause the
> +   expression to evaluate to false. To avoid unexpected results, any possible
> +   null values should be accounted for.
>

Your suggested wording sounds reasonable to me. Euler, others, any thoughts?

-- 
With Regards,
Amit Kapila.

pgsql-hackers by date:

From: Amit Langote
Date: 25 December 2021, 03:36:00
Subject: generic plans and "initial" pruning

From: "Joel Jacobson"
Date: 25 December 2021, 11:02:29
Subject: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

Re: row filtering for logical replication - Mailing list pgsql-hackers

Previous

Next