Re: row filtering for logical replication - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: row filtering for logical replication |
Date | |
Msg-id | CAA4eK1KMmWhRUw-reLxnBw0s40mU8H-oYhapGz_WLD-mb3a7ig@mail.gmail.com Whole thread Raw |
In response to | Re: row filtering for logical replication (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: row filtering for logical replication
|
List | pgsql-hackers |
On Fri, Dec 24, 2021 at 11:04 AM Peter Smith <smithpb2250@gmail.com> wrote: > > The current PG docs text for CREATE PUBLICATION (in the v54-0001 > patch) has a part that says > > + A nullable column in the <literal>WHERE</literal> clause could cause the > + expression to evaluate to false; avoid using columns without not-null > + constraints in the <literal>WHERE</literal> clause. > > I felt that the caution to "avoid using" nullable columns is too > strongly worded. AFAIK nullable columns will work perfectly fine so > long as you take due care of them in the WHERE clause. In fact, it > might be very useful sometimes to filter on nullable columns. > > Here is a small test example: > > // publisher > test_pub=# create table t1 (id int primary key, msg text null); > test_pub=# create publication p1 for table t1 where (msg != 'three'); > // subscriber > test_sub=# create table t1 (id int primary key, msg text null); > test_sub=# CREATE SUBSCRIPTION sub1 CONNECTION 'host=localhost > dbname=test_pub application_name=sub1' PUBLICATION p1; > > // insert some data > test_pub=# insert into t1 values (1, 'one'), (2, 'two'), (3, 'three'), > (4, null), (5, 'five'); > test_pub=# select * from t1; > id | msg > ----+------- > 1 | one > 2 | two > 3 | three > 4 | > 5 | five > (5 rows) > > // data at sub > test_sub=# select * from t1; > id | msg > ----+------ > 1 | one > 2 | two > 5 | five > (3 rows) > > Notice the row 4 with the NULL is also not replicated. But, perhaps we > were expecting it to be replicated (because NULL is not 'three'). To > do this, simply rewrite the WHERE clause to properly account for > nulls. > > // truncate both sides > test_pub=# truncate table t1; > test_sub=# truncate table t1; > > // alter the WHERE clause > test_pub=# alter publication p1 set table t1 where (msg is null or msg > != 'three'); > > // insert data at pub > test_pub=# insert into t1 values (1, 'one'), (2, 'two'), (3, 'three'), > (4, null), (5, 'five'); > INSERT 0 5 > test_pub=# select * from t1; > id | msg > ----+------- > 1 | one > 2 | two > 3 | three > 4 | > 5 | five > (5 rows) > > // data at sub (not it includes the row 4) > test_sub=# select * from t1; > id | msg > ----+------ > 1 | one > 2 | two > 4 | > 5 | five > (4 rows) > > ~~ > > So, IMO the PG docs wording for this part should be relaxed a bit. > > e.g. > BEFORE: > + A nullable column in the <literal>WHERE</literal> clause could cause the > + expression to evaluate to false; avoid using columns without not-null > + constraints in the <literal>WHERE</literal> clause. > AFTER: > + A nullable column in the <literal>WHERE</literal> clause could cause the > + expression to evaluate to false. To avoid unexpected results, any possible > + null values should be accounted for. > Your suggested wording sounds reasonable to me. Euler, others, any thoughts? -- With Regards, Amit Kapila.
pgsql-hackers by date: