RE: row filtering for logical replication - Mailing list pgsql-hackers
| From | houzj.fnst@fujitsu.com |
|---|---|
| Subject | RE: row filtering for logical replication |
| Date | |
| Msg-id | OS0PR01MB571618736E7E79309A723BBE94E99@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
| In response to | Re: row filtering for logical replication (Amit Kapila <amit.kapila16@gmail.com>) |
| Responses |
Re: row filtering for logical replication
|
| List | pgsql-hackers |
On July 23, 2021 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Jul 23, 2021 at 2:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
> >
> > The column comparison for row filtering happens before the unchanged
> > toast columns are filtered. Unchanged toast columns are filtered just
> > before writing the tuple to output stream.
> >
>
> To perform filtering, you need to use the tuple from WAL and that tuple doesn't
> seem to have unchanged toast values, so how can we do filtering? I think it is a
> good idea to test this once.
I agreed.
Currently, both unchanged toasted key column and unchanged toasted non-key
column is not logged. So, we cannot get the toasted value directly for these
columns when doing row filtering.
I tested the current patch for toasted data and found a problem: In the current
patch, it will try to fetch the toast data from toast table when doing row
filtering[1]. But, it's unsafe to do that in walsender. We can see it use
HISTORIC snapshot in heap_fetch_toast_slice() and also the comments of
init_toast_snapshot() have said "Detoasting *must* happen in the same
transaction that originally fetched the toast pointer.". The toast data could
have been changed when doing row filtering. For exmaple, I tested the following
steps and get an error.
1) UPDATE a nonkey column in publisher.
2) Use debugger to block the walsender process in function
pgoutput_row_filter_exec_expr().
3) Open another psql to connect the publisher, and drop the table which updated
in 1).
4) Unblock the debugger in 2), and then I can see the following error:
---
ERROR: could not read block 0 in file "base/13675/16391"
---
[1]
(1)------publisher------
CREATE TABLE toasted_key (
id serial,
toasted_key text PRIMARY KEY,
toasted_col1 text,
toasted_col2 text
);
select repeat('9999999999', 200) as tvalue \gset
CREATE PUBLICATION pub FOR TABLE toasted_key WHERE (toasted_col2 = :'tvalue');
ALTER TABLE toasted_key REPLICA IDENTITY USING INDEX toasted_key_pkey;
ALTER TABLE toasted_key ALTER COLUMN toasted_key SET STORAGE EXTERNAL;
ALTER TABLE toasted_key ALTER COLUMN toasted_col1 SET STORAGE EXTERNAL;
ALTER TABLE toasted_key ALTER COLUMN toasted_col2 SET STORAGE EXTERNAL;
INSERT INTO toasted_key(toasted_key, toasted_col1, toasted_col2) VALUES(repeat('1234567890', 200), repeat('9876543210',
200),repeat('9999999999', 200));
(2)------subscriber------
CREATE TABLE toasted_key (
id serial,
toasted_key text PRIMARY KEY,
toasted_col1 text,
toasted_col2 text
);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=10000' PUBLICATION pub;
(3)------publisher------
UPDATE toasted_key SET toasted_col1 = repeat('1111113113', 200);
Based on the above steps, the row filter will ge through the following path
and fetch toast data in walsender.
------
pgoutput_row_filter_exec_expr
...
texteq
...
text *targ1 = DatumGetTextPP(arg1);
pg_detoast_datum_packed
detoast_attr
------
pgsql-hackers by date: