RE: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers
From | Hayato Kuroda (Fujitsu) |
---|---|
Subject | RE: logical decoding and replication of sequences, take 2 |
Date | |
Msg-id | TY3PR01MB98899FA0082AE9918886E150F58CA@TY3PR01MB9889.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: logical decoding and replication of sequences, take 2 (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
Dear hackers, > It is correct that we can make a wrong decision about whether a change > is transactional or non-transactional when sequence DDL happens before > the SNAPBUILD_FULL_SNAPSHOT state and the sequence operation happens > after that state. I found a workload which decoder distinguish wrongly. # Prerequisite Apply an attached patch for inspecting the sequence status. It can be applied atop v20231203 patch set. Also, a table and a sequence must be defined: ``` CREATE TABLE foo (var int); CREATE SEQUENCE s; ``` # Workload Then, you can execute concurrent transactions from three clients like below: Client-1 BEGIN; INSERT INTO foo VALUES (1); Client-2 SELECT pg_create_logical_replication_slot('slot', 'test_decoding'); Client-3 BEGIN; ALTER SEQUENCE s MAXVALUE 5000; COMMIT; SAVEPOINT s1; SELECT setval('s', 2000); ROLLBACK; SELECT pg_logical_slot_get_changes('slot', 'test_decoding'); # Result and analysis At first, below lines would be output on the log. This meant that WAL records for ALTER SEQUENCE were decoded but skipped because the snapshot had been building. ``` ... LOG: logical decoding found initial starting point at 0/154D238 DETAIL: Waiting for transactions (approximately 1) older than 741 to end. STATEMENT: SELECT * FROM pg_create_logical_replication_slot('slot', 'test_decoding'); LOG: XXX: smgr_decode. snapshot is SNAPBUILD_BUILDING_SNAPSHOT STATEMENT: SELECT * FROM pg_create_logical_replication_slot('slot', 'test_decoding'); LOG: XXX: skipped STATEMENT: SELECT * FROM pg_create_logical_replication_slot('slot', 'test_decoding'); LOG: XXX: seq_decode. snapshot is SNAPBUILD_BUILDING_SNAPSHOT STATEMENT: SELECT * FROM pg_create_logical_replication_slot('slot', 'test_decoding'); LOG: XXX: skipped ... ``` Note that above `seq_decode...` line was not output via `setval()`, it was done by ALTER SEQUENCE statement. Below is a call stack for inserting WAL. ``` XLogInsert(RM_SEQ_ID, XLOG_SEQ_LOG); fill_seq_fork_with_data fill_seq_with_data AlterSequence ``` Then, subsequent lines would say like them. This means that the snapshot becomes FULL and `setval()` is regarded non-transactional wrongly. ``` LOG: logical decoding found initial consistent point at 0/154D658 DETAIL: Waiting for transactions (approximately 1) older than 742 to end. STATEMENT: SELECT * FROM pg_create_logical_replication_slot('slot', 'test_decoding'); LOG: XXX: seq_decode. snapshot is SNAPBUILD_FULL_SNAPSHOT STATEMENT: SELECT * FROM pg_create_logical_replication_slot('slot', 'test_decoding'); LOG: XXX: the sequence is non-transactional STATEMENT: SELECT * FROM pg_create_logical_replication_slot('slot', 'test_decoding'); LOG: XXX: not consistent: skipped ``` The change would be discarded because the snapshot has not been CONSISTENT yet by the below part. If it has been transactional, we would have queued this change though the transaction will be skipped at commit. ``` else if (!transactional && (SnapBuildCurrentState(builder) != SNAPBUILD_CONSISTENT || SnapBuildXactNeedsSkip(builder, buf->origptr))) return; ``` But anyway, we could find a case which we can make a wrong decision. This example is lucky - does not output wrongly, but I'm not sure all the case like that. Best Regards, Hayato Kuroda FUJITSU LIMITED
Attachment
pgsql-hackers by date: