Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept - Mailing list pgsql-hackers
From | Ivan Kartyshov |
---|---|
Subject | Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept |
Date | |
Msg-id | 292210e99e1ca70742abfbf9e55d7805@postgrespro.ru Whole thread Raw |
In response to | Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept
Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept |
List | pgsql-hackers |
Thank you for your valuable comments. I've made a few adjustments. The main goal of my changes is to let long read-only transactions run on replica if hot_standby_feedback is turned on. Patch1 - hsfeedback_av_truncate.patch is made to stop ResolveRecoveryConflictWithLock occurs on replica, after autovacuum lazy truncates heap on master cutting some pages at the end. When hot_standby_feedback is on, we know that the autovacuum does not remove anything superfluous, which could be needed on standby, so there is no need to rise any ResolveRecoveryConflict*. 1) Add to xl_standby_locks and xl_smgr_truncate isautovacuum flag, which tells us that autovacuum generates them. 2) When autovacuum decides to trim the table (using lazy_truncate_heap), it takes AccessExclusiveLock and sends this lock to the replica, but replica should ignore AccessExclusiveLock if hot_standby_feedback=on. 3) When autovacuum truncate wal message is replayed on a replica, it takes ExclusiveLock on a table, so as not to interfere with read-only requests. We have two cases of resolving ResolveRecoveryConflictWithLock if timers (max_standby_streaming_delay and max_standby_archive_delay) have run out: backend is idle in transaction (waiting input) - in this case backend will be sent SIGTERM backend transaction is running query - in this case running transaction will be aborted How to test: Make async replica, turn on feedback and reduce max_standby_streaming_delay. Make autovacuum more aggressive. autovacuum = on autovacuum_max_workers = 1 autovacuum_naptime = 1s autovacuum_vacuum_threshold = 1 autovacuum_vacuum_cost_delay = 0 Test1: Here we will do a load on the master and simulation of a long transaction with repeated 1 second SEQSCANS on the replica (by calling pg_sleep 1 second duration every 6 seconds). MASTER REPLICA hot_standby = on max_standby_streaming_delay = 1s hot_standby_feedback = on start CREATE TABLE test AS (SELECT id, 1 AS value FROM generate_series(1,1) id); pgbench -T600 -P2 -n --file=master.sql postgres (update test set value = value;) start BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED; SELECT pg_sleep(value) FROM test; \watch 6 ---Autovacuum truncate pages at the end Result on replica: FATAL: terminating connection due to conflict with recovery DETAIL: User was holding a relation lock for too long. On Patched version lazy_vacuum_truncation passed without fatal errors. Only some times Error occurs because this tests is too synthetic ERROR: canceling statement due to conflict with recovery DETAIL: User was holding shared buffer pin for too long. Because of rising ResolveRecoveryConflictWithSnapshot while redo some visibility flags to avoid this conflict we can do test2 or increase max_standby_streaming_delay. Test2: Here we will do a load on the master and simulation of a long transaction on the replica (by taking LOCK on table) MASTER REPLICA hot_standby = on max_standby_streaming_delay = 1s hot_standby_feedback = on start CREATE TABLE test AS (SELECT id, 1 AS value FROM generate_series(1,1) id); pgbench -T600 -P2 -n --file=master.sql postgres (update test set value = value;) start BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED; LOCK TABLE test IN ACCESS SHARE MODE; select * from test; \watch 6 ---Autovacuum truncate pages at the end Result on replica: FATAL: terminating connection due to conflict with recovery DETAIL: User was holding a relation lock for too long. On Patched version lazy_vacuum_truncation passed without fatal errors. Test3: Here we do a load on the master and simulation of a long transaction with repeated 1 second SEQSCANS on the replica (by calling pg_sleep 1 second duration every 6 seconds). MASTER REPLICA hot_standby = on max_standby_streaming_delay = 4s hot_standby_feedback = on start CREATE TABLE test AS (SELECT id, 200 AS value FROM generate_series(1,1) id); pgbench -T600 -P2 -n --file=master.sql postgres (update test set value = value;) start BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED; SELECT pg_sleep(value) FROM test; ---Autovacuum truncate pages at the end Result on replica: FATAL: terminating connection due to conflict with recovery DETAIL: User was holding a relation lock for too long. On Patched version lazy_vacuum_truncation passed without fatal errors. This way we can make transactions with SEQSCAN, INDEXSCAN or BITMAPSCAN Patch2 - hsfeedback_noninvalide_xmin.patch When walsender is initialized, its xmin in PROCARRAY is set to GetOldestXmin() in order to prevent autovacuum running on master from truncating relation and removing some pages that are required by replica. This might happen if master's autovacuum and replica's query started simultaneously. And the replica has not yet reported its xmin value. How to test: Make async replica, turn on feedback, reduce max_standby_streaming_delay and aggressive autovacuum. autovacuum = on autovacuum_max_workers = 1 autovacuum_naptime = 1s autovacuum_vacuum_threshold = 1 autovacuum_vacuum_cost_delay = 0 Test: Here we will start replica and begi repeatable read transaction on table, then we stop replicas postmaster to prevent starting walreceiver worker (on master startup) and sending master it`s transaction xmin over hot_standby_feedback message. MASTER REPLICA start CREATE TABLE test AS (SELECT id, 1 AS value FROM generate_series(1,10000000) id); stop start BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; SELECT * FROM test; stop postmaster with gdb start DELETE FROM test WHERE id > 0; wait till autovacuum delete and changed xmin release postmaster with gdb --- Result on replica FATAL: terminating connection due to conflict with recovery DETAIL: User query might have needed to see row versions that must be removed. There is one feature of the behavior of standby, which let us to allow the autovacuum to cut off the page table (at the end of relation) that no one else needs (because there is only dead and removed tuples). So if the standby SEQSCAN or another *SCAN mdread a page that is damaged or has been deleted, it will receive a zero page, and not break the request for ERROR. Could you give me your ideas over these patches. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
pgsql-hackers by date: