Re: [HACKERS] make async slave to wait for lsn to be replayed - Mailing list pgsql-hackers
From | Anna Akenteva |
---|---|
Subject | Re: [HACKERS] make async slave to wait for lsn to be replayed |
Date | |
Msg-id | f4100c14985b1b7983aefd00cb6b491c@postgrespro.ru Whole thread Raw |
In response to | Re: [HACKERS] make async slave to wait for lsn to be replayed (Kartyshov Ivan <i.kartyshov@postgrespro.ru>) |
Responses |
Re: [HACKERS] make async slave to wait for lsn to be replayed
|
List | pgsql-hackers |
On 2020-03-21 14:16, Kartyshov Ivan wrote: > As it was discussed earlier, I added wait for statement into > begin/start statement. Thanks! To address the discussion: I like the idea of having WAIT as a part of BEGIN statement rather than a separate command, as Thomas Munro suggested. That way, the syntax itself enforces that WAIT FOR LSN will actually take effect, even for single-snapshot transactions. It seems more convenient for the user, who won't have to remember the details about how WAIT interacts with isolation levels. > BEGIN [ WORK | TRANSACTION ] [ transaction_mode[, ...] ] wait_for_event Not sure about this, but could we add "WAIT FOR .." as another transaction_mode rather than a separate thing? That way, user won't have to worry about the order. As of now, one should remember to always put WAIT FOR as the Last parameter in the BEGIN statement. > and event is: > LSN value [options] > TIMESTAMP value I would maybe remove WAIT FOR TIMESTAMP. As Robert Haas has pointed out, it seems a lot like pg_sleep_until(). Besides, it doesn't necessarily need to be connected to transaction start, which makes it different from WAIT FOR LSN - so I wouldn't mix them together. I had another look at the code: === In WaitShmemSize() you might want to use functions that check for overflow - add_size()/mul_size(). They're used in similar situations, for example in BTreeShmemSize(). === This is how WaitUtility() is called - note that time_val will always be > 0: + if (time_val <= 0) + time_val = 1; +... + res = WaitUtility(lsn, (int)(time_val * 1000), dest); (time_val * 1000) is passed to WaitUtility() as the delay argument. And inside WaitUtility() we have this: +if (delay > 0) + latch_events = WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH; +else + latch_events = WL_LATCH_SET | WL_POSTMASTER_DEATH; Since we always pass a delay value greater than 0, we'll never get to the "else" clause here and we'll never be ready to wait for LSN forever. Perhaps due to that, the current test outputs this after a simple WAIT FOR LSN command: psql:<stdin>:1: NOTICE: LSN is not reached. === Speaking of tests, When I tried to test BEGIN TRANSACTION WAIT FOR LSN, I got a segfault: LOG: statement: BEGIN TRANSACTION WAIT FOR LSN '0/3002808' LOG: server process (PID 10385) was terminated by signal 11: Segmentation fault DETAIL: Failed process was running: COMMIT Could you add some more tests to the patch when this is fixed? With WAIT as part of BEGIN statement + with things such as WAIT FOR ALL ... / WAIT FOR ANY ... / WAIT FOR LSN ... UNTIL TIMESTAMP ... === In WaitSetLatch() we should probably check backend for NULL before calling SetLatch(&backend->procLatch) We might also need to check wait statement for NULL in these two cases: + case T_TransactionStmt: + {... + result = transformWaitForStmt(pstate, (WaitStmt *) stmt->wait); case TRANS_STMT_START: {... + WaitStmt *waitstmt = (WaitStmt *) stmt->wait; + res = WaitMain(waitstmt, dest); === After we added the "wait" attribute to TransactionStmt struct, do we perhaps need to add something to _copyTransactionStmt() / _equalTransactionStmt()? -- Anna Akenteva Postgres Professional: The Russian Postgres Company http://www.postgrespro.com
pgsql-hackers by date: