Re: [HACKERS] snapbuild woes - Mailing list pgsql-hackers
From: Petr Jelinek
Subject: Re: [HACKERS] snapbuild woes
Date:
Msg-id: 7286ebe9-cb1b-cf1a-f85b-fbe3402ea0bf@2ndquadrant.com
In response to: Re: [HACKERS] snapbuild woes (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses: Re: [HACKERS] snapbuild woes; Re: [HACKERS] snapbuild woes
List: pgsql-hackers
On 13/12/16 00:38, Petr Jelinek wrote:
> On 12/12/16 23:33, Andres Freund wrote:
>> On 2016-12-12 23:27:30 +0100, Petr Jelinek wrote:
>>> On 12/12/16 22:42, Andres Freund wrote:
>>>> Hi,
>>>>
>>>> On 2016-12-10 23:10:19 +0100, Petr Jelinek wrote:
>>>>> Hi,
>>>>> First one is outright bug, which has to do with how we track running
>>>>> transactions. What snapbuild basically does while doing initial snapshot
>>>>> is read the xl_running_xacts record, store the list of running txes and
>>>>> then wait until they all finish. The problem with this is that
>>>>> xl_running_xacts does not ensure that it only logs transactions that are
>>>>> actually still running (to avoid locking PGPROC) so there might be xids
>>>>> in xl_running_xacts that already committed before it was logged.
>>>>
>>>> I don't think that's actually true? Notice how LogStandbySnapshot()
>>>> only releases the lock *after* the LogCurrentRunningXacts() iff
>>>> wal_level >= WAL_LEVEL_LOGICAL. So the explanation for the problem you
>>>> observed must actually be a bit more complex :(
>>>>
>>>
>>> Hmm, interesting, I did see the transaction commit in the WAL before the
>>> xl_running_xacts that contained the xid as running. I only seen it on
>>> production system though, didn't really manage to easily reproduce it
>>> locally.
>>
>> I suspect the reason for that is that RecordTransactionCommit() doesn't
>> conflict with ProcArrayLock in the first place - only
>> ProcArrayEndTransaction() does. So they're still running in the PGPROC
>> sense, just not the crash-recovery sense...
>>
>
> That looks like reasonable explanation. BTW I realized my patch needs
> bit more work, currently it will break the actual snapshot as it behaves
> same as if the xl_running_xacts was empty which is not correct AFAICS.
>

Hi,

I got to work on this again. Unfortunately I haven't found a solution that
I am very happy with. What I did: when we read an xl_running_xacts record
in which all the transactions we are tracking have already finished, we
restart tracking from that new xl_running_xacts, with the difference that
we prune its list of running transactions using the commits we have
already seen. That means that on a busy server we may have to wait for
multiple xl_running_xacts records rather than just one, but at least we
have a chance to finish, unlike the current coding, which effectively
waits for an empty xl_running_xacts. I also removed the additional locking
for logical wal_level in LogStandbySnapshot(), since it does not work.
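To sketch the pruning idea in C (illustrative only -- RunningTracker,
already_finished and restart_tracking are made-up names for this sketch,
not the actual snapbuild structures or the code in the attached patch):

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef struct RunningTracker
{
    TransactionId  *xip;            /* xids still considered running */
    int             xcnt;           /* number of tracked xids */
    TransactionId  *seen_committed; /* xids already seen finishing */
    int             seen_cnt;
} RunningTracker;

static bool
already_finished(const RunningTracker *t, TransactionId xid)
{
    for (int i = 0; i < t->seen_cnt; i++)
        if (t->seen_committed[i] == xid)
            return true;
    return false;
}

/*
 * Called when a new xl_running_xacts record is decoded after all
 * previously tracked xids have finished: take over its xid list, minus
 * anything we already saw finish, instead of waiting for an empty record.
 * Assumes t->xip has room for new_cnt entries.
 */
static void
restart_tracking(RunningTracker *t,
                 const TransactionId *new_xids, int new_cnt)
{
    int kept = 0;

    for (int i = 0; i < new_cnt; i++)
    {
        if (!already_finished(t, new_xids[i]))
            t->xip[kept++] = new_xids[i];
    }
    t->xcnt = kept;     /* consistent point reached once this drops to zero */
}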
I also identified another bug in snapbuild while looking at the code:
logical decoding will try to use an on-disk serialized snapshot for the
initial snapshot export when it can. The problem is that these snapshots
are quite special and are not really usable as snapshots for reading data
(e.g. the logical decoding snapshots regularly have xmax smaller than
xmin). So when a client tries to use such an exported snapshot it gets
completely wrong data, because the snapshot is broken. I think this
explains Erik Rijker's problems with the initial COPY patch for logical
replication; at least for me the issues go away when I disable the use of
the on-disk snapshots. I didn't really find a better solution than that,
though (disabling the use of on-disk snapshots for the initial consistent
snapshot).

So to summarize the attached patches:

0001 - Fixes a performance issue where we build tons of snapshots that we
don't need, which kills CPU.

0002 - Disables the use of on-disk historical snapshots for the initial
consistent snapshot export, as it may result in corrupt data. This
definitely needs a backport.

0003 - Fixes a bug where we might never reach a consistent snapshot on a
busy server, due to a race condition in xl_running_xacts logging. The
original use of extra locking does not seem to be enough in practice.
Once we have an agreed fix for this, it's probably worth backpatching.

There are still some comments that need updating; this is more of a PoC.

Thoughts or better ideas?

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers