WIP patch for hint bit i/o mitigation - Mailing list pgsql-hackers
From | Merlin Moncure |
---|---|
Subject | WIP patch for hint bit i/o mitigation |
Date | |
Msg-id | CAHyXU0zn5emePLedoZUGrAQiF92F-YjvFr-P5vUh6n0WpKZ6PQ@mail.gmail.com Whole thread Raw |
Responses |
Re: WIP patch for hint bit i/o mitigation
Re: WIP patch for hint bit i/o mitigation |
List | pgsql-hackers |
Following the sig is a first cut at a patch (written by Atri) that attempts to mitigate hint bit i/o penalty when many pages worth of tuples are sequentially written out with the same transaction id. There have been other attempts to deal with this problem that fit niche cases (especially those that create the table in the same transaction as the one inserting) that work but don't really solve the problem generally. I previously attacked this problem ([1], [2]) and came up with a patch that cached hint bits inside tqual.c. The patch was pulled for a few reasons: 1) a lot of complexity without proper justification 2) sketchy cache replacement algorithm 3) I manged to misspell 'committed' just about everywhere 4) invalidation? Issues 1-3 could have been worked out but #4 was making me think the problem was a nonstarter, or at least, 'too much too soon'. The tuple visibility routines are in a very tight code path and having to deal with various things in the backend that could cause the xid to become stale were making me nervous. A smaller, simpler patch might be the ticket. What's happening here is that we're putting in 'last xid' cache in the same manner and style as in transam.c. Unlike transam.c though, we're caching the infomask bit only and only if we're safe per XLogNeedsFlush(). While useful in its own right, the transam.c 'last xid' cache is not sufficient because: 1) While it does guard against clog fetch, it does nothing to mitigate hint bit penalties which are just as bad if not worse since you get a page write+page header lock during a broad scan 2) In the very tight loops that crank HeapTupleSatisfiesMVCC, removing the minimal non-clog based processing (TransactionIdIsInProgress, etc) helps a little bit in cpu terms when scanning the table for the very first time. So given that -- the patch simple adds an extra check when/where hint bit status is checked in the visibility routines (currently, only HeapTupleSatisfiesMVCC is done but all the applicable visibility routines should be done). Basically, the way it works is like this: *) is hint bit set? *) if not? does the examined xid match the last examined one? *) if so, and the cached hint bit matches the one want, proceeed as if hint bit was set Basically, this adds two tests to visibility check in the worst case (zero if hint bit was set, one if xid does not match). Pretty cheap. If you fault through the cached xid and dirty the page, then you're performance shouldn't be worse than the status quo. Also, dealing with 'last xid' is immune from invalidation issues [3] which is a nice property. The most logic place to *set* the cached xid/bit seemed to be in SetHintBits(). Possible improvements might be to: *) set the hint bit if the page is dirty and/or *) set the hint bit but do not dirty the page upon cache (although this might have bad interplay with 'all hint bits must be WAL logged' page checksum proposal) Atri benched the patch and came up with the following: atri@atri-Aspire-4740:~/postgresql-9.2.1/contrib/pgbench$ ./pgbench atri -n -f bench.sql -c 8 -T 120 transaction type: Custom query number of transactions actually processed: 1433 tps = 11.900653 (including connections establishing) tps = 11.903127 (excluding connections establishing) atri@atri-Aspire-4740:~/postgresql-9.2.1/contrib/pgbench$ ./pgbench atri -n -f bench.sql -c 8 -T 120 transaction type: Custom query number of transactions actually processed: 1769 tps = 14.669422 (including connections establishing) tps = 14.673167 (excluding connections establishing) bench.sql: create temp table foo as select v from generate_series(1,100000) v; select * from foo; drop table foo; merlin notes: [1] http://archives.postgresql.org/pgsql-hackers/2011-03/msg01765.php [2] http://archives.postgresql.org/message-id/BANLkTinr1h1+rKX1UOukn-rAONBTEHQvew@mail.gmail.com) [3] http://postgresql.1045698.n5.nabble.com/Is-cachedFetchXidStatus-provably-valid-td5712454.html patch: *** tqual1.c 2012-09-20 03:17:58.000000000 +0530 --- tqual.c 2012-11-06 00:50:39.769409077 +0530 *************** *** 72,81 **** SnapshotData SnapshotAnyData = {HeapTupleSatisfiesAny}; SnapshotData SnapshotToastData = {HeapTupleSatisfiesToast}; /* local functions */ static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot); - /* * SetHintBits() * --- 72,83 ---- SnapshotData SnapshotAnyData = {HeapTupleSatisfiesAny}; SnapshotData SnapshotToastData = {HeapTupleSatisfiesToast}; + static TransactionId cachedVisibilityXid; + static uint16 cachedVisibilityXidStatus; + /* local functions */ static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot); /* * SetHintBits() * *************** *** 117,122 **** --- 119,127 ---- if (XLogNeedsFlush(commitLSN) && BufferIsPermanent(buffer)) return; /* not flushed yet,so don't set hint */ + + cachedVisibilityXid = xid; + cachedVisibilityXidStatus = infomask; } tuple->t_infomask |= infomask; *************** *** 904,910 **** HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer) { ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)) { if (tuple->t_infomask & HEAP_XMIN_INVALID) return false; --- 909,917 ---- HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer) { ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED) ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple) ! && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED)) { if (tuple->t_infomask & HEAP_XMIN_INVALID) return false; *************** *** 1008,1014 **** return true; } ! if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED)) { if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple))) { --- 1015,1023 ---- return true; } ! if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED) ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmax(tuple) ! && cachedVisibilityXidStatus == HEAP_XMAX_COMMITTED)) { if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple))) {
pgsql-hackers by date: