optimizing repeated MVCC snapshots - Mailing list pgsql-hackers
From: Robert Haas
Subject: optimizing repeated MVCC snapshots
Msg-id: CA+Tgmoa-rAm3=LJ4iF+45A7-NWkLoJL8wsuYQXG=-dtzxK1y2A@mail.gmail.com
Responses: Re: optimizing repeated MVCC snapshots
List: pgsql-hackers
On Tue, Jan 3, 2012 at 2:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Another thought is that it should always be safe to reuse an old
>> snapshot if no transactions have committed or aborted since it was
>> taken
>
> Yeah, that might work better.  And it'd be a win for all MVCC snaps,
> not just the ones coming from promoted SnapshotNow ...

Here's a rough patch for that. Some benchmark results from a 32-core Itanium server are included below. They look pretty good: out of the dozen configurations I tested, all but one came out ahead with the patch. The loser was 80 clients on permanent tables, but I'm not too worried about that, since 80 clients on unlogged tables came out ahead.

This is not quite in a committable state yet, since it presumes that 64-bit reads and writes are atomic, and those aren't actually atomic everywhere. What I'm thinking we can do is this: on platforms where 8-byte reads and writes are known to be atomic, we do as the patch does currently. On any platform where we don't know that to be the case, we can move the test I added to GetSnapshotData inside the lock, which should still be a small win at low concurrencies. At high concurrencies it's a bit iffy, because making GetSnapshotData's critical section shorter might lead to lots of people trying to manipulate the ProcArrayLock spinlock in very rapid succession. Even if that turns out to be an issue, I'm inclined to believe that anyone who has enough concurrency for that to matter probably also has atomic 8-byte reads and writes, and so the most that will be needed is an update to our notion of which platforms have that capability. If that turns out to be wrong, the other obvious alternative is to not do the test at all unless it can be done unlocked.

To support the above, I'm inclined to add a new file src/include/atomic.h which optionally defines a macro called ATOMIC_64BIT_OPS and macros atomic_read_uint64(r) and atomic_write_uint64(l, r).
That way we can eventually support (a) architectures where 64-bit operations aren't atomic at all, (b) architectures where ordinary 64-bit operations are atomic (atomic_read_uint64(r) -> r, and atomic_write_uint64(l, r) -> l = r), and (c) architectures (like 32-bit x86) where ordinary 64-bit operations aren't atomic but special instructions (cmpxchg8b) can be used to get that behavior.

m = master, s = with patch. Scale factor 100, median of three 5-minute test runs. shared_buffers=8GB, checkpoint_segments=300, checkpoint_timeout=30min, effective_cache_size=340GB, wal_buffers=16MB, wal_writer_delay=20ms, listen_addresses='*', synchronous_commit=off. Binary modified with chatr +pd L +pi L and run with rtsched -s SCHED_NOAGE -p 178.

Permanent Tables
================
m01 tps = 912.865209 (including connections establishing)
s01 tps = 916.848536 (including connections establishing)
m08 tps = 6256.429549 (including connections establishing)
s08 tps = 6364.214425 (including connections establishing)
m16 tps = 10795.373683 (including connections establishing)
s16 tps = 11038.233359 (including connections establishing)
m24 tps = 13710.400042 (including connections establishing)
s24 tps = 13836.823580 (including connections establishing)
m32 tps = 14574.758456 (including connections establishing)
s32 tps = 15125.196227 (including connections establishing)
m80 tps = 12014.498814 (including connections establishing)
s80 tps = 11825.302643 (including connections establishing)

Unlogged Tables
===============
m01 tps = 942.950926 (including connections establishing)
s01 tps = 953.618574 (including connections establishing)
m08 tps = 6492.238255 (including connections establishing)
s08 tps = 6537.197731 (including connections establishing)
m16 tps = 11363.708861 (including connections establishing)
s16 tps = 11561.193527 (including connections establishing)
m24 tps = 14656.659546 (including connections establishing)
s24 tps = 14977.226426 (including connections establishing)
m32 tps = 16310.814143 (including connections establishing)
s32 tps = 16644.921538 (including connections establishing)
m80 tps = 13422.438927 (including connections establishing)
s80 tps = 13780.256723 (including connections establishing)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company