less expensive pg_buffercache on big shmem - Mailing list pgsql-hackers
From | Ivan Kartyshov |
---|---|
Subject | less expensive pg_buffercache on big shmem |
Date | |
Msg-id | f9d6cab2-73a7-7a84-55a8-07dcb8516ae5@postgrespro.ru Whole thread Raw |
Responses |
Re: less expensive pg_buffercache on big shmem
|
List | pgsql-hackers |
Hi hackers, Recently I have finished my work on a patch for pg_buffercache contrib, I think it's time to share my results. Introduction ============ I want to offer you the implementation that allows to decrease system workload by partially sacrificing (fully snapshot consistency) data consistency. Sometimes we do not need full data consistency, for example on quantitative rather than qualitative analysis of memory contents, or when we want to catch insufficient memory resources or how often relation is used. Problem description =================== Currently, the pg_buffercache v1.1 and prior takes an exclusive lock on all shared buffers, which greatly affects system performance. Usually we use pg_buffercache to find out why DB is working slower than expected or examine what occupies the entire memory. So when we run pg_buffercache on such system, we make it even slower. Implementation ============== Vanilla implementation contains loop which collecting statistic from whole shared memory acquire, read and release Spinlocks one by one, page by page while holding LWLock. V1.2 implementation contains flexible loop which can collect shared memory statistic using three different methods: 1) with holding LWLock only on one partition of shared memory (semiconsistent method) 2) without LWLocks (nonconsistent method), 3) or in vanilia way (consistent method) The aforementioned allow us to launch pg_buffercache in the three different ways. Each of them have some advantages and some disadvantages: Consistent: + 100% consistency of shared memory snapshot - Slowdown the system with whole shared memory exclusive lock Semiconsistent: + Faster than consistent method + Mostly doesn`t affect on the system load - Speed of taking that snapshot is low Nonconsistent: The fastest + Doesn`t noticeably affects on the systems - <3% lost of snapshot consistency What works ========== Actually, it work well even on big load, but of course there might be things I've overlooked. VIEW pg_buffercache_cons VIEW pg_buffercache_noncons VIEW pg_buffercache_semicons Examples from docs in new realization: SELECT c.relname, count(*) AS buffers FROM pg_buffercache_noncons b INNER JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid) AND b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname = current_database())) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10; SELECT c.relname, count(*) AS buffers FROM pg_buffercache_semicons b INNER JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid) AND b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname = current_database())) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10; Testing the implementation ====================== How implementation tested: 1) Start server 2) Make pgbench tps pgbench -c 250 -s 1000 -T 200 -P1 3) Compare how tps sags under load if: SELECT count(*) FROM pg_buffercache_cons; SELECT count(*) FROM pg_buffercache_semicons; SELECT count(*) FROM pg_buffercache_noncons; This test was made on server (server parameters) Model name: Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz CPU(s): 144 Socket(s): 4 Shared_buffers: 200GB Results of testing ====================== Our DBA team obtained the following results: Nonconsistent: * 10% faster then consistent method * doesn`t noticeably affects on the systems * the maximum loss of accuracy was less then 3%* ( in most situation it is permissible accuracy loss ) Semiconsistent: * 5 time slower then nonconsistent * made less affects on system compared to consistent Overall results: Our clients was pleased with this implementation. Implementation is made with backward compatibility, as a conclusion old pg_buffercache v1.1 queries will work well. Semiconsistent show results approaching to nonconsistent on SELECTONLY queries. * this values were obtained from our DBA tests. What can be better =============== It is unclear how to optimize the semiconsistent method to make it faster, and reduce temporary effect that appears from time to time. I will be glad to see your feedback! --- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
pgsql-hackers by date: