Avoiding repeated snapshot computation - Mailing list pgsql-hackers
| From | Pavan Deolasee |
|---|---|
| Subject | Avoiding repeated snapshot computation |
| Date | |
| Msg-id | CABOikdMsJ4OsxtA7XBV2quhKYUo_4105fJF4N+uyRoyBAzSuuQ@mail.gmail.com |
| Responses | Re: Avoiding repeated snapshot computation (×3) |
| List | pgsql-hackers |
In some recent benchmarks and profile data, I saw GetSnapshotData at or very near the top of the profile. With a smaller number of clients it can account for 10-20% of the time, but with more clients I have seen it take as much as 40% of the sample time. Unfortunately, the machine on which I was running these tests is currently not available, so I don't have the exact numbers, but the observation is broadly accurate.

Our recent work on separating the hot members of PGPROC into a separate array will definitely reduce data cache misses and reduce the GetSnapshotData time, but the function probably still accounts for a large enough critical section on a highly contended lock. Now that we have reduced the run time of the function itself, we should try to reduce the number of times the function is called. Robert proposed a way to reduce the number of calls per transaction; I think we can go one step further and reduce the number of calls across transactions.

One major problem today could be the way LWLock works. If the lock is currently held in SHARED mode by some backend and another backend requests it in SHARED mode, it gets it immediately. That's probably the right thing to do, because you don't want a reader to wait when the lock is readily available. But in the case of GetSnapshotData(), every reader is doing exactly the same thing: they are computing a snapshot from the same shared state and would compute exactly the same snapshot (ignoring the fact that we don't include the caller's XID in the xip array, but that's a minor detail). Because of the way LWLock works, more and more readers keep getting in to compute their snapshots, until the arrival rate drops enough that the waiters for exclusive access finally get a window to sneak in. To depict it, four transactions make overlapping calls to GetSnapshotData(), so the total critical section starts when the first caller enters it and ends when the last caller exits:

Txn1 ------[ SHARED ]---------------------
Txn2 --------[ SHARED ]-------------------
Txn3 -----------------[ SHARED ]-------------
Txn4 -------------------------------------------[ SHARED ]---------
|<---------------Total Time ------------------------------------>|

A couple of ideas come to mind to solve this issue.

A snapshot, once computed, remains valid for every caller, irrespective of its origin, until at least one transaction ends. So we can store the last computed snapshot in some shared area and reuse it for all subsequent GetSnapshotData calls. The shared snapshot gets invalidated when some transaction ends by calling ProcArrayEndTransaction(). I tried this approach and saw a 15% improvement for 32-80 clients on the 32-core HP IA box with pgbench -s 100 -N tests. Not bad, but I think this can be improved further.

What we can do is this: when a transaction comes to compute its snapshot, it checks whether some other transaction is already computing one. If so, it just sleeps on the lock. When the other process finishes computing the snapshot, it saves it in a shared area and wakes up all processes waiting for it. Those processes then simply copy the snapshot from the shared area and are done. This will not only reduce the total CPU consumption by avoiding repetitive work, but will also reduce the total time for which ProcArrayLock is held in SHARED mode by avoiding a pipeline of GetSnapshotData calls.
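To make the caching idea concrete, here is a minimal, self-contained C sketch. This is not PostgreSQL code: ProcArray, Snapshot, get_snapshot() and end_transaction() are invented stand-ins for the real ProcArray, SnapshotData, GetSnapshotData() and ProcArrayEndTransaction(), and a pthread rwlock stands in for ProcArrayLock. It only shows the control flow: reuse the cached snapshot while it is valid, let one caller recompute it when it is not, and invalidate it whenever a transaction ends.

```c
/*
 * Toy model of the "cache the last computed snapshot" idea.
 * All names here are hypothetical; this is a sketch, not a patch.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_BACKENDS 64

typedef struct Snapshot
{
    uint32_t xip[MAX_BACKENDS];   /* in-progress XIDs */
    int      xcnt;                /* number of entries in xip */
} Snapshot;

typedef struct ProcArray
{
    pthread_rwlock_t lock;        /* plays the role of ProcArrayLock */
    uint32_t xids[MAX_BACKENDS];  /* running transactions (0 = empty slot) */

    Snapshot cached;              /* last snapshot computed by anyone */
    bool     cached_valid;        /* cleared whenever a transaction ends */
} ProcArray;

static ProcArray procarray = { .lock = PTHREAD_RWLOCK_INITIALIZER };

/* Return a snapshot of all running transactions, reusing the cache if possible. */
static void
get_snapshot(Snapshot *snap)
{
    /* Fast path: someone already computed a snapshot since the last commit. */
    pthread_rwlock_rdlock(&procarray.lock);
    if (procarray.cached_valid)
    {
        *snap = procarray.cached;
        pthread_rwlock_unlock(&procarray.lock);
        return;
    }
    pthread_rwlock_unlock(&procarray.lock);

    /*
     * Slow path: take the lock exclusively so exactly one caller scans the
     * array and publishes the result; later arrivals just copy it.
     */
    pthread_rwlock_wrlock(&procarray.lock);
    if (!procarray.cached_valid)
    {
        int n = 0;
        for (int i = 0; i < MAX_BACKENDS; i++)
            if (procarray.xids[i] != 0)
                procarray.cached.xip[n++] = procarray.xids[i];
        procarray.cached.xcnt = n;
        procarray.cached_valid = true;
    }
    *snap = procarray.cached;
    pthread_rwlock_unlock(&procarray.lock);
}

/* Transaction end: clear our slot and invalidate the shared snapshot. */
static void
end_transaction(int slot)
{
    pthread_rwlock_wrlock(&procarray.lock);
    procarray.xids[slot] = 0;
    procarray.cached_valid = false;   /* next get_snapshot() recomputes */
    pthread_rwlock_unlock(&procarray.lock);
}

int
main(void)
{
    Snapshot s1, s2;

    /* Pretend two transactions are running. */
    procarray.xids[0] = 100;
    procarray.xids[1] = 101;

    get_snapshot(&s1);            /* computes and caches {100, 101} */
    get_snapshot(&s2);            /* reuses the cached copy */
    printf("first snapshot: %d xids, second: %d xids\n", s1.xcnt, s2.xcnt);

    end_transaction(1);           /* xid 101 ends, cache is invalidated */
    get_snapshot(&s1);            /* recomputed: {100} */
    printf("after commit: %d xids\n", s1.xcnt);
    return 0;
}
```

In the real proposal the cached snapshot would live in shared memory and be invalidated from ProcArrayEndTransaction(); the wait-and-copy variant described above would additionally wake the sleeping backends once the snapshot is published, rather than serialising them on an exclusive lock as this toy version does.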
I am currently trying the shared work queue mechanism to implement this, but I am sure we can do it in some other way too.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB
http://www.enterprisedb.com