Re: Avoiding repeated snapshot computation - Mailing list pgsql-hackers
| From | Andres Freund |
|---|---|
| Subject | Re: Avoiding repeated snapshot computation |
| Date | |
| Msg-id | 201111292155.52460.andres@anarazel.de |
| In response to | Re: Avoiding repeated snapshot computation (Pavan Deolasee <pavan.deolasee@gmail.com>) |
| List | pgsql-hackers |
On Tuesday, November 29, 2011 05:51:40 AM Pavan Deolasee wrote:
> On Tue, Nov 29, 2011 at 8:30 AM, Kevin Grittner
>
> <Kevin.Grittner@wicourts.gov> wrote:
> > Andres Freund wrote:
> >> I would like to see somebody running a benchmark on a machine with
> >> higher concurrency...
> >
> > Yeah, me too. I don't find it at all hard to believe that we will
> > see significant performance benefit by changing the PGPROC structure
> > so that all parts of it can be accessed through one cache line rather
> > than two. The fact that we saw improvement when changing it down to
> > two, even though there is extra indirection in some places, seems
> > like pretty solid evidence on the face of it.
>
> I think there is fundamental difference between the PGPROC patch that
> we did and the rearranging SnapshotData members that Andres has
> proposed. The PGPROC structure is a shared memory area, very heavily
> accessed by GetSnapshotData (and some others). There was a linear
> increase in the access as number of clients go up. The SnapshotData
> structure is mostly from a backend local memory (unless we do
> something what I suggested in the OP) and hence fitting that in a
> single cache line may or may not have much impact. We don't even
> guarantee that it starts at a cacheline which means in most cases it
> will still be spread on multiple cache lines.
Well, I could measure a ~1.3% benefit on a modestly concurrent machine (see
the earlier reformatted mail from Gurjeet), and the benefits were definitely
smaller without concurrency. But I agree - the benefits won't be even remotely
as big as those of the PGPROC splitup patch.

You're also right that the alignment isn't predictable enough.

Perhaps pg should introduce a force_cacheline_align macro which can easily be
implemented on most compilers? Especially for SnapshotData and friends, which
are allocated statically, that should be good enough.

Something along the lines of:
#if defined(__GNUC__)
#define force_align_to(alignment) __attribute__((aligned(alignment)))
#elif defined(_MSC_VER)
#define force_align_to(alignment) __declspec(align(alignment))
#else
/* no known way to force alignment; do nothing */
#define force_align_to(alignment)
#endif

#define CACHELINE_ALIGNMENT 64
#define force_cacheline_align force_align_to(CACHELINE_ALIGNMENT)
Looks complete enough for now.
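For illustration, usage on one of the statically allocated snapshots could look
roughly like this (sketch only; the variable name is illustrative, and the 64
byte line size above is an assumption, not something true on every platform):

/* sketch: pin a statically allocated snapshot to the start of a cache
 * line so its members cannot end up split across two lines */
static force_cacheline_align SnapshotData CurrentSnapshotData;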
Back to PGPROC:
Looking at the current PGPROC layout on x86-64 with the Linux ABI (generated
by pahole):
struct PGPROC {
    SHM_QUEUE                  links;                   /*     0    16 */
    PGSemaphoreData            sem;                     /*    16     8 */
    int                        waitStatus;              /*    24     4 */
    Latch                      procLatch;               /*    28    12 */
    LocalTransactionId         lxid;                    /*    40     4 */
    int                        pid;                     /*    44     4 */
    int                        pgprocno;                /*    48     4 */
    BackendId                  backendId;               /*    52     4 */
    Oid                        databaseId;              /*    56     4 */
    Oid                        roleId;                  /*    60     4 */
    /* --- cacheline 1 boundary (64 bytes) --- */
    bool                       recoveryConflictPending; /*    64     1 */
    bool                       lwWaiting;               /*    65     1 */
    bool                       lwExclusive;             /*    66     1 */

    /* XXX 5 bytes hole, try to pack */

    struct PGPROC *            lwWaitLink;              /*    72     8 */
    LOCK *                     waitLock;                /*    80     8 */
    PROCLOCK *                 waitProcLock;            /*    88     8 */
    LOCKMODE                   waitLockMode;            /*    96     4 */
    LOCKMASK                   heldLocks;               /*   100     4 */
    XLogRecPtr                 waitLSN;                 /*   104     8 */
    int                        syncRepState;            /*   112     4 */

    /* XXX 4 bytes hole, try to pack */

    SHM_QUEUE                  syncRepLinks;            /*   120    16 */
    /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
    SHM_QUEUE                  myProcLocks[16];         /*   136   256 */
    /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
    struct XidCache            subxids;                 /*   392   256 */
    /* --- cacheline 10 boundary (640 bytes) was 8 bytes ago --- */
    LWLockId                   backendLock;             /*   648     4 */

    /* XXX 4 bytes hole, try to pack */

    uint64                     fpLockBits;              /*   656     8 */
    Oid                        fpRelId[16];             /*   664    64 */
    /* --- cacheline 11 boundary (704 bytes) was 24 bytes ago --- */
    bool                       fpVXIDLock;              /*   728     1 */

    /* XXX 3 bytes hole, try to pack */

    LocalTransactionId         fpLocalTransactionId;    /*   732     4 */

    /* size: 736, cachelines: 12, members: 28 */
    /* sum members: 720, holes: 4, sum holes: 16 */
    /* last cacheline: 32 bytes */
};
It seems possible that reordering it to avoid sharing unrelated state on the
same cacheline might be beneficial. E.g. moving the fastpath lock stuff away
from the more heavily shared lwlock infrastructure might be a good idea.
Currently those lines will bounce around continually even though the fastpath
state is mostly accessed locally.
Subxids and syncRepLinks look like good candidates too.
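To sketch what I mean (illustrative fragment only, reusing the
force_cacheline_align macro from above and the usual PG typedefs; the field
grouping here is not a worked-out proposal):

struct PGPROC
{
    /* ... links, sem, pid, backendId, databaseId, ... */

    /* fields other backends write to while queueing for lwlocks; these
     * are what makes the containing line bounce between CPUs */
    bool                lwWaiting;
    bool                lwExclusive;
    struct PGPROC      *lwWaitLink;

    /* ... wait lock state, syncrep, subxids, ... */

    /* fastpath lock state is almost always touched only by the owning
     * backend, so start it on a cache line of its own */
    force_cacheline_align uint64 fpLockBits;
    Oid                 fpRelId[16];
    bool                fpVXIDLock;
    LocalTransactionId  fpLocalTransactionId;
};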
Anyway, sorry for the topic drift.
Andres