Re: MultiXact\SLRU buffers configuration - Mailing list pgsql-hackers
From | Andrey M. Borodin |
---|---|
Subject | Re: MultiXact\SLRU buffers configuration |
Date | |
Msg-id | 3B099683-ECCD-43CD-A3D6-F08C3745002A@yandex-team.ru |
In response to | Re: MultiXact\SLRU buffers configuration (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses | Re: MultiXact\SLRU buffers configuration |
List | pgsql-hackers |
> On 14 May 2020, at 06:25, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Wed, 13 May 2020 23:08:37 +0500, "Andrey M. Borodin" <x4mmm@yandex-team.ru> wrote in
>>
>>> On 11 May 2020, at 16:17, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>>
>>> I've gone ahead and created 3 patches:
>>> 1. Configurable SLRU buffer sizes for MultiXactOffsets and MultiXactMembers
>>> 2. Reduce locking level to shared on read of MultiXactId members
>>> 3. Configurable cache size
>>
>> I'm looking more at MultiXact and it seems to me that we have a race condition there.
>>
>> When we create a new MultiXact we do:
>> 1. Generate new MultiXactId under MultiXactGenLock
>> 2. Record new mxid with members and offset to WAL
>> 3. Write offset to SLRU under MultiXactOffsetControlLock
>> 4. Write members to SLRU under MultiXactMemberControlLock
>
> But, don't we hold exclusive lock on the buffer through all the steps
> above?

Yes... unless the MultiXact is observed on a standby. This can lead to observing an inconsistent snapshot: one of the lockers has committed its tuple delete, but the standby still sees the tuple as alive.

>> When we read a MultiXact we do:
>> 1. Retrieve offset by mxid from SLRU under MultiXactOffsetControlLock
>> 2. If offset is 0 - it's not filled in at step 4 of the previous algorithm, we sleep and goto 1
>> 3. Retrieve members from SLRU under MultiXactMemberControlLock
>> 4. ..... what do we do if there are just zeroes because step 4 is not executed yet? Nothing, return an empty members list.
>
> So transactions never see such incomplete mxids, I believe.

I've observed the sleep in step 2, and I believe it's possible to observe the special effects of step 4 too.

Maybe we could add a lock on the standby to dismiss this 1000us wait? Sometimes it hits standbys hard: if someone is locking a whole table on the primary, all seq scans on the standbys follow it, contending on MultiXactOffsetControlLock.
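To make the read-side polling concrete, here is a minimal standalone sketch of step 2 of the read path (a simulation only, not the actual multixact.c code; all names and the 5ms writer delay are illustrative). A reader that finds offset == 0 assumes the creator has not reached step 3 yet, sleeps 1000us and retries -- the same wait that shows up as pg_usleep(1000) in the backtrace below:

/* compile with: cc -pthread sketch.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_uint mxact_offset;        /* 0 means "not filled in yet" */

/* plays the role of the creator: the offset becomes visible some time later */
static void *creator(void *arg)
{
    (void) arg;
    usleep(5000);                       /* WAL record written, SLRU not yet updated */
    atomic_store(&mxact_offset, 42);    /* step 3: offset is filled in */
    return NULL;
}

/* plays the role of the reader: step 2, poll until the offset is filled in */
static void *reader(void *arg)
{
    unsigned    offset;
    int         retries = 0;

    (void) arg;
    while ((offset = atomic_load(&mxact_offset)) == 0)
    {
        retries++;
        usleep(1000);                   /* the 1000us wait seen on standbys */
    }
    printf("got offset %u after %d retries\n", offset, retries);
    return NULL;
}

int main(void)
{
    pthread_t   c, r;

    pthread_create(&r, NULL, reader, NULL);
    pthread_create(&c, NULL, creator, NULL);
    pthread_join(r, NULL);
    pthread_join(c, NULL);
    return 0;
}

In the real server each retry also re-reads the offsets SLRU page under MultiXactOffsetControlLock, so when many standby backends poll the same not-yet-filled mxid, this is where the contention described above comes from.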
It looks like this:

#0  0x00007fcd56896ff7 in __GI___select (nfds=nfds@entry=0, readfds=readfds@entry=0x0, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffd83376fe0) at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x000056186e0d54bd in pg_usleep (microsec=microsec@entry=1000) at ./build/../src/port/pgsleep.c:56
#2  0x000056186dd5edf2 in GetMultiXactIdMembers (from_pgupgrade=0 '\000', onlyLock=<optimized out>, members=0x7ffd83377080, multi=3106214809) at ./build/../src/backend/access/transam/multixact.c:1370
#3  GetMultiXactIdMembers () at ./build/../src/backend/access/transam/multixact.c:1202
#4  0x000056186dd2d2d9 in MultiXactIdGetUpdateXid (xmax=<optimized out>, t_infomask=<optimized out>) at ./build/../src/backend/access/heap/heapam.c:7039
#5  0x000056186dd35098 in HeapTupleGetUpdateXid (tuple=tuple@entry=0x7fcba3b63d58) at ./build/../src/backend/access/heap/heapam.c:7080
#6  0x000056186e0cd0f8 in HeapTupleSatisfiesMVCC (htup=<optimized out>, snapshot=0x56186f44a058, buffer=230684) at ./build/../src/backend/utils/time/tqual.c:1091
#7  0x000056186dd2d922 in heapgetpage (scan=scan@entry=0x56186f4c8e78, page=page@entry=3620) at ./build/../src/backend/access/heap/heapam.c:439
#8  0x000056186dd2ea7c in heapgettup_pagemode (key=0x0, nkeys=0, dir=ForwardScanDirection, scan=0x56186f4c8e78) at ./build/../src/backend/access/heap/heapam.c:1034
#9  heap_getnext (scan=scan@entry=0x56186f4c8e78, direction=direction@entry=ForwardScanDirection) at ./build/../src/backend/access/heap/heapam.c:1801
#10 0x000056186de84f51 in SeqNext (node=node@entry=0x56186f4a4f78) at ./build/../src/backend/executor/nodeSeqscan.c:81
#11 0x000056186de6a3f1 in ExecScanFetch (recheckMtd=0x56186de84ef0 <SeqRecheck>, accessMtd=0x56186de84f20 <SeqNext>, node=0x56186f4a4f78) at ./build/../src/backend/executor/execScan.c:97
#12 ExecScan (node=0x56186f4a4f78, accessMtd=0x56186de84f20 <SeqNext>, recheckMtd=0x56186de84ef0 <SeqRecheck>) at ./build/../src/backend/executor/execScan.c:164

Best regards, Andrey Borodin.