Thread: BUG #10533: 9.4 beta1 assertion failure in autovacuum process
The following bug has been logged on the website:

Bug reference:      10533
Logged by:          David Leverton
Email address:      levertond@googlemail.com
PostgreSQL version: 9.4beta1
Operating system:   RHEL 5 x86_64
Description:

Our application's test suite triggers an assertion failure in an autovacuum
process under 9.4 beta1. I wasn't able to reduce it to a nice test case, but
I hope the backtrace illustrates the problem:

#0 0x00000032bae30265 in raise () from /lib64/libc.so.6
#1 0x00000032bae31d10 in abort () from /lib64/libc.so.6
#2 0x000000000078b69d in ExceptionalCondition (conditionName=<value optimized out>, errorType=<value optimized out>, fileName=<value optimized out>, lineNumber=<value optimized out>) at assert.c:54
#3 0x00000000007ad6e2 in palloc (size=16) at mcxt.c:670
#4 0x00000000004d3592 in GetMultiXactIdMembers (multi=75092, members=0x7fff915f9468, allow_old=0 '\000') at multixact.c:1242
#5 0x0000000000495c9c in MultiXactIdGetUpdateXid (xmax=17061, t_infomask=<value optimized out>) at heapam.c:6059
#6 0x00000000007ba93c in HeapTupleHeaderIsOnlyLocked (tuple=0x42a5) at tqual.c:1539
#7 0x00000000007baf2c in HeapTupleSatisfiesVacuum (htup=<value optimized out>, OldestXmin=67407, buffer=347) at tqual.c:1174
#8 0x00000000005a96eb in heap_page_is_all_visible (onerel=0x2b1b020f3f58, blkno=86, buffer=347, tupindex=339, vacrelstats=0x1cfe3148, vmbuffer=0x7fff915fa65c) at vacuumlazy.c:1788
#9 lazy_vacuum_page (onerel=0x2b1b020f3f58, blkno=86, buffer=347, tupindex=339, vacrelstats=0x1cfe3148, vmbuffer=0x7fff915fa65c) at vacuumlazy.c:1220
#10 0x00000000005a9acb in lazy_vacuum_heap (onerel=0x2b1b020f3f58, vacrelstats=0x1cfe3148) at vacuumlazy.c:1152
#11 0x00000000005ab393 in lazy_scan_heap (onerel=0x2b1b020f3f58, vacstmt=<value optimized out>, bstrategy=<value optimized out>) at vacuumlazy.c:1080
#12 lazy_vacuum_rel (onerel=0x2b1b020f3f58, vacstmt=<value optimized out>, bstrategy=<value optimized out>) at vacuumlazy.c:239
#13 0x00000000005a8b35 in vacuum_rel (relid=18303, vacstmt=0x7fff915fae20, do_toast=<value optimized out>, for_wraparound=0 '\000') at vacuum.c:1202
#14 0x00000000005a8f89 in vacuum (vacstmt=0x7fff915fae20, relid=<value optimized out>, do_toast=0 '\000', bstrategy=<value optimized out>, for_wraparound=0 '\000', isTopLevel=<value optimized out>) at vacuum.c:234
#15 0x0000000000646d14 in do_autovacuum () at autovacuum.c:2796
#16 0x00000000006475bb in AutoVacWorkerMain (argc=<value optimized out>, argv=<value optimized out>) at autovacuum.c:1678
#17 0x0000000000647669 in StartAutoVacWorker () at autovacuum.c:1463
#18 0x0000000000655741 in StartAutovacuumWorker (postgres_signal_arg=<value optimized out>) at postmaster.c:5188
#19 sigusr1_handler (postgres_signal_arg=<value optimized out>) at postmaster.c:4842
#20 <signal handler called>
#21 0x00000032baecd323 in __select_nocancel () from /lib64/libc.so.6
#22 0x000000000065286f in ServerLoop () at postmaster.c:1530
#23 0x00000000006566e2 in PostmasterMain (argc=3, argv=0x1cf79b80) at postmaster.c:1223
#24 0x00000000005e6f4c in main (argc=3, argv=0x1a) at main.c:225

I did look at recent commits in the area, and it seems possible that 621a99a
might have fixed the problem, if I'm understanding things correctly (it seems
it now only calls GetMultiXactIdMembers when it's looking at a row inserted
by the current transaction, which presumably wouldn't happen for a vacuum).
But I'm not in a convenient situation to test a git build at the moment, and
even if I did and it turned out not to crash anymore, I wouldn't be certain
that it was fixed rather than just masked under those particular
circumstances. Apologies if this report turns out to be noise.

Postgres was installed using the RPMs from http://yum.pgrpms.org/, with no
interesting configuration changes (only listen_addresses and pg_hba.conf).
On 06/05/2014 09:01 PM, levertond@googlemail.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      10533
> Logged by:          David Leverton
> Email address:      levertond@googlemail.com
> PostgreSQL version: 9.4beta1
> Operating system:   RHEL 5 x86_64
> Description:
>
> Our application's test suite triggers an assertion failure in an autovacuum
> process under 9.4 beta1. I wasn't able to reduce it to a nice test case,
> but I hope the backtrace illustrates the problem:

Yes, it does, thanks for the report!

> #0 0x00000032bae30265 in raise () from /lib64/libc.so.6
> #1 0x00000032bae31d10 in abort () from /lib64/libc.so.6
> #2 0x000000000078b69d in ExceptionalCondition (conditionName=<value optimized out>, errorType=<value optimized out>, fileName=<value optimized out>, lineNumber=<value optimized out>) at assert.c:54
> #3 0x00000000007ad6e2 in palloc (size=16) at mcxt.c:670
> #4 0x00000000004d3592 in GetMultiXactIdMembers (multi=75092, members=0x7fff915f9468, allow_old=0 '\000') at multixact.c:1242
> #5 0x0000000000495c9c in MultiXactIdGetUpdateXid (xmax=17061, t_infomask=<value optimized out>) at heapam.c:6059
> #6 0x00000000007ba93c in HeapTupleHeaderIsOnlyLocked (tuple=0x42a5) at tqual.c:1539
> #7 0x00000000007baf2c in HeapTupleSatisfiesVacuum (htup=<value optimized out>, OldestXmin=67407, buffer=347) at tqual.c:1174
> #8 0x00000000005a96eb in heap_page_is_all_visible (onerel=0x2b1b020f3f58, blkno=86, buffer=347, tupindex=339, vacrelstats=0x1cfe3148, vmbuffer=0x7fff915fa65c) at vacuumlazy.c:1788
> #9 lazy_vacuum_page (onerel=0x2b1b020f3f58, blkno=86, buffer=347, tupindex=339, vacrelstats=0x1cfe3148, vmbuffer=0x7fff915fa65c) at vacuumlazy.c:1220
> ...

MultiXactIdGetUpdateXid() calls GetMultiXactIdMembers(), which can fail if
you run out of memory. That's not cool if you're in a critical section, as
the error will be promoted to PANIC; the assertion checks that you don't
call palloc() while in a critical section, to catch that kind of problem
early. The potential for a problem is there in 9.3 as well, but the
assertion was only added to 9.4 fairly recently. That function requires
very little memory, so it's highly unlikely to fail with OOM in practice,
but in theory it could.

I think we'll need a variant of GetMultiXactIdMembers() that only returns
the update XID, avoiding the palloc(). The straight-forward fix would be to
copy-paste contents of GetMultiXactIdMembers() into
MultiXactIdGetUpdateXid(), and instead of returning the members in an
array, only return the update-xid. But it's a long and complicated
function, so copy-pasting is not a good option. I think it needs to be
refactored into some kind of a helper function that both
MultiXactIdGetUpdateXid() and GetMultiXactIdMembers() could call.

- Heikki
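The refactoring Heikki describes could look roughly like the standalone
sketch below (plain C, with malloc() standing in for palloc(); the names,
the fake member data, and the fixed-size stack buffer are illustrative
assumptions, not the actual multixact.c code). A shared lookup routine
fills a caller-provided buffer, the existing allocating API becomes a thin
wrapper around it, and an update-XID-only caller can use a stack array and
so never allocate inside a critical section.

    /* Standalone illustration only, not PostgreSQL source. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef unsigned int TransactionId;

    typedef struct Member
    {
        TransactionId xid;
        int           is_update;   /* stand-in for MultiXactStatus */
    } Member;

    /*
     * Core lookup: writes up to 'cap' members into 'out' and returns the
     * member count.  The real code would read the multixact state here.
     */
    static int
    fetch_members(unsigned int multi, Member *out, int cap)
    {
        Member fake[2] = {{100 + multi, 0}, {200 + multi, 1}};
        int    n = 2;
        int    i;

        for (i = 0; i < n && i < cap; i++)
            out[i] = fake[i];
        return n;
    }

    /* Allocating wrapper, analogous to GetMultiXactIdMembers(). */
    static int
    get_members_alloc(unsigned int multi, Member **members)
    {
        int cap = 2;            /* real code learns this from the SLRU */

        *members = malloc(sizeof(Member) * cap);
        if (*members == NULL)
            return -1;
        return fetch_members(multi, *members, cap);
    }

    /* Allocation-free wrapper, analogous to an update-XID-only variant. */
    static TransactionId
    get_update_xid(unsigned int multi)
    {
        Member buf[64];         /* small stack buffer, no allocation */
        int    n = fetch_members(multi, buf, 64);
        int    i;

        for (i = 0; i < n; i++)
            if (buf[i].is_update)
                return buf[i].xid;
        return 0;
    }

    int
    main(void)
    {
        Member *m;
        int     n = get_members_alloc(7, &m);

        printf("members: %d, update xid: %u\n", n, get_update_xid(7));
        free(m);
        return 0;
    }

In the real code the member count comes from the multixact state rather
than being hard-coded, and as the rest of the thread shows, the fix that
was eventually pushed takes a different route and shrinks the critical
section instead.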
Hi,

On 2014-06-05 21:42:23 +0300, Heikki Linnakangas wrote:
> I think we'll need a variant of GetMultiXactIdMembers() that only returns
> the update XID, avoiding the palloc(). The straight-forward fix would be to
> copy-paste contents of GetMultiXactIdMembers() into
> MultiXactIdGetUpdateXid(), and instead of returning the members in an
> array, only return the update-xid. But it's a long and complicated
> function, so copy-pasting is not a good option. I think it needs to be
> refactored into some kind of a helper function that both
> MultiXactIdGetUpdateXid() and GetMultiXactIdMembers() could call.

While it certainly would be nice to have such a function I have my doubt
it's realistic to get it for 9.4 and the backbranches.

I haven't thought particularly much about this, but I don't really see why
the heap_page_is_all_visible() bit needs to be in a critical section? Can't
we just do that entire bit after the log_heap_clean()? Then the
heap_page_is_all_visible() can be done outside a critical section.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Not sure if this is the same issue originally reported, but the description
sounded the same. I can get this to happen pretty reliably in 9.4beta1 with
the pgtap test suite for one of my extensions -
https://github.com/omniti-labs/mimeo

It doesn't happen at the same test or with the same table every time
(usually after test08 when large write batches complete), but the error in
my postgres logs is the same every time. I've got shared_buffers set fairly
low (128MB) since this is just my test instance I run in a VM (saw you
mention running out of memory). My tap tests pass without any issues in
9.3, 9.2 and 9.1. So far every time I've run them with 9.4 it's failed at
some point when autovacuum kicked in. If you need any more info let me know.

From postgresql logs:

TRAP: FailedAssertion("!(CritSectionCount == 0 || (CurrentMemoryContext) == ErrorContext || (MyAuxProcType == CheckpointerProcess))", File: "mcxt.c", Line: 670)
2014-06-06 15:54:29 EDT [] [23575] [53921aa3.5c17-4] db=,user=,e=00000 LOG: server process (PID 25555) was terminated by signal 6: Aborted
2014-06-06 15:54:29 EDT [] [23575] [53921aa3.5c17-5] db=,user=,e=00000 DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE mimeo.mimeo_source_logdel_test_source_condition_q
2014-06-06 15:54:29 EDT [] [23575] [53921aa3.5c17-6] db=,user=,e=00000 LOG: terminating any other active server processes
2014-06-06 15:54:29 EDT [local] [[local]] [25559] [53921bee.63d7-38] db=keith,user=keith,e=57P02 WARNING: terminating connection because of crash of another server process
2014-06-06 15:54:29 EDT [local] [[local]] [25559] [53921bee.63d7-39] db=keith,user=keith,e=57P02 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2014-06-06 15:54:29 EDT [local] [[local]] [25559] [53921bee.63d7-40] db=keith,user=keith,e=57P02 HINT: In a moment you should be able to reconnect to the database and repeat your command.
2014-06-06 15:54:29 EDT [local] [[local]] [25559] [53921bee.63d7-41] db=keith,user=keith,e=57P02 CONTEXT: SQL statement "SELECT dblink_exec('mimeo_logdel_refresh_mimeo_dest.logdel_test_dest','WITH a AS (SELECT col2,col1 FROM mimeo.mimeo_source_logdel_test_source2_q ORDER BY col2,col1 LIMIT ALL) UPDATE mimeo.mimeo_source_logdel_test_source2_q b SET processed = true FROM a WHERE a.col2 = b.col2 AND a.col1 = b.col1')" PL/pgSQL function refresh_logdel(text,integer,boolean,boolean,boolean) line 171 at EXECUTE statement Back trace: #0 0x00007f61fd14f037 in raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007f61fd152698 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x000000000079ea87 in ExceptionalCondition (conditionName=conditionName@entry=0x9554b0 "!(CritSectionCount == 0 || (CurrentMemoryContext) == ErrorContext || (MyAuxProcType == CheckpointerProcess))", errorType=errorType@entry=0x7d94b0 "FailedAssertion", fileName=fileName@entry=0x95514e "mcxt.c", lineNumber=lineNumber@entry=670) at assert.c:54 #3 0x00000000007c15e0 in palloc (size=128) at mcxt.c:670 #4 0x00000000007d8923 in psprintf (fmt=0x975589 "base/%u/%u") at psprintf.c:60 #5 0x00000000006b4c17 in mdopen (behavior=EXTENSION_FAIL, forknum=MAIN_FORKNUM, reln=0x1107360) at md.c:562 #6 mdopen (reln=0x1107360, forknum=MAIN_FORKNUM, behavior=EXTENSION_FAIL) at md.c:552 #7 0x00000000006b5d47 in _mdfd_getseg (reln=reln@entry=0x1107360, forknum=forknum@entry=MAIN_FORKNUM, blkno=blkno@entry=432, skipFsync=0 '\000', behavior=behavior@entry=EXTENSION_FAIL) at md.c:1707 #8 0x00000000006b66b8 in mdwrite (reln=0x1107360, forknum=MAIN_FORKNUM, blocknum=432, buffer=0x7f61f4cf75c0 "=", skipFsync=<optimized out>) at md.c:742 #9 0x000000000068dcad in FlushBuffer (reln=0x1107360, buf=0x7f61f3c9a540) at bufmgr.c:2002 #10 FlushBuffer (buf=0x7f61f3c9a540, reln=<optimized out>) at bufmgr.c:1917 #11 0x000000000068e621 in BufferAlloc (foundPtr=0x7fffef8daa1e "\272\001\177\006", strategy=0x0, blockNum=0, forkNum=VISIBILITYMAP_FORKNUM, relpersistence=112 'p', smgr=0x1105aa8) at bufmgr.c:687 #12 ReadBuffer_common (smgr=0x1105aa8, relpersistence=<optimized out>, forkNum=forkNum@entry=VISIBILITYMAP_FORKNUM, blockNum=blockNum@entry=0, mode=RBM_ZERO_ON_ERROR, strategy=0x0, hit=hit@entry=0x7fffef8daacf "") at bufmgr.c:335 #13 0x000000000068ecab in ReadBufferExtended (reln=reln@entry=0x7f61fe3e6e70, forkNum=forkNum@entry=VISIBILITYMAP_FORKNUM, blockNum=blockNum@entry=0, mode=mode@entry=RBM_ZERO_ON_ERROR, strategy=strategy@entry=0x0) at bufmgr.c:254 #14 0x00000000004a6ef5 in vm_readbuf (rel=rel@entry=0x7f61fe3e6e70, blkno=blkno@entry=0, extend=extend@entry=0 '\000') at visibilitymap.c:563 #15 0x00000000004a7a94 in visibilitymap_test (rel=rel@entry=0x7f61fe3e6e70, heapBlk=heapBlk@entry=0, buf=buf@entry=0x7fffef8dbc4c) at visibilitymap.c:354 #16 0x00000000005b3892 in lazy_vacuum_page (onerel=onerel@entry=0x7f61fe3e6e70, blkno=blkno@entry=0, buffer=buffer@entry=2144, tupindex=<optimized out>, tupindex@entry=0, vacrelstats=vacrelstats@entry=0x10eeaf8, vmbuffer=vmbuffer@entry=0x7fffef8dbc4c) at vacuumlazy.c:1245 #17 0x00000000005b3b4f in lazy_vacuum_heap (onerel=onerel@entry=0x7f61fe3e6e70, vacrelstats=vacrelstats@entry=0x10eeaf8) at vacuumlazy.c:1152 #18 0x00000000005b42ee in lazy_scan_heap (scan_all=0 '\000', nindexes=2, Irel=<optimized out>, vacrelstats=0x10eeaf8, onerel=0x7f61fe3e6e70) at vacuumlazy.c:1080 #19 lazy_vacuum_rel (onerel=onerel@entry=0x7f61fe3e6e70, vacstmt=vacstmt@entry=0x7fffef8dc360, bstrategy=<optimized out>) at vacuumlazy.c:239 #20 
0x00000000005b1e3a in vacuum_rel (relid=relid@entry=1158010, vacstmt=vacstmt@entry=0x7fffef8dc360, do_toast=do_toast@entry=0 '\000', for_wraparound=for_wraparound@entry=0 '\000') at vacuum.c:1202 #21 0x00000000005b2c66 in vacuum (vacstmt=vacstmt@entry=0x7fffef8dc360, relid=<optimized out>, do_toast=do_toast@entry=0 '\000', bstrategy=<optimized out>, bstrategy@entry=0x114a380, for_wraparound=0 '\000', isTopLevel=isTopLevel@entry=1 '\001') at vacuum.c:234 #22 0x00000000006515d3 in autovacuum_do_vac_analyze (bstrategy=0x114a380, tab=0x114a498) at autovacuum.c:2796 #23 do_autovacuum () at autovacuum.c:2326 #24 0x0000000000651b15 in AutoVacWorkerMain (argv=0x0, argc=0) at autovacuum.c:1678 #25 0x00000000006521c2 in StartAutoVacWorker () at autovacuum.c:1463 ---Type <return> to continue, or q <return> to quit--- #26 0x000000000065f89d in StartAutovacuumWorker () at postmaster.c:5188 #27 sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:4842 #28 <signal handler called> #29 0x00007f61fd20ae03 in select () from /lib/x86_64-linux-gnu/libc.so.6 #30 0x000000000045f985 in ServerLoop () at postmaster.c:1530 #31 0x00000000006608ae in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1085910) at postmaster.c:1223 #32 0x0000000000460c28 in main (argc=3, argv=0x1085910) at main.c:225 -- Keith Fiske Database Administrator OmniTI Computer Consulting, Inc. http://www.keithf4.com On Thu, Jun 5, 2014 at 5:34 PM, Andres Freund <andres@2ndquadrant.com> wrote: > Hi, > > On 2014-06-05 21:42:23 +0300, Heikki Linnakangas wrote: > > I think we'll need a variant of GetMultiXactIdMembers() that only returns > > the update XID, avoiding the palloc(). The straight-forward fix would be > to > > copy-paste contents of GetMultiXactIdMembers() into > > MultiXactIdGetUpdateXid(), and instead of returning the members in an > array, > > only return the update-xid. But it's a long and complicated function, so > > copy-pasting is not a good option. I think it needs to be refactored into > > some kind of a helper function that both MultiXactIdGetUpdateXid() and > > GetMultiXactIdMembers() could call. > > While it certainly would be nice to have such a function I have my doubt > it's realistic to get it for 9.4 and the backbranches. > > I haven't thought particularly much about this, but I don't really see > why the heap_page_is_all_visible() bit needs to be in a critical > section? Can't we just do that entire bit after the log_heap_clean()? > Then the heap_page_is_all_visible() can be done outside a critical > section. > > Greetings, > > Andres Freund > > -- > Andres Freund http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs >
Keith Fiske wrote:

> Not sure if this is the same issue originally reported, but the description
> sounded the same. I can get this to happen pretty reliably in 9.4beta1 with
> the pgtap test suite for one of my extensions -

Uh, this is a completely different problem. We discussed long ago that those
pallocs in relpath() were going to cause a problem:

> #2 0x000000000079ea87 in ExceptionalCondition (conditionName=conditionName@entry=0x9554b0 "!(CritSectionCount == 0 || (CurrentMemoryContext) == ErrorContext || (MyAuxProcType == CheckpointerProcess))", errorType=errorType@entry=0x7d94b0 "FailedAssertion", fileName=fileName@entry=0x95514e "mcxt.c", lineNumber=lineNumber@entry=670) at assert.c:54
> #3 0x00000000007c15e0 in palloc (size=128) at mcxt.c:670
> #4 0x00000000007d8923 in psprintf (fmt=0x975589 "base/%u/%u") at psprintf.c:60
> #5 0x00000000006b4c17 in mdopen (behavior=EXTENSION_FAIL, forknum=MAIN_FORKNUM, reln=0x1107360) at md.c:562

--
Álvaro Herrera                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-06-06 16:55:58 -0400, Alvaro Herrera wrote:
> Keith Fiske wrote:
> > Not sure if this is the same issue originally reported, but the
> > description sounded the same. I can get this to happen pretty reliably
> > in 9.4beta1 with the pgtap test suite for one of my extensions -
>
> Uh, this is a completely different problem. We discussed long ago that
> those pallocs in relpath() were going to cause a problem:

I actually don't think it's a different problem. If we'd restructure things
so the critical sections are separate this wouldn't be a problem. It's imo
not a particularly good idea to mdopen() inside a critical section either.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Just tried this again against the lastest HEAD (fe7337f2dc3177da19b647fcda3a33b73da82e14) and still running into same thing. Failed at different pgtap test point, but core dump seems the same. TRAP: FailedAssertion("!(CritSectionCount == 0 || (CurrentMemoryContext) == ErrorContext || (MyAuxProcType == CheckpointerProcess))", File: "mcxt.c", Line: 670) LOG: server process (PID 20484) was terminated by signal 6: Aborted DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE mimeo_source.updater_test_source LOG: terminating any other active server processes #0 0x00007fa886b2e037 in raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007fa886b31698 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x000000000079ca87 in ExceptionalCondition ( conditionName=conditionName@entry=0x952890 "!(CritSectionCount == 0 || (CurrentMemoryContext) == ErrorContext || (MyAuxProcType == CheckpointerProcess))", errorType=errorType@entry=0x7d7470 "FailedAssertion", fileName=fileName@entry=0x95252e "mcxt.c", lineNumber=lineNumber@entry=670) at assert.c:54 #3 0x00000000007bf510 in palloc (size=128) at mcxt.c:670 #4 0x00000000007d68e3 in psprintf (fmt=0x972a49 "base/%u/%u") at psprintf.c:60 #5 0x00000000006b2417 in mdopen (behavior=EXTENSION_FAIL, forknum=MAIN_FORKNUM, reln=0x16502d0) at md.c:562 #6 mdopen (reln=0x16502d0, forknum=MAIN_FORKNUM, behavior=EXTENSION_FAIL) at md.c:552 #7 0x00000000006b3547 in _mdfd_getseg (reln=reln@entry=0x16502d0, forknum=forknum@entry=MAIN_FORKNUM, blkno=blkno@entry=4, skipFsync=0 '\000', behavior=behavior@entry=EXTENSION_FAIL) at md.c:1707 #8 0x00000000006b3eb8 in mdwrite (reln=0x16502d0, forknum=MAIN_FORKNUM, blocknum=4, buffer=0x7fa87e62f5c0 "\001", skipFsync=<optimized out>) at md.c:742 #9 0x000000000068b50d in FlushBuffer (reln=0x16502d0, buf=0x7fa87d88ad80) at bufmgr.c:2002 #10 FlushBuffer (buf=0x7fa87d88ad80, reln=<optimized out>) at bufmgr.c:1917 #11 0x000000000068be81 in BufferAlloc (foundPtr=0x7fffd5a8081e "", strategy=0x0, blockNum=0, forkNum=VISIBILITYMAP_FORKNUM, relpersistence=112 'p', smgr=0x164e7d8) at bufmgr.c:687 #12 ReadBuffer_common (smgr=0x164e7d8, relpersistence=<optimized out>, forkNum=forkNum@entry=VISIBILITYMAP_FORKNUM, blockNum=blockNum@entry=0, mode=RBM_ZERO_ON_ERROR, strategy=0x0, hit=hit@entry=0x7fffd5a808cf "") at bufmgr.c:335 #13 0x000000000068c50b in ReadBufferExtended (reln=reln@entry=0x7fa88778ff30, forkNum=forkNum@entry=VISIBILITYMAP_FORKNUM, blockNum=blockNum@entry=0, mode=mode@entry=RBM_ZERO_ON_ERROR, strategy=strategy@entry=0x0) at bufmgr.c:254 #14 0x00000000004a5c55 in vm_readbuf (rel=rel@entry=0x7fa88778ff30, blkno=blkno@entry=0, extend=extend@entry=0 '\000') at visibilitymap.c:563 #15 0x00000000004a67f4 in visibilitymap_test (rel=rel@entry=0x7fa88778ff30, heapBlk=heapBlk@entry=0, buf=buf@entry=0x7fffd5a81a4c) at visibilitymap.c:354 #16 0x00000000005b2a12 in lazy_vacuum_page (onerel=onerel@entry=0x7fa88778ff30, blkno=blkno@entry=0, buffer=buffer@entry=5091, tupindex=<optimized out>, tupindex@entry=0, vacrelstats=vacrelstats@entry=0x1637a68, vmbuffer=vmbuffer@entry=0x7fffd5a81a4c) at vacuumlazy.c:1245 #17 0x00000000005b2ccf in lazy_vacuum_heap (onerel=onerel@entry=0x7fa88778ff30, vacrelstats=vacrelstats@entry=0x1637a68) at vacuumlazy.c:1152 #18 0x00000000005b346e in lazy_scan_heap (scan_all=0 '\000', nindexes=3, Irel=<optimized out>, vacrelstats=0x1637a68, onerel=0x7fa88778ff30) at vacuumlazy.c:1080 #19 lazy_vacuum_rel (onerel=onerel@entry=0x7fa88778ff30, vacstmt=vacstmt@entry=0x7fffd5a82160, bstrategy=<optimized out>) at 
vacuumlazy.c:239 #20 0x00000000005b0fca in vacuum_rel (relid=relid@entry=25939, vacstmt=vacstmt@entry=0x7fffd5a82160, do_toast=do_toast@entry=0 '\000', for_wraparound=for_wraparound@entry=0 '\000') at vacuum.c:1202 #21 0x00000000005b1de6 in vacuum (vacstmt=vacstmt@entry=0x7fffd5a82160, relid=<optimized out>, do_toast=do_toast@entry=0 '\000', bstrategy=<optimized out>, bstrategy@entry=0x1693458, for_wraparound=0 '\000', isTopLevel=isTopLevel@entry=1 '\001') at vacuum.c:234 ---Type <return> to continue, or q <return> to quit--- #22 0x000000000064ee53 in autovacuum_do_vac_analyze (bstrategy=0x1693458, tab=0x1693570) at autovacuum.c:2796 #23 do_autovacuum () at autovacuum.c:2326 #24 0x000000000064f395 in AutoVacWorkerMain (argv=0x0, argc=0) at autovacuum.c:1678 #25 0x000000000064fa42 in StartAutoVacWorker () at autovacuum.c:1463 #26 0x000000000065d0cd in StartAutovacuumWorker () at postmaster.c:5188 #27 sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:4842 #28 <signal handler called> #29 0x00007fa886be9e03 in select () from /lib/x86_64-linux-gnu/libc.so.6 #30 0x000000000045e702 in ServerLoop () at postmaster.c:1530 #31 0x000000000065dfbe in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x15d2930) at postmaster.c:1223 #32 0x000000000045f9b2 in main (argc=3, argv=0x15d2930) at main.c:227 -- Keith Fiske Database Administrator OmniTI Computer Consulting, Inc. http://www.keithf4.com On Fri, Jun 6, 2014 at 4:58 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-06-06 16:55:58 -0400, Alvaro Herrera wrote: > > Keith Fiske wrote: > > > Not sure if this is the same issue originally reported, but the > description > > > sounded the same. I can get this to happen pretty reliably in 9.4beta1 > with > > > the pgtap test suite for one of my extensions - > > > > Uh, this is a completely different problem. We discussed long ago that > > those pallocs in relpath() were going to cause a problem: > > I actually don't think it's a different problem. If we'd restructure > things so the critical sections are separate this wouldn't be a > problem. It's imo not a particularly good idea to mdopen() inside a > critical section either. > > Greetings, > > Andres Freund > > -- > Andres Freund http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-06-06 16:55:58 -0400, Alvaro Herrera wrote:
>> Uh, this is a completely different problem. We discussed long ago that
>> those pallocs in relpath() were going to cause a problem:

> I actually don't think it's a different problem. If we'd restructure
> things so the critical sections are separate this wouldn't be a
> problem. It's imo not a particularly good idea to mdopen() inside a
> critical section either.

The point here seems to be that lazy_vacuum_page does the visibility map
ops inside its own critical section. Why? Setting a visibility bit doesn't
seem like it's critical. Why can't we just move the END_CRIT_SECTION() to
before the PageIsAllVisible test?

			regards, tom lane
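To make the reordering Tom suggests concrete, here is a self-contained toy
model (not vacuumlazy.c; the function names, the fake critical-section
counter, and the printf are stand-ins). The allocation-prone visibility
check runs only after the critical section has ended, so an assertion
equivalent to the one in mcxt.c cannot fire:

    /* Standalone illustration only, not PostgreSQL source. */
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>

    static int CritSectionCount = 0;

    static void *
    palloc_like(size_t size)
    {
        assert(CritSectionCount == 0);  /* mirrors the check that fired */
        return malloc(size);
    }

    /* Stand-in for heap_page_is_all_visible(): may need to allocate. */
    static int
    page_is_all_visible(void)
    {
        void *scratch = palloc_like(16);

        free(scratch);
        return 1;
    }

    static void
    lazy_vacuum_page_reordered(void)
    {
        CritSectionCount++;             /* START_CRIT_SECTION() */
        /* ... mark dead tuples unused, MarkBufferDirty(), emit WAL ... */
        CritSectionCount--;             /* END_CRIT_SECTION() moved up */

        /* Visibility-map work now happens outside the critical section. */
        if (page_is_all_visible())
            printf("would set PD_ALL_VISIBLE and the VM bit here\n");
    }

    int
    main(void)
    {
        lazy_vacuum_page_reordered();
        return 0;
    }

With the original ordering, the visibility check sitting before the
END_CRIT_SECTION(), the assert() trips exactly as in the reported
backtrace.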
On 2014-06-06 18:03:53 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-06-06 16:55:58 -0400, Alvaro Herrera wrote:
> >> Uh, this is a completely different problem. We discussed long ago that
> >> those pallocs in relpath() were going to cause a problem:
>
> > I actually don't think it's a different problem. If we'd restructure
> > things so the critical sections are separate this wouldn't be a
> > problem. It's imo not a particularly good idea to mdopen() inside a
> > critical section either.
>
> The point here seems to be that lazy_vacuum_page does the visibility map
> ops inside its own critical section. Why? Setting a visibility bit
> doesn't seem like it's critical. Why can't we just move the
> END_CRIT_SECTION() to before the PageIsAllVisible test?

Yea, that's what I am proposing upthread. If we move the visibility tests
out of the critical section this will get rid of the original report as
well.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-06-06 18:03:53 -0400, Tom Lane wrote:
>> The point here seems to be that lazy_vacuum_page does the visibility map
>> ops inside its own critical section. Why? Setting a visibility bit
>> doesn't seem like it's critical. Why can't we just move the
>> END_CRIT_SECTION() to before the PageIsAllVisible test?

> Yea, that's what I am proposing upthread. If we move the visibility
> tests out of the critical section this will get rid of the original
> report as well.

I went trolling for other critical sections ...

lazy_scan_heap has same disease, but looks like it can be fixed same way.

Also, there are a bunch of fsync_fname() calls inside critical sections in
replication/slot.c. Seems at best pretty damn risky; what's more, the
critical sections cover only the fsyncs and not anything else, which is
flat out broken. If it was okay to fail just before calling the fsync, why
is it critical to not fail inside it? Somebody was not thinking clearly
there.

			regards, tom lane
On 2014-06-06 18:21:45 -0400, Tom Lane wrote:
> Also, there are a bunch of fsync_fname() calls inside critical sections in
> replication/slot.c. Seems at best pretty damn risky; what's more, the
> critical sections cover only the fsyncs and not anything else, which is
> flat out broken. If it was okay to fail just before calling the fsync,
> why is it critical to not fail inside it? Somebody was not thinking
> clearly there.

No, it actually makes sense. If:

* the open, write or fsync to the temp file fails: no permanent state has
  changed. We can gracefully error out.
* rename(tmpfile, realname) fails: we know (by posix) that the file hasn't
  been renamed. The old state is still valid.
* the fsync() to the new file fails (damn unlikely): we don't know which
  state is valid. So if we'd crash in that moment we might lose our
  reservation on resources (e.g. catalog xmin). And might start to decode
  with the wrong catalog state. Bad. On startup we'll try to fsync the slot
  files again, so we won't start up until that's clear.

Why is it that risky? We fdatasync() files while inside a critical section
all the time. And we've done the space allocation (the fsync on the old
filename) and the rename() outside the critical section.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
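Andres's case analysis maps onto the familiar write-temp-file-then-rename
pattern. A minimal standalone sketch (plain POSIX C, not the
replication/slot.c code; the file names, return codes and simplified error
handling are assumptions) shows why only the final fsync() needs
critical-section treatment: every earlier failure leaves the old file
authoritative, while a failure after the rename leaves it unknown which
version is durable.

    /* Standalone illustration only, not PostgreSQL source. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static int
    save_state(const char *path, const char *tmppath, const char *data)
    {
        ssize_t len = (ssize_t) strlen(data);
        int     fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);

        if (fd < 0)
            return -1;                      /* old state still valid */
        if (write(fd, data, (size_t) len) != len || fsync(fd) != 0)
        {
            close(fd);
            unlink(tmppath);
            return -1;                      /* old state still valid */
        }
        close(fd);

        if (rename(tmppath, path) != 0)
            return -1;                      /* POSIX: old file untouched */

        /*
         * From here on a failure leaves the durable state ambiguous; this
         * is the part the critical section covers, so an error there gets
         * promoted to PANIC.
         */
        fd = open(path, O_WRONLY);
        if (fd < 0 || fsync(fd) != 0)
        {
            if (fd >= 0)
                close(fd);
            return -2;                      /* "critical" failure */
        }
        close(fd);
        return 0;
    }

    int
    main(void)
    {
        int rc = save_state("slot.state", "slot.state.tmp", "example state\n");

        printf("save_state returned %d\n", rc);
        return 0;
    }

As Andres notes, the ambiguous case is resolved at startup by fsyncing the
slot files again before the server continues.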
On 2014-06-05 23:34:36 +0200, Andres Freund wrote:
> Hi,
>
> On 2014-06-05 21:42:23 +0300, Heikki Linnakangas wrote:
> > I think we'll need a variant of GetMultiXactIdMembers() that only returns
> > the update XID, avoiding the palloc(). The straight-forward fix would be
> > to copy-paste contents of GetMultiXactIdMembers() into
> > MultiXactIdGetUpdateXid(), and instead of returning the members in an
> > array, only return the update-xid. But it's a long and complicated
> > function, so copy-pasting is not a good option. I think it needs to be
> > refactored into some kind of a helper function that both
> > MultiXactIdGetUpdateXid() and GetMultiXactIdMembers() could call.
>
> While it certainly would be nice to have such a function I have my doubt
> it's realistic to get it for 9.4 and the backbranches.
>
> I haven't thought particularly much about this, but I don't really see
> why the heap_page_is_all_visible() bit needs to be in a critical
> section? Can't we just do that entire bit after the log_heap_clean()?
> Then the heap_page_is_all_visible() can be done outside a critical
> section.

Before I start working on a patch along those lines, do you see any
problems with making the critical section smaller?

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 06/09/2014 03:46 PM, Andres Freund wrote:
>> I haven't thought particularly much about this, but I don't really see
>> why the heap_page_is_all_visible() bit needs to be in a critical
>> section? Can't we just do that entire bit after the log_heap_clean()?
>> Then the heap_page_is_all_visible() can be done outside a critical
>> section.
> Before I start working on a patch along those lines, do you see any
> problems with making the critical section smaller?

One subtle difference is that the PD_ALL_VISIBLE flag will not be included
in the full-page-image that log_heap_page() might take. But that seems OK.
visibilitymap_set() writes a WAL record that sets it at replay.

- Heikki
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> On 06/09/2014 03:46 PM, Andres Freund wrote:
>> Before I start working on a patch along those lines, do you see any
>> problems with making the critical section smaller?

> One subtle difference is that the PD_ALL_VISIBLE flag will not be
> included in the full-page-image that log_heap_page() might take. But
> that seems OK. visibilitymap_set() writes a WAL record that sets it at
> replay.

Huh? The visibility map call is after the page image would be taken in any
case.

			regards, tom lane
On 2014-06-09 10:21:54 -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> > On 06/09/2014 03:46 PM, Andres Freund wrote:
> >> Before I start working on a patch along those lines, do you see any
> >> problems with making the critical section smaller?
>
> > One subtle difference is that the PD_ALL_VISIBLE flag will not be
> > included in the full-page-image that log_heap_page() might take. But
> > that seems OK. visibilitymap_set() writes a WAL record that sets it at
> > replay.
>
> Huh? The visibility map call is after the page image would be taken in
> any case.

But the heap_page_is_all_visible() and PageSetAllVisible() currently
aren't. And both would have to be moved to fix the issue...

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-06-06 18:21:45 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-06-06 18:03:53 -0400, Tom Lane wrote:
> >> The point here seems to be that lazy_vacuum_page does the visibility map
> >> ops inside its own critical section. Why? Setting a visibility bit
> >> doesn't seem like it's critical. Why can't we just move the
> >> END_CRIT_SECTION() to before the PageIsAllVisible test?
>
> > Yea, that's what I am proposing upthread. If we move the visibility
> > tests out of the critical section this will get rid of the original
> > report as well.
>
> I went trolling for other critical sections ...
>
> lazy_scan_heap has same disease, but looks like it can be fixed same way.

Hm. I'm writing a fix for this issue right now and went to look into
lazy_scan_heap() - but I'm not seeing a problem atm. Do you mean that bit:

    START_CRIT_SECTION();

    /* mark buffer dirty before writing a WAL record */
    MarkBufferDirty(buf);

    /*
     * It's possible that another backend has extended the heap,
     * initialized the page, and then failed to WAL-log the page
     * due to an ERROR. Since heap extension is not WAL-logged,
     * recovery might try to replay our record setting the page
     * all-visible and find that the page isn't initialized, which
     * will cause a PANIC. To prevent that, check whether the
     * page has been previously WAL-logged, and if not, do that
     * now.
     */
    if (RelationNeedsWAL(onerel) &&
        PageGetLSN(page) == InvalidXLogRecPtr)
        log_newpage_buffer(buf, true);

    PageSetAllVisible(page);
    visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
                      vmbuffer, InvalidTransactionId);
    END_CRIT_SECTION();

If so, I think that should actually be safe - visibilitymap_set() shouldn't
ever do IO - that's why it gets the vmbuffer argument passed in. Now, afaics
there isn't really any need for the critical section in this case except
for the assertion in log_newpage_buffer(). So, afaics, we could just do the
END_CRIT_SECTION() two lines earlier, to keep it as short as possible.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi Heikki, all,

On 2014-06-09 16:04:06 +0300, Heikki Linnakangas wrote:
> On 06/09/2014 03:46 PM, Andres Freund wrote:
>>> I haven't thought particularly much about this, but I don't really see
>>> why the heap_page_is_all_visible() bit needs to be in a critical
>>> section? Can't we just do that entire bit after the log_heap_clean()?
>>> Then the heap_page_is_all_visible() can be done outside a critical
>>> section.
>> Before I start working on a patch along those lines, do you see any
>> problems with making the critical section smaller?
>
> One subtle difference is that the PD_ALL_VISIBLE flag will not be included
> in the full-page-image that log_heap_page() might take. But that seems OK.
> visibilitymap_set() writes a WAL record that sets it at replay.

Heikki, you've changed things around a bit in 2a8e1ac598. That commit
message mentions that there's still a difference in which order things
happen on the master vs. the standby. Am I seeing it correctly that
reordering things as hinted at above (and in the attached patch) fully gets
rid of that problem? And doesn't reintroduce what you fixed?

I've verified that the crash occurs without the patch and doesn't with it
applied. I'm too tired right now to think enough about my fix to push it.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Hi David, Keith,

I've pushed a fix for the problem.

On 2014-06-05 18:01:21 +0000, levertond@googlemail.com wrote:
> Our application's test suite triggers an assertion failure in an autovacuum
> process under 9.4 beta1. I wasn't able to reduce it to a nice test case,
> but I hope the backtrace illustrates the problem:

Any chance you could rerun the testsuite now (using master) or after beta2?

Thanks!

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 20, 2014 at 6:11 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi David, Keith,
>
> I've pushed a fix for the problem.
>
> On 2014-06-05 18:01:21 +0000, levertond@googlemail.com wrote:
> > Our application's test suite triggers an assertion failure in an
> > autovacuum process under 9.4 beta1. I wasn't able to reduce it to a nice
> > test case, but I hope the backtrace illustrates the problem:
>
> Any chance you could rerun the testsuite now (using master) or after
> beta2?
>
> Thanks!
>
> Andres Freund
>
> --
> Andres Freund                    http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

Andres,

Running my test suite against current HEAD
(ecac0e2b9e8e8e78d771b20fe441e95bb02db2fa), this seems to have fixed it at
least in my case. I confirmed autovacuum was running during the later tests.
I ran it 4 consecutive times with zero errors.

Thanks!

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
http://www.keithf4.com
Andres Freund wrote:
> I've pushed a fix for the problem.

Cool, thanks.

> Any chance you could rerun the testsuite now (using master) or after
> beta2?

Will check with beta2, it's a bit of a pain to build things in our current
environment, unfortunately.
David Leverton wrote:
> Andres Freund wrote:
>> Any chance you could rerun the testsuite now (using master) or after
>> beta2?
>
> Will check with beta2, it's a bit of a pain to build things in our
> current environment, unfortunately.

Just to confirm, this is working fine now with beta2.