HEAD crashes with assertion and LWLOCK_STATS enabled - Mailing list pgsql-hackers
From | Yuto HAYAMIZU |
---|---|
Subject | HEAD crashes with assertion and LWLOCK_STATS enabled |
Date | |
Msg-id | 537B0C2E.6090706@gmail.com |
Responses | Re: HEAD crashes with assertion and LWLOCK_STATS enabled |
List | pgsql-hackers |
Hi hackers,

I found a bug that causes a crash when assertions are enabled and LWLOCK_STATS is defined. I tested on Debian 7.5 (3.2.0-4-amd64) on VMware Fusion 6, but the bug appears to be platform-independent and should reproduce in other environments. A patch to fix the bug is also attached.

## Reproducing the crash

You can reproduce the crash as follows:

    git checkout a0841ecd2518d4505b96132b764b918ab5d21ad4
    git clean -dfx
    ./configure --enable-cassert CFLAGS='-DLWLOCK_STATS'
    make check

In my environment, the following messages appeared:

    ( omit... )
    ../../../src/test/regress/pg_regress --inputdir=. --temp-install=./tmp_check --top-builddir=../../.. --dlpath=. --schedule=./parallel_schedule
    ============== creating temporary installation ==============
    ============== initializing database system ==============
    pg_regress: initdb failed

and initdb.log contained the following messages:

    creating directory /tmp/pghead/src/test/regress/./tmp_check/data ... ok
    creating subdirectories ... ok
    selecting default max_connections ... 100
    selecting default shared_buffers ... 128MB
    selecting dynamic shared memory implementation ... posix
    creating configuration files ... ok
    creating template1 database in /tmp/pghead/src/test/regress/./tmp_check/data/base/1 ...
    PID 48239 lwlock main 142: shacq 0 exacq 1 blk 0 spindelay 0
    ( omit... )
    PID 48247 lwlock main 33058: shacq 0 exacq 1 blk 0 spindelay 0
    PID 48247 lwlock main 33005: shacq 0 exacq 48 blk 0 spindelay 0
    ok
    loading system objects' descriptions ... TRAP: FailedAssertion("!(CritSectionCount == 0 || (context) == ErrorContext || (MyAuxProcType == CheckpointerProcess))", File: "mcxt.c", Line: 594)
    Aborted (core dumped)
    child process exited with exit code 134
    initdb: data directory "/tmp/pghead/src/test/regress/./tmp_check/data" not removed at user's request

## The cause of the crash

The failing assertion prohibits memory allocation in a critical section. It was introduced by commit 4a170ee9 on 2014-04-04.
In my understanding, the root cause of the assertion failure is the on-demand allocation of lwlock_stats entries. For each LWLock, an lwlock_stats entry is created with MemoryContextAlloc at the first invocation of LWLockAcquire. If that first invocation happens inside a critical section, the assertion fails.

In the 'initdb' case mentioned above, acquiring WALWriteLock in the XLogFlush function was the problem. I also confirmed the assertion failure when starting postgres on a correctly initialized database. In that case, acquiring CheckpointerCommLock in the AbsorbFsyncRequests function was the problem.

## A solution

To avoid memory allocation during critical sections, the lwlock_stats hash table should be populated when each process is initialized. The attached patch populates the lwlock_stats entries for MainLWLockArray at the end of CreateLWLocks, InitProcess and InitAuxiliaryProcess.

With this patch, all regression tests pass so far, but I think the patch is not perfect because it does not cover LWLocks outside of MainLWLockArray. I'm not sure where the right place is to initialize lwlock_stats entries for those locks, so I feel it needs some refinement from you hackers.
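
The failure mode and the proposed remedy can be illustrated with a small standalone model. This is only a sketch and not the attached patch: every identifier in it (`checked_alloc`, `get_stats`, `prepopulate_stats`, `NUM_LOCKS`, and so on) is hypothetical, a fixed-size array stands in for the lwlock_stats hash table, and the allocator counts a violation where the real assertion would abort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not PostgreSQL identifiers. */
#define NUM_LOCKS 8                 /* stands in for the size of MainLWLockArray */

typedef struct LockStats
{
    int exacq_count;                /* number of exclusive acquisitions */
} LockStats;

int        crit_section_count = 0;  /* models CritSectionCount */
int        alloc_violations = 0;    /* allocations made inside a critical section */
LockStats *stats_table[NUM_LOCKS];  /* NULL until the entry is created */

/* Models an allocator that must not be called in a critical section.
 * The real assertion aborts; this sketch only counts the violation. */
void *
checked_alloc(size_t size)
{
    void *p;

    if (crit_section_count > 0)
        alloc_violations++;
    p = calloc(1, size);
    if (p == NULL)
        abort();
    return p;
}

/* Return the stats entry for a lock, creating it on first use. */
LockStats *
get_stats(int lock_id)
{
    assert(lock_id >= 0 && lock_id < NUM_LOCKS);
    if (stats_table[lock_id] == NULL)
        stats_table[lock_id] = checked_alloc(sizeof(LockStats));
    return stats_table[lock_id];
}

/* Models the stats bookkeeping done when a lock is acquired. */
void
acquire_lock(int lock_id)
{
    get_stats(lock_id)->exacq_count++;
}

/* The remedy: create every entry at process start, outside any critical section. */
void
prepopulate_stats(void)
{
    for (int i = 0; i < NUM_LOCKS; i++)
        (void) get_stats(i);
}

/* Free all entries and reset the counters so a scenario can be rerun. */
void
reset_sketch(void)
{
    for (int i = 0; i < NUM_LOCKS; i++)
    {
        free(stats_table[i]);
        stats_table[i] = NULL;
    }
    crit_section_count = 0;
    alloc_violations = 0;
}
```

In this model, calling `acquire_lock(3)` for the first time while `crit_section_count` is nonzero increments `alloc_violations`, which corresponds to the failing assertion. If `prepopulate_stats()` runs first, the same call finds an existing entry and allocates nothing. Locks outside the pre-populated range would still hit the on-demand path, which mirrors the limitation noted above for LWLocks outside of MainLWLockArray.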
Attachment