HEAD crashes with assertion and LWLOCK_STATS enabled - Mailing list pgsql-hackers
| From | Yuto HAYAMIZU |
|---|---|
| Subject | HEAD crashes with assertion and LWLOCK_STATS enabled |
| Date | |
| Msg-id | 537B0C2E.6090706@gmail.com |
| Responses | Re: HEAD crashes with assertion and LWLOCK_STATS enabled |
| List | pgsql-hackers |
Hi hackers,
I found a bug that causes a crash when assertions are enabled and LWLOCK_STATS is defined.
I've tested with Debian 7.5 (3.2.0-4-amd64) on VMware Fusion 6, but this bug seems to be platform-independent and
should reproduce in other environments.
A patch to fix the bug is also attached.
## Reproducing the crash
You can reproduce the crash as follows:
git checkout a0841ecd2518d4505b96132b764b918ab5d21ad4
git clean -dfx
./configure --enable-cassert CFLAGS='-DLWLOCK_STATS'
make check
In my environment, the following messages appeared.
( omit... )
../../../src/test/regress/pg_regress --inputdir=. --temp-install=./tmp_check --top-builddir=../../.. --dlpath=.
--schedule=./parallel_schedule
============== creating temporary installation ==============
============== initializing database system ==============
pg_regress: initdb failed
and initdb.log contained the following messages.
creating directory /tmp/pghead/src/test/regress/./tmp_check/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
creating template1 database in /tmp/pghead/src/test/regress/./tmp_check/data/base/1 ... PID 48239 lwlock main 142:
shacq 0 exacq 1 blk 0 spindelay 0
( omit... )
PID 48247 lwlock main 33058: shacq 0 exacq 1 blk 0 spindelay 0
PID 48247 lwlock main 33005: shacq 0 exacq 48 blk 0 spindelay 0
ok
loading system objects' descriptions ... TRAP: FailedAssertion("!(CritSectionCount == 0 || (context) ==
ErrorContext || (MyAuxProcType == CheckpointerProcess))", File: "mcxt.c", Line: 594)
Aborted (core dumped)
child process exited with exit code 134
initdb: data directory "/tmp/pghead/src/test/regress/./tmp_check/data" not removed at user's request
## The cause of the crash
The failing assertion prohibits memory allocation in a critical section. It was introduced by commit
4a170ee9 on 2014-04-04.
In my understanding, the root cause of the assertion failure is the on-demand allocation of lwlock_stats entries. For each
LWLock, an lwlock_stats entry is created with MemoryContextAlloc at the first invocation of LWLockAcquire. If that first
invocation happens inside a critical section, the assertion fails.
In the initdb case mentioned above, acquiring WALWriteLock in the XLogFlush function was the problem.
I also confirmed the assertion failure when starting postgres on a correctly initialized database. In this case, acquiring
CheckpointerCommLock in the AbsorbFsyncRequests function was the problem.
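
To make the failure mode concrete, here is a minimal stand-alone sketch in C. It is a toy model, not PostgreSQL code: every `toy_` name is hypothetical, the stats "table" is a fixed array instead of a hash table, and the function returns false at the point where a real assert-enabled build would trap.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_LOCKS 4                 /* hypothetical size, not a PostgreSQL constant */

static int  toy_crit_section_count = 0; /* stands in for CritSectionCount */
static bool toy_entry_present[TOY_NUM_LOCKS]; /* "stats entry exists" flags */
static int  toy_exacq[TOY_NUM_LOCKS];   /* exclusive-acquire counters */

/* Models the rule checked by the assertion: no allocation in a critical section. */
static bool toy_alloc_allowed(void)
{
    return toy_crit_section_count == 0;
}

/*
 * Models the lazy path: the stats entry is created on the first acquire.
 * Returns false where the assert-enabled build would report FailedAssertion.
 */
static bool toy_lock_acquire(int lockid)
{
    assert(lockid >= 0 && lockid < TOY_NUM_LOCKS);

    if (!toy_entry_present[lockid])
    {
        if (!toy_alloc_allowed())
            return false;               /* first use happened in a critical section */
        toy_entry_present[lockid] = true;
    }
    toy_exacq[lockid]++;
    return true;
}
```

In this model, a lock whose first acquire happens while `toy_crit_section_count` is nonzero fails, while a lock that was already acquired once outside a critical section can be acquired inside one without trouble. That matches the pattern described above for WALWriteLock and CheckpointerCommLock.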
## A solution
To avoid memory allocation during critical sections, the lwlock_stats hash table should be populated at the
initialization of each process.
The attached patch populates the lwlock_stats entries for MainLWLockArray at the end of CreateLWLocks, InitProcess and
InitAuxiliaryProcess.
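
The idea can be illustrated with a small stand-alone C sketch. This is not the attached patch: all `sketch_` names are hypothetical, a fixed array stands in for the hash table, and a counter stands in for calls to the allocator. It only shows that once every entry of the main array is created at process start, later acquires never need to allocate.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUM_MAIN_LOCKS 8         /* hypothetical size of the main lock array */

typedef struct
{
    bool present;                       /* has the entry been created? */
    int  shacq;                         /* shared acquires */
    int  exacq;                         /* exclusive acquires */
} sketch_stats_entry;

static sketch_stats_entry sketch_stats[SKETCH_NUM_MAIN_LOCKS];
static int sketch_alloc_count = 0;      /* counts simulated allocations */

/* Simulates allocating one stats entry. */
static void sketch_create_entry(int lockid)
{
    sketch_stats[lockid].present = true;
    sketch_alloc_count++;
}

/* Called once at process start, outside any critical section. */
static void sketch_init_lwlock_stats(void)
{
    for (int i = 0; i < SKETCH_NUM_MAIN_LOCKS; i++)
        if (!sketch_stats[i].present)
            sketch_create_entry(i);
}

/* Counts one acquire; returns true if no allocation was needed. */
static bool sketch_count_acquire(int lockid, bool exclusive)
{
    assert(lockid >= 0 && lockid < SKETCH_NUM_MAIN_LOCKS);

    bool needed_alloc = !sketch_stats[lockid].present;
    if (needed_alloc)
        sketch_create_entry(lockid);    /* the lazy path the fix aims to avoid */

    if (exclusive)
        sketch_stats[lockid].exacq++;
    else
        sketch_stats[lockid].shacq++;
    return !needed_alloc;
}
```

After `sketch_init_lwlock_stats()` has run, `sketch_count_acquire()` returns true for every lock in the array, so the counting path is safe to run inside a critical section. Locks that live outside the pre-populated array would still take the lazy path.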
With this patch, all regression tests pass so far, but I think the patch is not perfect because it does not
cover LWLocks outside of MainLWLockArray. I'm not sure where the right place is to initialize lwlock_stats entries for
those locks, so I feel it needs some refinement by you hackers.
Attachment