BUG #16990: Random PANIC in qemu user context - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #16990: Random PANIC in qemu user context |
Date | |
Msg-id | 16990-10b586bc699fd234@postgresql.org |
Responses | Re: BUG #16990: Random PANIC in qemu user context |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference: 16990
Logged by: Paul Guyot
Email address: pguyot@kallisys.net
PostgreSQL version: 11.11
Operating system: qemu-arm-static chrooted raspios inside ubuntu
Description:

Within a GitHub Actions workflow, a qemu chrooted environment is created from a RaspiOS lite image, within which the latest available postgresql is installed from apt (postgresql 11.11). Then tests of embedded software are executed, which include creating a postgresql database and performing a few benign operations (as far as PostgreSQL is concerned). Tests run perfectly fine in a desktop-like environment as well as on real devices.

Within this qemu context, randomly yet quite frequently, postgresql PANICs. The latest log was the following:

2021-05-02 09:22:21.591 BST [15024] PANIC: stuck spinlock detected at LWLockWaitListLock, /build/postgresql-11-rRyn74/postgresql-11-11.11/build/../src/backend/storage/lmgr/lwlock.c:832
qemu: uncaught target signal 6 (Aborted) - core dumped
2021-05-02 09:22:21.597 BST [15022] PANIC: stuck spinlock detected at LWLockWaitListLock, /build/postgresql-11-rRyn74/postgresql-11-11.11/build/../src/backend/storage/lmgr/lwlock.c:832
qemu: uncaught target signal 6 (Aborted) - core dumped
2021-05-02 09:22:21.762 BST [15423] pynab@test_pynab PANIC: stuck spinlock detected at LWLockWaitListLock, /build/postgresql-11-rRyn74/postgresql-11-11.11/build/../src/backend/storage/lmgr/lwlock.c:832
2021-05-02 09:22:21.762 BST [15423] pynab@test_pynab STATEMENT: SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE "django_content_type"."app_label" = 'auth'
qemu: uncaught target signal 6 (Aborted) - core dumped
2021-05-02 09:22:24.481 BST [15011] LOG: server process (PID 15423) was terminated by signal 6: Aborted
2021-05-02 09:22:24.481 BST [15011] DETAIL: Failed process was running: SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE "django_content_type"."app_label" = 'auth'
2021-05-02 09:22:24.481 BST [15011] LOG: terminating any other active server processes
2021-05-02 09:22:24.567 BST [15011] LOG: all server processes terminated; reinitializing
2021-05-02 09:22:24.601 BST [15512] LOG: database system was interrupted; last known up at 2021-05-02 09:18:11 BST
2021-05-02 09:22:24.692 BST [15512] LOG: database system was not properly shut down; automatic recovery in progress
2021-05-02 09:22:24.699 BST [15512] LOG: redo starts at 0/171E170
2021-05-02 09:22:25.045 BST [15512] LOG: invalid record length at 0/1957948: wanted 24, got 0
2021-05-02 09:22:25.046 BST [15512] LOG: redo done at 0/1957910
2021-05-02 09:22:25.048 BST [15512] LOG: last completed transaction was at log time 2021-05-02 09:20:04.917746+01
2021-05-02 09:22:25.096 BST [15011] LOG: database system is ready to accept connections

The log is publicly available here:
https://github.com/pguyot/pynab/runs/2485660214?check_suite_focus=true

Notice how sluggish the test is compared to when PostgreSQL doesn't PANIC, with the same environment. For example, this run completed without problems in under 20 minutes:
https://github.com/pguyot/pynab/runs/2483559259?check_suite_focus=true

I tried to update the CI script to upload the full raspbian image in case of panics, to get my hands on the core dump, but it's so sluggish that I'm not sure it won't eventually time out. I wonder whether this sluggishness might be a cause of the PANIC.

Could you please advise on how to investigate this crash further?
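
As a possible first step for the CI side of this, the server log could be scanned for PANIC lines so that the (slow) image upload is only attempted when a panic actually occurred. The following is a minimal sketch, assuming the log line layout shown above (timestamp, PID in square brackets, then the severity). The function name `summarize_panics` and the regular expression are illustrative assumptions, not part of PostgreSQL or of the test suite.

```python
import re
from collections import Counter

# Assumed line layout, taken from the log excerpt in this report:
#   <timestamp> [<pid>] [user@db] PANIC: <message>, <file>:<line>
PANIC_RE = re.compile(
    r"\[(?P<pid>\d+)\].*?PANIC:\s+(?P<msg>.*?)(?:,\s+(?P<loc>\S+:\d+))?\s*$"
)


def summarize_panics(log_text):
    """Return (Counter of PANIC messages, list of PIDs that panicked)."""
    messages = Counter()
    pids = []
    for line in log_text.splitlines():
        match = PANIC_RE.search(line)
        if match is None:
            continue  # STATEMENT, LOG, DETAIL and qemu lines are skipped
        pids.append(int(match.group("pid")))
        messages[match.group("msg")] += 1
    return messages, pids
```

A CI step could read the server log file (its location depends on the installation), call `summarize_panics` on its contents, and trigger the image upload only if the returned PID list is non-empty.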