BUG #18611: Postgres service crashes continuously in loop of reinitialization if disk partition is full - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18611: Postgres service crashes continuously in loop of reinitialization if disk partition is full
Msg-id 18611-6fc1683655d2ce53@postgresql.org
Responses Re: BUG #18611: Postgres service crashes continuously in loop of reinitialization if disk partition is full
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18611
Logged by:          Bhupendra Patel
Email address:      bhavin.ec50@gmail.com
PostgreSQL version: 16.2
Operating system:   Linux
Description:

We have observed that the Postgres service enters a re-initialization loop
when the disk partition is full.
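
To confirm the disk-full condition on another system, a minimal sketch of a
free-space check for the partition holding the data directory could look
like the following. The function name has_free_space, the path, and the
threshold are hypothetical and are not part of PostgreSQL; only
shutil.disk_usage comes from the Python standard library.

```python
import shutil


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has at least
    `min_free_bytes` bytes available.

    `path` and `min_free_bytes` are illustrative parameters, for example
    the PostgreSQL data directory and a safety margin chosen by the operator.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes
```

It could be called as has_free_space("/path/to/data/directory", 1024 ** 3)
to check for 1 GiB of headroom, where the path is a placeholder.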

The Postgres log file shows the following. The checkpointer process
crashed, which caused the server to re-initialize. Each time, recovery
completed and the server accepted connections, but the checkpointer then
crashed again with the same error, producing a loop of re-initialization.

2 2024-09-10 14:07:14.978 GMTPANIC:  could not write to file
"pg_logical/replorigin_checkpoint.tmp": No space left on device
1 2024-09-10 14:07:15.136 GMTLOG:  checkpointer process (PID 2) was
terminated by signal 6: Aborted
1 2024-09-10 14:07:15.136 GMTLOG:  terminating any other active server
processes
1 2024-09-10 14:07:15.137 GMTLOG:  all server processes terminated;
reinitializing
21 2024-09-10 14:07:15.152 GMTLOG:  database system was interrupted; last
known up at 2024-09-10 14:04:22 GMT
21 2024-09-10 14:07:15.157 GMTLOG:  database system was not properly shut
down; automatic recovery in progress
21 2024-09-10 14:07:15.171 GMTLOG:  redo starts at 0/12F164E0
21 2024-09-10 14:07:15.174 GMTLOG:  invalid record length at 0/12F40598:
expected at least 24, got 0
21 2024-09-10 14:07:15.174 GMTLOG:  redo done at 0/12F40560 system usage:
CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.01 s
1 2024-09-10 14:07:15.188 GMTLOG:  database system is ready to accept
connections
22 2024-09-10 14:08:15.247 GMTPANIC:  could not write to file
"pg_logical/replorigin_checkpoint.tmp": No space left on device
1 2024-09-10 14:08:15.379 GMTLOG:  checkpointer process (PID 22) was
terminated by signal 6: Aborted
1 2024-09-10 14:08:15.379 GMTLOG:  terminating any other active server
processes
1 2024-09-10 14:08:15.379 GMTLOG:  all server processes terminated;
reinitializing
29 2024-09-10 14:08:15.399 GMTLOG:  database system was interrupted; last
known up at 2024-09-10 14:07:15 GMT
29 2024-09-10 14:08:15.407 GMTLOG:  database system was not properly shut
down; automatic recovery in progress
29 2024-09-10 14:08:15.410 GMTLOG:  redo starts at 0/12F40610
29 2024-09-10 14:08:15.410 GMTLOG:  invalid record length at 0/12F41918:
expected at least 24, got 0
29 2024-09-10 14:08:15.410 GMTLOG:  redo done at 0/12F418E0 system usage:
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
1 2024-09-10 14:08:15.428 GMTLOG:  database system is ready to accept
connections
30 2024-09-10 14:09:15.485 GMTPANIC:  could not write to file
"pg_logical/replorigin_checkpoint.tmp": No space left on device
1 2024-09-10 14:09:15.624 GMTLOG:  checkpointer process (PID 30) was
terminated by signal 6: Aborted
1 2024-09-10 14:09:15.624 GMTLOG:  terminating any other active server
processes
1 2024-09-10 14:09:15.624 GMTLOG:  all server processes terminated;
reinitializing
37 2024-09-10 14:09:15.642 GMTLOG:  database system was interrupted; last
known up at 2024-09-10 14:08:15 GMT
37 2024-09-10 14:09:15.647 GMTLOG:  database system was not properly shut
down; automatic recovery in progress
37 2024-09-10 14:09:15.649 GMTLOG:  redo starts at 0/12F41990
37 2024-09-10 14:09:15.649 GMTLOG:  invalid record length at 0/12F42D28:
expected at least 24, got 0
37 2024-09-10 14:09:15.649 GMTLOG:  redo done at 0/12F42CF0 system usage:
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
1 2024-09-10 14:09:15.659 GMTLOG:  database system is ready to accept
connections
38 2024-09-10 14:10:15.717 GMTPANIC:  could not write to file
"pg_logical/replorigin_checkpoint.tmp": No space left on device
1 2024-09-10 14:10:15.912 GMTLOG:  checkpointer process (PID 38) was
terminated by signal 6: Aborted
1 2024-09-10 14:10:15.912 GMTLOG:  terminating any other active server
processes
1 2024-09-10 14:10:15.913 GMTLOG:  all server processes terminated;
reinitializing
44 2024-09-10 14:10:15.945 GMTLOG:  database system was interrupted; last
known up at 2024-09-10 14:09:15 GMT
44 2024-09-10 14:10:15.961 GMTLOG:  database system was not properly shut
down; automatic recovery in progress
44 2024-09-10 14:10:15.968 GMTLOG:  redo starts at 0/12F42DA0
44 2024-09-10 14:10:15.968 GMTLOG:  invalid record length at 0/12F44140:
expected at least 24, got 0
44 2024-09-10 14:10:15.968 GMTLOG:  redo done at 0/12F44108 system usage:
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
1 2024-09-10 14:10:16.006 GMTLOG:  database system is ready to accept
connections
45 2024-09-10 14:12:16.098 GMTPANIC:  could not write to file
"pg_logical/replorigin_checkpoint.tmp": No space left on device
1 2024-09-10 14:12:16.280 GMTLOG:  checkpointer process (PID 45) was
terminated by signal 6: Aborted
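
To show how regularly the crash recurs, the PANIC timestamps can be pulled
out of the log and the gaps between them computed. This is a minimal sketch
that assumes the line prefix seen in the log above (PID, timestamp, then
"GMT" fused to the severity); the names panic_times and panic_gaps are
hypothetical helpers, not PostgreSQL tooling.

```python
import re
from datetime import datetime

# Timestamp that immediately precedes a PANIC entry, in the prefix format
# used by the log above. A different log_line_prefix would need a new pattern.
PANIC_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) GMTPANIC:"
)


def panic_times(log_text):
    """Return the timestamps of all PANIC entries, in log order."""
    return [
        datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        for stamp in PANIC_RE.findall(log_text)
    ]


def panic_gaps(log_text):
    """Return the seconds elapsed between consecutive PANIC entries."""
    times = panic_times(log_text)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

Applied to the log above, the gaps come out at roughly 60, 60, 60, and 120
seconds.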

Postgres version used: PostgreSQL 16.2
Configuration: the default configuration file is in use

Let me know if you require more details.

