backend for database 'A' crashes/is killed -> corrupt index in database 'B' - Mailing list pgsql-bugs
| From | Jon Nelson |
| --- | --- |
| Subject | backend for database 'A' crashes/is killed -> corrupt index in database 'B' |
| Date | |
| Msg-id | AANLkTikAUX0fCNHjgoH6rDxT-OydK9Cj++o1CiL0mAm7@mail.gmail.com |
| Responses | Re: backend for database 'A' crashes/is killed -> corrupt index in database 'B' |
| List | pgsql-bugs |
Please bear with me, as it's taken me a while to be able to reproduce an issue that is worrisome.

Before I describe the problem: I am using PostgreSQL version 8.4.5 on CentOS 5.5 (x86_64). The short version is that if a PostgreSQL backend is killed (by the Linux OOM handler, kill -9, etc.) while operations are taking place in a *different* backend, corruption is introduced in the database that other backend is working on. I don't want to say it happens 100% of the time, but it happens every time I test.

The differences from a stock pg config are primarily to do with logging and such. The other settings (besides logging and such) that differ from the stock values include:

- wal_sync_method -- I've tried leaving it as the default, and I've also tried datasync -- same results.
- wal_buffers = 128MB
- effective_cache_size = 4GB
- work_mem = 1MB
- maintenance_work_mem = 1GB

Here is how I am reproducing the problem:

1. Open a psql connection to database A. It may remain idle.
2. Wait for an automated process to connect to database B and start operations. These operations
3. kill -9 the backend for the psql connection to database A.

Then I observe all the backends shutting down and PostgreSQL entering recovery mode, which succeeds. Subsequent operations on other databases appear fine, but not for database B: an index on one of the tables in database B is corrupted. It is always the same index.

```
2011-03-30 14:51:32 UTC LOG: server process (PID 3871) was terminated by signal 9: Killed
2011-03-30 14:51:32 UTC LOG: terminating any other active server processes
2011-03-30 14:51:32 UTC WARNING: terminating connection because of crash of another server process
2011-03-30 14:51:32 UTC DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-03-30 14:51:32 UTC HINT: In a moment you should be able to reconnect to the database and repeat your command.
2011-03-30 14:51:32 UTC databaseB databaseB WARNING: terminating connection because of crash of another server process
2011-03-30 14:51:32 UTC databaseB databaseB DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-03-30 14:51:32 UTC databaseB databaseB HINT: In a moment you should be able to reconnect to the database and repeat your command.
2011-03-30 14:51:32 UTC LOG: all server processes terminated; reinitializing
2011-03-30 14:51:32 UTC LOG: database system was interrupted; last known up at 2011-03-30 14:46:50 UTC
2011-03-30 14:51:32 UTC databaseB databaseB FATAL: the database system is in recovery mode
2011-03-30 14:51:32 UTC LOG: database system was not properly shut down; automatic recovery in progress
2011-03-30 14:51:32 UTC LOG: redo starts at 301/1D328E40
2011-03-30 14:51:33 UTC databaseB databaseB FATAL: the database system is in recovery mode
2011-03-30 14:51:34 UTC LOG: record with zero length at 301/1EA08608
2011-03-30 14:51:34 UTC LOG: redo done at 301/1EA08558
2011-03-30 14:51:34 UTC LOG: last completed transaction was at log time 2011-03-30 14:51:31.257997+00
2011-03-30 14:51:37 UTC LOG: autovacuum launcher started
2011-03-30 14:51:37 UTC LOG: database system is ready to accept connections
2011-03-30 14:52:05 UTC databaseB databaseB ERROR: index "<elided>" contains unexpected zero page at block 0
2011-03-30 14:52:05 UTC databaseB databaseB HINT: Please REINDEX it.
```
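To make the kill step concrete, it amounts to roughly the sketch below. The database names are the placeholders from above, the automated writer against database B is assumed to already be running, and the index name in the final comment is just a stand-in for the elided one:

```bash
# Terminal 1: open an idle psql session against database A
# (assumed to be the only connection to that database).
psql databaseA

# Terminal 2: while the automated process is writing to database B,
# look up the backend PID serving the idle databaseA session
# (on 8.4 the column is procpid; 9.2+ renamed it to pid) and kill -9 it.
pid=$(psql -At -c "SELECT procpid FROM pg_stat_activity WHERE datname = 'databaseA';" postgres)
kill -9 "$pid"

# After crash recovery, inserts into the affected table in database B fail
# with "index ... contains unexpected zero page at block 0"; per the HINT,
# the index (name elided above) would then need something like:
#   psql -c "REINDEX INDEX the_affected_index;" databaseB
```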
What's more, I can execute a 'DELETE from tableB' (where tableB is the table with the troublesome index) without error, but when I try to *insert*, that is when I get the problem. The index is a standard btree index. The DELETE statement has no WHERE clause.

Other information: the following variables remain commented out and thus retain their defaults (confirmed with 'show all'):

- fsync is on.
- synchronous_commit is on.
- full_page_writes is on.
- zero_damaged_pages is *off*.

The underlying storage is a software RAID1.

'test_fsync' from PostgreSQL 9.0.1 reports:

```
Loops = 10000

Simple write:
        8k write                       157205.515/second

Compare file sync methods using one write:
        (open_datasync unavailable)
        open_sync 8k write               1357.928/second
        8k write, fdatasync              3872.508/second
        8k write, fsync                  3496.659/second

Compare file sync methods using two writes:
        (open_datasync unavailable)
        2 open_sync 8k writes            1928.987/second
        8k write, 8k write, fdatasync    2020.884/second
        8k write, 8k write, fsync        2411.653/second

Compare open_sync with different sizes:
        open_sync 16k write              3298.716/second
        2 open_sync 8k writes            1228.599/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        8k write, fsync, close           2535.765/second
        8k write, close, fsync           2162.060/second
```

My interpretation of these values is that the drives themselves have their write caches disabled.

Lastly, the problem has been reproduced on at least three totally different machines.

--
Jon
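P.S. In case it helps, the settings above can be pulled out with one query rather than scanning 'show all', and the write-cache state of the RAID1 member disks can be read directly. A rough sketch (sda/sdb are only example device names):

```bash
# List just the durability-related settings (all at their defaults here).
psql -c "SELECT name, setting FROM pg_settings
          WHERE name IN ('fsync', 'synchronous_commit',
                         'full_page_writes', 'zero_damaged_pages');" databaseB

# Report (not change) the drive write-cache setting; needs root, and
# sda/sdb stand in for the software RAID1 member disks.
hdparm -W /dev/sda /dev/sdb
```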