Weird Assert failure in GetLockStatusData() - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Weird Assert failure in GetLockStatusData() |
Date | |
Msg-id | 8053.1357659565@sss.pgh.pa.us |
List | pgsql-hackers |
This is a bit disturbing:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bushpig&dt=2013-01-07%2019%3A15%3A02

The key bit is

[50eb2156.651e:6] LOG: execute isolationtester_waiting: SELECT 1 FROM pg_locks holder, pg_locks waiter WHERE NOT waiter.granted AND waiter.pid = $1 AND holder.granted AND holder.pid <> $1 AND holder.pid IN (25887, 25888, 25889) AND holder.mode = ANY (CASE waiter.mode WHEN 'AccessShareLock' THEN ARRAY['AccessExclusiveLock'] WHEN 'RowShareLock' THEN ARRAY['ExclusiveLock','AccessExclusiveLock'] WHEN 'RowExclusiveLock' THEN ARRAY['ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ShareUpdateExclusiveLock' THEN ARRAY['ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ShareLock' THEN ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ShareRowExclusiveLock' THEN ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ExclusiveLock' THEN ARRAY['RowShareLock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'AccessExclusiveLock' THEN ARRAY['AccessShareLock','RowShareLock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] END) AND holder.locktype IS NOT DISTINCT FROM waiter.locktype AND holder.database IS NOT DISTINCT FROM waiter.database AND holder.relation IS NOT DISTINCT FROM waiter.relation AND holder.page IS NOT DISTINCT FROM waiter.page AND holder.tuple IS NOT DISTINCT FROM waiter.tuple AND holder.virtualxid IS NOT DISTINCT FROM waiter.virtualxid AND holder.transactionid IS NOT DISTINCT FROM waiter.transactionid AND holder.classid IS NOT DISTINCT FROM waiter.classid AND holder.objid IS NOT DISTINCT FROM waiter.objid AND holder.objsubid IS NOT DISTINCT FROM waiter.objsubid

[50eb2156.651e:7] DETAIL: parameters: $1 = '25889'

TRAP: FailedAssertion("!(el == data->nelements)", File: "lock.c", Line: 3398)

[50eb2103.62ee:2] LOG: server process (PID 25886) was terminated by signal 6: Aborted

[50eb2103.62ee:3] DETAIL: Failed process was running: SELECT 1 FROM pg_locks holder, pg_locks waiter WHERE NOT waiter.granted AND waiter.pid = $1 AND holder.granted AND holder.pid <> $1 AND holder.pid IN (25887, 25888, 25889) AND holder.mode = ANY (CASE waiter.mode WHEN 'AccessShareLock' THEN ARRAY['AccessExclusiveLock'] WHEN 'RowShareLock' THEN ARRAY['ExclusiveLock','AccessExclusiveLock'] WHEN 'RowExclusiveLock' THEN ARRAY['ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ShareUpdateExclusiveLock' THEN ARRAY['ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ShareLock' THEN ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ShareRowExclusiveLock' THEN ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ExclusiveLock' THEN ARRAY['RowShareLock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','E

The assertion failure seems to indicate that the number of LockMethodProcLockHash entries found by hash_seq_search didn't match the number that had been counted by hash_get_num_entries immediately before that. I don't see any bug in GetLockStatusData itself, so this suggests that there's something wrong with dynahash's entry counting, or that somebody somewhere is modifying the shared hash table without holding the appropriate lock. The latter seems a bit more likely, given that this must be a very low-probability bug or we'd have seen it before. An overlooked locking requirement in a seldom-taken code path would fit the symptoms.

Or maybe bushpig just had some weird cosmic-ray hardware failure, but I don't put a lot of faith in such explanations.

Thoughts?

regards, tom lane
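
For readers unfamiliar with the pattern, the invariant behind the failed assertion can be sketched with a toy table. This is only an illustration under stated assumptions, not the code in lock.c: ToyTable, toy_insert, toy_scan, toy_snapshot_consistent and toy_rogue_insert are all hypothetical names, and the "between" hook merely simulates a writer that changes the table without holding the lock the reader relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 8          /* hypothetical fixed size, illustration only */

/* Toy stand-in for a shared hash table: slots plus a cached entry count. */
typedef struct ToyTable
{
    bool used[TOY_CAPACITY];
    int  nentries;              /* what a "number of entries" call reports */
} ToyTable;

/* Add an entry and keep the cached count in step with the slots. */
void toy_insert(ToyTable *t, int slot)
{
    if (slot >= 0 && slot < TOY_CAPACITY && !t->used[slot])
    {
        t->used[slot] = true;
        t->nentries++;
    }
}

/* Walk every occupied slot, as a sequential scan of the table would. */
int toy_scan(const ToyTable *t)
{
    int el = 0;

    for (int i = 0; i < TOY_CAPACITY; i++)
        if (t->used[i])
            el++;
    return el;
}

/* Hypothetical hook type: a change made between the count and the scan. */
typedef void (*ToyHook)(ToyTable *);

/*
 * Count first, then scan, and report whether the two numbers agree.
 * The comparison mirrors the shape of the condition in the failed Assert.
 */
bool toy_snapshot_consistent(ToyTable *t, ToyHook between)
{
    int nelements = t->nentries;    /* step 1: read the count */

    if (between != NULL)
        between(t);                 /* simulated unlocked modification */

    int el = toy_scan(t);           /* step 2: walk the entries */

    return el == nelements;
}

/* Simulated writer that sneaks in one extra entry. */
void toy_rogue_insert(ToyTable *t)
{
    toy_insert(t, 5);
}
```

With two entries inserted, toy_snapshot_consistent(&t, NULL) holds, while passing toy_rogue_insert as the hook makes the scan see one more entry than was counted. That is the kind of mismatch the second hypothesis above would produce.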