Recursive use of syscaches (was: relation ### modified while in use) - Mailing list pgsql-hackers

From	Tom Lane
Subject	Recursive use of syscaches (was: relation ### modified while in use)
Date	November 9, 2000 13:52:04
Msg-id	15452.973795877@sss.pgh.pa.us
In response to	RE: relation ### modified while in use ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
Responses	Re: Recursive use of syscaches (was: relation ### modified while in use) Re: Recursive use of syscaches (was: relation ### modified while in use)
List	pgsql-hackers

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>> Does this occur after a prior error message?  I have been suspicious
>> because there isn't a mechanism to clear the syscache-busy flags during
>> xact abort.

> I don't know if I've seen the cases you pointed out.
> I have the following gdb back trace. Obviously it calls
> SearchSysCache() for cacheId 10 twice. I was able
> to get another gdb back trace but discarded it by
> mistake.  Though I've added pause() just after detecting
> recursive use of cache, backends unfortunately continue
> execution in most cases.
> I've not examined the backtrace yet. But don't we have
> to nail more system relation descriptors than we do now?

I don't think that's the solution; nailing more descriptors than we
absolutely must is not a pretty approach, and I don't think it solves
this problem anyway.  Your example demonstrates that recursive use
of a syscache is perfectly possible when a cache inval message arrives
just as we are about to search for a syscache entry.  Consider
the following path:

1. We are doing index_open and ensuing relcache entry load for some user
index.  In the middle of this, we need to fetch a not-currently-cached
pg_amop entry that is referenced by the index.

2. As we open pg_amop, we receive an SI message for some other user
index that is referenced in the current query and so currently has
positive refcnt.  We therefore attempt to rebuild that index's relcache
entry.

3. At this point we have recursive invocation of relcache load, which
may well lead to a recursive attempt to fetch the very same pg_amop
entry that the outer relcache load is trying to fetch.

Therefore, the current error test of checking for re-entrant lookups in
the same syscache is bogus.  It would still be bogus even if we refined
it to notice whether the exact same entry is being sought.
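
The path above can be illustrated with a small stand-alone sketch. It is a toy model, not PostgreSQL code: every name in it (cache_busy, open_catalog, search_cache, load_index_relcache, simulate_lookup) is hypothetical, and it assumes only what the three steps describe, namely that opening a catalog is a point where a pending SI message can be processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real functions. */
enum { AMOP_CACHE = 0, NUM_CACHES = 1 };

static bool cache_busy[NUM_CACHES];  /* the "recursive use" guard flags */
static bool inval_pending;           /* an SI message is waiting */
static int  false_alarms;            /* times the guard fired */

static void load_index_relcache(const char *index_name);

/* Assumption: opening a catalog is where pending SI messages are handled. */
static void open_catalog(void)
{
    if (inval_pending) {
        inval_pending = false;
        /* Step 2: rebuild the entry of another index that is in use. */
        load_index_relcache("other_index");
    }
}

static void search_cache(int cache_id)
{
    assert(cache_id >= 0 && cache_id < NUM_CACHES);
    if (cache_busy[cache_id]) {
        /* Step 3: the guard would raise an error here, yet the nested
         * lookup is entirely legitimate. */
        false_alarms++;
        return;
    }
    cache_busy[cache_id] = true;
    open_catalog();                  /* cache miss: must read the catalog */
    cache_busy[cache_id] = false;
}

/* Step 1: loading an index's relcache entry needs a pg_amop-style lookup. */
static void load_index_relcache(const char *index_name)
{
    printf("loading relcache entry for %s\n", index_name);
    search_cache(AMOP_CACHE);
}

/* Returns how many times the guard fired during one outer load. */
int simulate_lookup(bool inval_arrives)
{
    false_alarms = 0;
    cache_busy[AMOP_CACHE] = false;
    inval_pending = inval_arrives;
    load_index_relcache("user_index");
    return false_alarms;
}
```

In this model, simulate_lookup(true) trips the guard once even though nothing is circular, while simulate_lookup(false) never trips it. Comparing lookup keys would not help, since both levels seek the same entry.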

On top of that, we have the issue I was concerned about that there is
no mechanism for clearing the cache-busy flags during xact abort.
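
A second toy sketch shows why a busy flag left set across an abort is a problem. It assumes that error recovery unwinds the stack with a non-local jump, so the code that would normally clear the flag never runs. All names here (abort_target, raise_error, run_transaction, reset_cache_flags) are hypothetical, and reset_cache_flags stands for the kind of cleanup hook that is missing.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf abort_target;  /* hypothetical: where an error unwinds to */
static bool cache_busy;       /* guard flag for a single toy cache */

static void raise_error(void)
{
    longjmp(abort_target, 1); /* skips every pending cleanup on the stack */
}

/* Returns true if the lookup ran, false if the guard rejected it. */
static bool search_cache(bool fail_midway)
{
    if (cache_busy)
        return false;         /* reported as "recursive use" */
    cache_busy = true;
    if (fail_midway)
        raise_error();        /* never returns; flag stays set */
    cache_busy = false;
    return true;
}

/* Runs one toy transaction: 1 = ok, 0 = guard rejected, -1 = aborted. */
int run_transaction(bool fail_midway)
{
    if (setjmp(abort_target) != 0)
        return -1;            /* abort path: nothing resets cache_busy */
    return search_cache(fail_midway) ? 1 : 0;
}

/* Hypothetical abort-time cleanup of the kind that is missing. */
void reset_cache_flags(void)
{
    cache_busy = false;
}
```

Calling run_transaction(true) and then run_transaction(false) makes the second, perfectly ordinary call fail the guard; only after reset_cache_flags() does a lookup succeed again.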

Rather than trying to fix this stuff, I propose that we simply remove
the test for recursive use of a syscache.  AFAICS it will never catch
any real bugs in production.  It might catch bugs in development (i.e.,
someone messes up the startup sequence in a way that causes a truly
circular cache lookup) but I think a stack overflow crash is a
perfectly OK result then.
        regards, tom lane
