Re: Hash Indexes - Mailing list pgsql-hackers
From:           Amit Kapila
Subject:        Re: Hash Indexes
Date:
Msg-id:         CAA4eK1+JM+ffHgUfW8wf+Lgn2eJ1fGjyn6b_L5m0fiTEj2_6Pw@mail.gmail.com
In response to: Re: Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses:      Re: Hash Indexes
                Re: Hash Indexes
List:           pgsql-hackers
On Thu, Sep 29, 2016 at 8:27 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Sep 29, 2016 at 6:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Sep 28, 2016 at 3:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> As I was looking at the old text regarding deadlock risk, I realized
>> what may be a serious problem.  Suppose process A is performing a scan
>> of some hash index.  While the scan is suspended, it attempts to take
>> a lock and is blocked by process B.  Process B, meanwhile, is running
>> VACUUM on that hash index.  Eventually, it will do
>> LockBufferForCleanup() on the hash bucket on which process A holds a
>> buffer pin, resulting in an undetected deadlock.  In the current
>> coding, A would hold a heavyweight lock and B would attempt to acquire
>> a conflicting heavyweight lock, and the deadlock detector would kill
>> one of them.  This patch probably breaks that.  I notice that that's
>> the only place where we attempt to acquire a buffer cleanup lock
>> unconditionally; every place else, we acquire the lock conditionally,
>> so there's no deadlock risk.  Once we resolve this problem, the
>> paragraph about deadlock risk in this section should be revised to
>> explain whatever solution we come up with.
>>
>> By the way, since VACUUM must run in its own transaction, B can't be
>> holding arbitrary locks, but that doesn't seem quite sufficient to get
>> us out of the woods.  It will at least hold ShareUpdateExclusiveLock
>> on the relation being vacuumed, and process A could attempt to acquire
>> that same lock.
>>
>
> Right, I think there is a danger of deadlock in the above situation.
> Needs some more thought.
>

I think one way to avoid the risk of deadlock in the above scenario is
to take the cleanup lock conditionally: if we get the cleanup lock,
then we delete the items as the patch does now; otherwise, we just mark
the tuples as dead and ensure that we don't try to remove tuples that
are moved-by-split.  Now, I think the question is how these dead tuples
will be removed.  We need a separate mechanism to clear dead tuples for
hash indexes anyway, because during scans we mark tuples as dead when
the corresponding heap tuple is dead, and those tuples are not removed
later.  The btree code already handles this via the kill_prior_tuple
optimization.  So I think clearing of dead tuples can be handled by a
separate patch.

I have also thought about using the page-scan-at-a-time idea which has
been discussed upthread [1], but I think we can't completely eliminate
the need to out-wait scans (the cleanup lock requirement) for scans
that are started while a split is in progress, or for non-MVCC scans,
as described in that e-mail [1].  We might be able to find some way to
solve the problem with this approach, but I think it will be slightly
complicated, and much more work is required compared to the previous
approach.

What is your preference among the above approaches to resolve this
problem?  Or let me know if you have a better idea to solve it.

[1] - https://www.postgresql.org/message-id/CAA4eK1Jj1UqneTXrywr%3DGg87vgmnMma87LuscN_r3hKaHd%3DL2g%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
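
[Editor's note: for readers following along, below is a minimal sketch
of the conditional-cleanup-lock idea proposed above, written against
PostgreSQL's buffer-manager API.  ConditionalLockBufferForCleanup(),
PageIndexMultiDelete(), and ItemIdMarkDead() are real PostgreSQL
facilities; the function hash_vacuum_bucket() and its parameters are
hypothetical names invented for illustration, and this is not the
actual patch.  WAL logging is omitted for brevity.]

    /*
     * Sketch only (hypothetical names, not the actual patch).
     * Illustrates taking the bucket's cleanup lock conditionally during
     * hash-index vacuum: if it is granted, tuples are physically
     * removed; otherwise they are merely flagged dead so that a later
     * pass (e.g. the separate dead-tuple-cleanup patch proposed above)
     * can reclaim them.  This avoids the unconditional
     * LockBufferForCleanup() wait that creates the undetected-deadlock
     * risk described upthread.
     */
    #include "postgres.h"
    #include "storage/bufmgr.h"
    #include "storage/bufpage.h"

    static void
    hash_vacuum_bucket(Buffer bucket_buf,
                       OffsetNumber *deletable, int ndeletable)
    {
        if (ConditionalLockBufferForCleanup(bucket_buf))
        {
            /*
             * Cleanup lock acquired without waiting: no other backend
             * holds a pin on this buffer, so it is safe to physically
             * delete the index tuples and compact the page, as the
             * patch currently does.
             */
            Page    page = BufferGetPage(bucket_buf);

            PageIndexMultiDelete(page, deletable, ndeletable);
            MarkBufferDirty(bucket_buf);
            LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK);
        }
        else
        {
            /*
             * Couldn't get the cleanup lock without waiting, and
             * waiting risks the deadlock described above.  Instead,
             * take an ordinary exclusive lock and only mark the tuples
             * dead; their space is reclaimed later.  (The real fix
             * would also need to skip tuples that are moved-by-split,
             * per the proposal above.)
             */
            Page    page;
            int     i;

            LockBuffer(bucket_buf, BUFFER_LOCK_EXCLUSIVE);
            page = BufferGetPage(bucket_buf);

            for (i = 0; i < ndeletable; i++)
            {
                ItemId  itemid = PageGetItemId(page, deletable[i]);

                ItemIdMarkDead(itemid);
            }
            MarkBufferDirty(bucket_buf);
            LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK);
        }
    }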