Re: Hash Indexes - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Hash Indexes
Date	October 19, 2016 00:27:56
Msg-id	CAA4eK1JGQjWdfVtBhk8+bvvac-TjwMBOquWL0piSjheZ=JZEjw@mail.gmail.com
In response to	Re: Hash Indexes (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Hash Indexes Re: Hash Indexes
List	pgsql-hackers

On Tue, Oct 18, 2016 at 10:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Oct 18, 2016 at 5:37 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Wed, Oct 5, 2016 at 10:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Tue, Oct 4, 2016 at 12:36 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>> I think one way to avoid the risk of deadlock in the above scenario is to
>>>> take the cleanup lock conditionally.  If we get the cleanup lock, then
>>>> we delete the items as the patch does now; otherwise we
>>>> just mark the tuples as dead and ensure that we don't try to remove
>>>> tuples that are moved-by-split.  Now, I think the question is how
>>>> these dead tuples will be removed.  We need a separate mechanism to
>>>> clear dead tuples for hash indexes anyway, because during scans we mark
>>>> index tuples as dead if the corresponding heap tuple is dead, and these
>>>> are not removed later.  This is already taken care of in the btree code
>>>> via the kill_prior_tuple optimization.  So I think clearing of dead
>>>> tuples can be handled by a separate patch.
>>>
>>> That seems like it could work.
>>
>> I have implemented this idea and it works for MVCC scans.  However, I
>> think this might not work for non-MVCC scans.  Consider a case where,
>> in Process-1, a hash scan has returned one row, and before it can check
>> its validity in the heap, vacuum marks that tuple as dead and removes the
>> entry from the heap, and some new tuple is placed at that offset in
>> the heap.
>
> Oops, that's bad.
>
>> Now when Process-1 checks the validity in the heap, it will check
>> a different tuple than the one the index tuple was supposed to point to.
>> If we want, we can make it work similar to what btree does, as being
>> discussed on thread [1], but for that we need to introduce page-scan
>> mode in hash indexes as well.  However, do we really want to solve
>> this problem as part of this patch when it exists for other index AMs
>> as well?
>
> For what other index AM does this problem exist?
>

By "this problem" I mean deadlocks involving suspended scans, which can
happen in btree for non-MVCC scans or other types of scans where we don't
release the pin during the scan.  In my mind, we have the following options:

a. Tackle the problem of deadlocks for suspended scans in a
separate patch, as it exists for other indexes (at least for some types
of scans).
b. Implement page-scan mode, and then we won't have the deadlock problem
for MVCC scans.
c. Not care about non-MVCC scans unless we have some way to hit
those for hash indexes, and proceed with the dead-tuple-marking idea.  I
think even if we don't care about non-MVCC scans, we might hit this
problem (deadlocks) when the index relation is unlogged.
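
To make the dead-tuple-marking idea in (c) concrete, here is a rough toy
model of it, not the patch itself.  All types and function names are
hypothetical; in particular, I am assuming the cleanup lock can be modelled
as "vacuum holds the only pin", and that in the fallback path moved-by-split
entries are simply left untouched.  In PostgreSQL the non-blocking attempt
would presumably go through ConditionalLockBufferForCleanup(), which this
sketch does not call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not PostgreSQL's page or tuple layout. */
#define TOY_MAX_ITEMS 8

typedef struct {
    bool dead_in_heap;    /* the heap tuple this entry points to is dead */
    bool moved_by_split;  /* entry was relocated by a bucket split */
    bool marked_dead;     /* hint flag: scans may skip this entry */
    bool removed;         /* entry has been physically deleted */
} ToyIndexTuple;

typedef struct {
    int           pin_count;  /* pins held on the page, including vacuum's */
    size_t        nitems;
    ToyIndexTuple items[TOY_MAX_ITEMS];
} ToyBucketPage;

/* Assumed model of a conditional cleanup lock: it succeeds only when the
 * caller holds the sole pin, and it never waits for other pins to go away. */
bool toy_conditional_cleanup_lock(const ToyBucketPage *page)
{
    return page->pin_count == 1;
}

/* Returns the number of entries physically removed from the page. */
size_t toy_vacuum_bucket(ToyBucketPage *page)
{
    size_t removed = 0;
    bool   have_cleanup = toy_conditional_cleanup_lock(page);

    assert(page->nitems <= TOY_MAX_ITEMS);

    for (size_t i = 0; i < page->nitems; i++) {
        ToyIndexTuple *itup = &page->items[i];

        if (itup->removed || !itup->dead_in_heap)
            continue;               /* nothing to do for this entry */

        if (have_cleanup) {
            itup->removed = true;   /* no scan is pinned: safe to delete */
            removed++;
        } else if (!itup->moved_by_split) {
            itup->marked_dead = true;  /* defer removal, only leave a hint */
        }
        /* else: moved-by-split entry without cleanup lock, leave it alone */
    }
    return removed;
}
```

With pin_count == 1 the dead entries are deleted outright; with a second
pin (a suspended scan, say) nothing is deleted and only the entries that are
not moved-by-split get the dead hint.  How those hinted entries are later
reclaimed is the part that would go in the separate patch.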

Here, even if we want to go with (b), I think we can handle it in a
separate patch, unless you think otherwise.
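
For anyone following along, the non-MVCC hazard quoted above can be shown
with a small toy model.  This is only an illustration under my own
assumptions: the names are hypothetical, the heap is a fixed array of slots,
and the "scan" trusts the remembered offset without any snapshot check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not PostgreSQL structures. */
#define TOY_HEAP_SLOTS 4

typedef struct {
    int  key;   /* indexed column value stored in this slot */
    bool used;  /* slot currently holds a tuple */
} ToyHeapSlot;

typedef struct {
    ToyHeapSlot slots[TOY_HEAP_SLOTS];
} ToyHeapPage;

/* Insert into the first free slot; returns its offset, or -1 if full. */
int toy_heap_insert(ToyHeapPage *heap, int key)
{
    for (size_t i = 0; i < TOY_HEAP_SLOTS; i++) {
        if (!heap->slots[i].used) {
            heap->slots[i].key = key;
            heap->slots[i].used = true;
            return (int) i;
        }
    }
    return -1;
}

/* Vacuum removes the tuple and frees the slot for reuse. */
void toy_vacuum_remove(ToyHeapPage *heap, size_t off)
{
    assert(off < TOY_HEAP_SLOTS);
    heap->slots[off].used = false;
}

/* What a scan sees if it trusts only the offset: whichever tuple
 * occupies the slot now, not necessarily the one the index pointed to. */
bool toy_fetch_by_offset(const ToyHeapPage *heap, size_t off, int *key_out)
{
    assert(off < TOY_HEAP_SLOTS);
    if (!heap->slots[off].used)
        return false;
    *key_out = heap->slots[off].key;
    return true;
}
```

If a scan remembers the offset of a tuple with key 10, vacuum then frees
that slot, and a later insert of key 99 reuses it, the fetch by the
remembered offset succeeds and hands back 99.  An MVCC snapshot would not
see the new tuple, which is why the problem shows up only for non-MVCC
scans.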


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
