Re: Adding doubly linked list type which stores the number of items in the list - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: Adding doubly linked list type which stores the number of items in the list |
Date | |
Msg-id | CALj2ACWud64Jt0GzigzHRHnptmdr8__539BfvQds6uojxhT5Sw@mail.gmail.com Whole thread Raw |
In response to | Re: Adding doubly linked list type which stores the number of items in the list (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: Adding doubly linked list type which stores the number of items in the list
|
List | pgsql-hackers |
On Mon, Oct 31, 2022 at 12:44 PM David Rowley <dgrowleyml@gmail.com> wrote: > > On Mon, 31 Oct 2022 at 19:05, Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > So, when an overflow occurs, the head->count wraps around after > > PG_UINT32_MAX, meaning, becomes 0 and we will catch it in an assert > > build. This looks reasonable to me. However, the responsibility lies > > with the developers to deal with such overflows. > > I'm not sure what the alternatives are here. One of the advantages of > dlist over List is that there are no memory allocations to add a node > which is already allocated onto a list. lappend() might need to > enlarge the list, which means you can't do that in a critical section. Using uint64 is one option to allow many elements, however, I'm also fine with removing the overflow assertions Assert(head->count > 0); /* count overflow check */ altogether and let the callers take the responsibility. dlist_* and dclist_* callers already have another responsibility - Caution: 'foo' must be a member of 'head'. > It's currently OK to add an item to a dlist in a critical section, > however, if we add an elog(ERROR) then it won't be. dlist_check() and dlist_member_check() have an elog(ERROR) and the above statement isn't true in case of ILIST_DEBUG-defined builds. > The best I think > we can do is to just let the calling code ensure that it only uses > dlist when it's certain that there can't be more than 2^32 items to > store at once. Right. + * able to store a maximum of PG_UINT32_MAX elements. It is up to the caller + * to ensure no more than this many items are added to a dclist. The above comment seems fine to me, if we really want to enforce any overflow checks on non-debug, non-assert builds, it might add some costs to dclist_* functions. > > I'm wondering if adding count to dlist_head and maintaining it as part > > of the existing dlist_* data structure and functions is any better > > than introducing dclist_*? In that case, we need only one function, > > dlist_count, no? Or do we choose to go dclist_* because we want to > > avoid the extra cost of maintaining count within dlist_*? If yes, is > > maintaining count in dlist_* really costly? > > I have a few reasons for not wanting to do that: > > 1) I think dlist operations are very fast at the moment. The fact > that the functions are static inline tells me the function call > overhead matters. Therefore, it's likely maintaining a count also > matters. > 2) Code bloat. The functions are static inline. That means all > compiled code that adds or removes an item from a dlist would end up > larger. That results in more instruction cache misses. This seems a fair point to me. > 3) I've no reason to believe that all call sites that do > dlist_delete() have the ability to know which list the node is on. > Just look at ReorderBufferAssignChild(). I decided to not convert the > subtxns dlist into a dclist as the subtransaction sometimes seems to > go onto the top transaction list before it's moved to the sub-txn's > list. > 4) There's very little or no scope for bugs in dclist relating to the > dlist implementation as all that stuff is done by just calling the > dlist_* functions. The only scope is really that it could call the > wrong dlist_* function. It does not seem terribly hard to ensure we > don't write any bugs like that. Right. > > Thanks. The v3 patch looks good to me. > > Great. Thanks for having a look. I will take another look at v3 tomorrow and probably mark it RfC. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: