Re: Review: B-Tree emulation for GIN - Mailing list pgsql-hackers
From: Teodor Sigaev
Subject: Re: Review: B-Tree emulation for GIN
Msg-id: 49D731D4.9030608@sigaev.ru
In response to: Re: Review: B-Tree emulation for GIN (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
> * After this new look at the code I think that matchPartialInPendingList
> is completely broken. Surely it's an infinite loop if the
> comparePartial function returns -1. I also don't understand why it
> stops short of examining the tuple at maxoff, or for that matter why
> it's doing any searching at all when the code that's calling it seems
> to be implementing a binary search. I think that part of the code
> needs more testing and commenting.

Tuples of a heap row are stored in the pending list in the same order as
in the entry tree (ordered by (attrnum, Datum)), so it is possible to
use binary search. In my tests, binary search gives about a 5-15%
performance improvement. A heap row may be stored on one page or span
several pages. If a page contains several rows, each row is checked
separately, and the tuples of one heap row occupy the offsets from
pendingPosition->firstOffset up to pendingPosition->lastOffset. A page
is marked with the GIN_LIST_FULLROW flag if it contains all the tuples
of its row(s), or if it is the last page in a chain of pages holding a
single heap row.

scanGetCandidate() always returns an offset range covering tuples of one
heap row. It may return only some of the row's tuples; in that case
collectDatumForItem() will ask scanGetCandidate() for the next portion
of tuples of that heap row. This is safe because the tuples of one heap
row are stored contiguously in the pending list.

> * In keyGetItem(), it's not at all clear what the assumptions are for
> the contents of the entryRes[] array and for the interaction with a
> LossyPage entry. Is it intentional that you run through the array
> comparing to a LossyPage key before deciding to exit the loop? It
> seems like that might be affecting the behavior at the next call,
> but if so you're making some undocumented assumptions about the way
> comparison of a lossy-page pointer is going to behave. I thought it'd
> be cleaner to move the ItemPointerIsLossyPage check up to before that
> loop (right after the ItemPointerIsMax test) but the more I look at it
> the less sure I am if that would break things. That code needs more
> commenting too.

The entryRes[] array is used for two purposes:
- as an argument for consistentFn;
- to record which entries were minimal on the previous call/loop, since
  only those entries should be advanced by the next entryGetItem() call.

A lossy page is encoded as an ItemPointer with the greatest possible
offset (0xffff), so keyGetItem() may return items from a page and then
return the same page as a lossy pointer. That is not a problem, because
a TIDBitmap is used to collect all the results.

So the keyGetItem() algorithm can be described as:
1. advance all entries that are equal to the previous result
   (entryGetItem() returns items in increasing order);
2. find the minimal entries;
3. fill entryRes[] with the new values, then check for a lossy page or
   call consistentFn.

> * I'd also like to come to some agreement about getting rid of the
> fail-on-NULL-scankey problem in newScanKey(). As I noted in the
> comment there, we could make that work cleanly if we are willing to
> assume that all GIN-indexable operators are strict. We already assume
> the same for hash and btree operators, so it doesn't seem like a big
> problem to do this, but I wonder if there are any objections.

Agreed. I have changed the GIN code, but I don't know what other places
need to be changed to codify this agreement.

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                   WWW: http://www.sigaev.ru/
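[Editor's illustration] The binary search over the (attrnum, Datum)-ordered pending list discussed above can be modeled with a toy Python sketch. This is not the C implementation; `find_entry` and the tuple representation are hypothetical stand-ins for the real comparison of (attnum, key) pairs.

```python
import bisect

# Toy model of the pending-list lookup: pending entries are kept in
# the same order as in the entry tree, i.e. sorted by (attnum, key),
# so a matching entry can be located by binary search instead of a
# linear scan over the heap row's tuples.

def find_entry(pending, attnum, key):
    """Return the index of (attnum, key) in the sorted pending list,
    or None if the current heap row does not contain that entry."""
    i = bisect.bisect_left(pending, (attnum, key))
    if i < len(pending) and pending[i] == (attnum, key):
        return i
    return None

# Example pending list for one heap row: two indexed columns,
# each contributing some keys, stored in (attnum, key) order.
pending = [(1, "abc"), (1, "def"), (2, "abc"), (2, "xyz")]
```

A linear scan would also work; the binary search is what yields the 5-15% improvement mentioned above, since each consistentFn probe only needs one O(log n) lookup per entry.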
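[Editor's illustration] The three-step keyGetItem() loop described above can also be sketched in toy Python, under the stated encoding of a lossy page as an ItemPointer with the maximum offset (0xffff). TIDs are modeled as (block, offset) tuples, entry streams as sorted iterators; all names here are illustrative, not the real C symbols.

```python
# Toy sketch of the keyGetItem() merge loop. Each entry stream yields
# TIDs as (block, offset) tuples in increasing order; a whole lossy
# page is encoded as (block, 0xffff), which sorts after every normal
# offset on that block, matching the encoding described above.

LOSSY_OFFSET = 0xFFFF
MAX_TID = (float("inf"), LOSSY_OFFSET)  # sentinel: stream exhausted

def key_get_items(streams, consistent):
    """Merge sorted entry streams; report each candidate TID for which
    consistent(entry_res) is true, where entry_res[i] says whether
    stream i contains the current minimal TID."""
    streams = list(streams)
    cur = [next(s, MAX_TID) for s in streams]
    results = []
    while True:
        # step 2: find the minimal TID among all streams
        item = min(cur)
        if item == MAX_TID:
            break
        # step 3: fill entry_res, then check lossy / call consistent
        entry_res = [c == item for c in cur]
        if item[1] == LOSSY_OFFSET:
            # whole page goes into the TIDBitmap; it may repeat items
            # already returned, which the bitmap absorbs harmlessly
            results.append(("lossy-page", item[0]))
        elif consistent(entry_res):
            results.append(item)
        # step 1: advance only the streams that were equal to the
        # previous result (the ones entryGetItem must renew next loop)
        for i, was_min in enumerate(entry_res):
            if was_min:
                cur[i] = next(streams[i], MAX_TID)
    return results
```

With `consistent=all` this behaves like an AND of the entries; note how a lossy page from one stream is emitted even though the other stream has no match there, which is safe only because the TIDBitmap rechecks lossy pages.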