Re: BUG #17884: gist_page_items() crashes for a non-leaf page of an index with non-key columns - Mailing list pgsql-bugs
From | Alexander Lakhin |
---|---|
Subject | Re: BUG #17884: gist_page_items() crashes for a non-leaf page of an index with non-key columns |
Date | |
Msg-id | e4c9897a-6864-4c47-f7ea-08385f2a7db5@gmail.com Whole thread Raw |
In response to | Re: BUG #17884: gist_page_items() crashes for a non-leaf page of an index with non-key columns (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: BUG #17884: gist_page_items() crashes for a non-leaf page of an index with non-key columns
|
List | pgsql-bugs |
17.05.2023 08:23, Michael Paquier wrote: > On Mon, May 15, 2023 at 11:00:00AM +0300, Alexander Lakhin wrote: >> Yes, the existing implementation might crash with non-leaf pages and just >> doesn't show non-key columns for items on leaves. > Ah, OK. After more caffeine.. So that's two problems, not one. Perhaps I had the lack of caffeine too, because it seems that I missed third one. I'm sorry about that. >> Yeah, the test output could be shorter and more representative at the same >> time. Please look at the improved test. > Sounds fine as it is, thanks for the update. The patch is now much > shorter in size. Yes, but I'm afraid it is not a final version yet. >> It sounds good to me, but for now I can see only a single caller of >> pg_get_indexdef_columns(), so maybe it would be harmless to merely change the >> existing function's parameter on HEAD? (To not have two distinct functions >> with generic parameters for just two callers, which will use only two >> fixed parameter values.) > Sure, we could do that for HEAD, but we would still have to maintain > an extra routine for the back-branches. So I would tend to choose an > approach that's consistent across all the branches while being as much > as possible extensible. I am fine with this approach, Thank you! > + tupdesc = CreateTupleDescCopy(RelationGetDescr(indexRel)); > + tupdesc->natts = IndexRelationGetNumberOfKeyAttributes(indexRel); > > Ah, I see. So you need that to be able to deform correctly the tuple > coming from a non-leaf page that only has key attributes. Tricky at > first sight, indeed.. Yes, the same thing was done in f2e403803, so I think we can safely repeat it here. > I have done a few adjustments to that patch, refining some comments, > fixing some ordering with the headers and applying an indentation. Is > that OK for you? While thinking about naming (what exactly "attroid" means there), I've reread the comment above BuildIndexValueDescription(): * The passed-in values/nulls arrays are the "raw" input to the index AM, * e.g. results of FormIndexDatum --- this is not necessarily what is stored * in the index, but it's what the user perceives to be stored. and recognized that I've made a mistake when relied on the existing coding. A simple example: create extension pageinspect; create extension pg_trgm; create table test_trgm(t text); create index trgm_idx on test_trgm using gist (t gist_trgm_ops); insert into test_trgm select to_char(g, 'textFM000') from generate_series(1, 1000) g; select gist_page_items(get_raw_page('trgm_idx', 0), 'trgm_idx'::regclass); select gist_page_items(get_raw_page('trgm_idx', 1), 'trgm_idx'::regclass); (I chose gist_trgm_ops because that opclass defined as follows: CREATE OPERATOR CLASS gist_trgm_ops FOR TYPE text USING gist ... STORAGE gtrgm) shows garbage values: gist_page_items --------------------------------------------- (1,"(1,65535)",32,f,"(t)=(\x02\x7F\x7F_ߟ)") (2,"(3,65535)",32,f,"(t)=(\x02')") (3,"(2,65535)",32,f,"(t)=(\x02߽)") (4,"(5,65535)",32,f,"(t)=(\x02\x7F)") (5,"(4,65535)",32,f,"(t)=(\x02w?\x7F)") (6,"(6,65535)",32,f,"(t)=(\x02/)") (6 rows) gist_page_items ----------------------------------------------------------- (1,"(0,198)",40,f,"(t)=(\x01 t te19898 extt19texxt1)") (2,"(0,200)",40,f,"(t)=(\x01 t te00 200extt20texxt2)") (3,"(0,202)",40,f,"(t)=(\x01 t te02 202extt20texxt2)") (4,"(0,203)",40,f,"(t)=(\x01 t te03 203extt20texxt2)") (5,"(0,56)",40,f,"(t)=(\x01 t te05656 extt05texxt0)") ... It means that pageinspect tries to print stored gtrgm values as text, and I believe that's the third problem here. Moreover, for the point_ops class opcintype is point, but opckeytype is box, thus indexes test_gist_idx and test_gist_idx_inc used in the test in fact contain boxes, not points. So the fixed gist_page_items() produces different output even for existing test cases. (Please look at the patch attached.) I'm somewhat confused with the output: (p)=((185,185),(1,1)) (Maybe add more parentheses around val to get "(p)=(((185,185),(1,1)))", not sure about that.) But I think the current gist_page_items() output is not correct and should be fixed anyway. Best regards, Alexander
Attachment
pgsql-bugs by date: