Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm() - Mailing list pgsql-bugs
From | Alexander Lakhin |
---|---|
Subject | Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm() |
Date | |
Msg-id | b8f458d6-f0d5-3b0b-1f53-a51aec8aa15b@gmail.com Whole thread Raw |
In response to | Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm() (Andrew Dunstan <andrew@dunslane.net>) |
Responses |
Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()
|
List | pgsql-bugs |
Hello, 28.06.2023 21:49, Tom Lane wrote: > The bad news is that while investigating this I came across > another hazard that seems much messier to fix. To wit, if > you have a tuple with "missing" pass-by-ref columns, then > some of the columns output by heap_deform_tuple() will > contain pointers into the tupdesc (cf. getmissingattr()). > That's not sufficient lifespan in the scenario we're dealing > with here: if the tupdesc gets trashed anytime before the > eventual ExecEvalFieldStoreForm(), we'll have garbage in the > result tuple. Please excuse me, as I'm definitely late to the party, but may I ask some questions to understand the issue discussed? I still can't see a practical way to get garbage data in a tuple with missing columns, so maybe I'm missing something (except for columns). If we talk about ExecEvalFieldStoreDeForm() -> heap_deform_tuple() -> getmissingattr(), then I can't find a way to add a non-null default for a record type to exploit this path. The rowtypes test contains: -- at the moment this will not work due to ALTER TABLE inadequacy: alter table fullname add column suffix text default ''; ERROR: cannot alter table "fullname" because column "people.fn" uses its row type So it seems to me, that ExecEvalFieldStoreDeForm() can't be called for a record/tuple having missing non-null attributes. And even if ALTER TABLE was "adequate", does it mean that ExecEvalFieldStoreDeForm() would return default values from the underlying table type? > Worse, I fear there may be many other places with latent bugs of the > same ilk. Before the attmissingval code was added, one could assume > that the result of heap_deform_tuple would hold good as long as you > had a pin on the source tuple. But now it depends on metadata as > well, and I don't think we have a coherent story about that. But is it true, that having a pin on tupdesc is enough to make sure that the result of heap_deform_tuple() holds good? If so, then probably we should find places, where tupdesc gets unpinned before that result is saved somewhere. I see the following callers of the heap_deform_tuple(): src/backend/replication/logical/reorderbuffer.c ReorderBufferToastReplace(): heap_deform_tuple() followed by heap_form_tuple() src/backend/executor/execTuples.c ExecForceStoreHeapTuple(), ExecForceStoreMinimalTuple(), ExecStoreHeapTupleDatum(): use long-living and hopefully pinned slot->tts_tupleDescriptor src/backend/executor/execExprInterp.c ExecEvalFieldStoreDeForm(): the above question applies here src/backend/executor/spi.c SPI_modifytuple(): heap_deform_tuple() followed by heap_form_tuple() src/backend/utils/adt/rowtypes.c record_out(), hash_record(), hash_record_extended(): tupdesc released after extracting/processing data record_send(): tupdesc released after sending data record_cmp(), record_eq(), record_image_cmp(), record_image_eq(): tupdesc1. tupdesc2 released after comparing data src/backend/utils/adt/jsonfuncs.c populate_record(): heap_deform_tuple() followed by heap_form_tuple() src/backend/utils/adt/expandedrecord.c deconstruct_expanded_record() uses long-living (pinned) erh->er_tupdesc src/backend/utils/adt/ri_triggers.c RI_Initial_Check(), RI_PartitionRemove_Check(): use long-living? SPI_tuptable->tupdesc src/backend/access/heap/heapam_handler.c reform_and_rewrite_tuple(): uses RelationGetDescr(OldHeap) (pinned, I suppose) src/backend/access/heap/heaptoast.c heap_toast_delete(), heap_toast_insert_or_update() use long-living rel->rd_att; toast_flatten_tuple(): heap_deform_tuple(..., tupleDesc, ...) followed by heap_form_tuple(..., tupleDesc, ...) toast_flatten_tuple_to_datum(): heap_deform_tuple followed by detoasting attrs src/backend/access/heap/heapam.c ExtractReplicaIdentity(): heap_deform_tuple(..., desc, ...) followed by heap_form_tuple(..., desc, ...) src/backend/access/common/heaptuple.c heap_modify_tuple(): heap_deform_tuple(..., tupleDesc, ...) followed by heap_form_tuple(..., tupleDesc, ...) heap_modify_tuple_by_cols(): heap_deform_tuple(..., tupleDesc, ...) followed by heap_form_tuple(..., tupleDesc, ...) src/backend/access/common/tupconvert.c execute_attr_map_tuple(): heap_deform_tuple() followed by heap_form_tuple() src/test/regress/regress.c make_tuple_indirect(): heap_deform_tuple() followed by heap_form_tuple() src/pl/plpgsql/src/pl_exec.c exec_move_row() called with tupdesc = NULL, SPI_tuptable->tupdesc, or tupdesc released after the call contrib/hstore/hstore_io.c hstore_from_record(), hstore_populate_record(): tupdesc released after extracting data And even if I missed some possibly problematic calls, maybe it's worth to consider fixing exactly that places... Also, besides heap_deform_tuple(), getmissingattr() is called from heap_getattr(). And heap_getattr() is used in many places, but most of them are protected too due to tupdesc pinned. Two suspicious places that I found are GetAttributeByName() and GetAttributeByNum(), so when using these functions you can really get the missing attribute value with the tupdesc released. But what circumstances do we need to end up with an invalid data? 1) Write a function that will call GetAttributeByName() and hold it's result inside for some time. 2) Add a passed-by-ref column with a default value to a table. 3) Pass to that function a tuple without the "missing" column (just "SELECT func(table) FROM table" won't do, but func(OLD) in a trigger should). 4) Trigger a relcache invalidation. 5) Perform a database access inside that function to process the invalidation. 6) Access the result of GetAttributeByName() after that. Maybe we could construct such test case, e.g. add to pg_regress one more SQL-accessible C function, bu9ccvhhihvcnb·t I wonder, is this the way to go? On the other hand, probably there are extensions, that use GetAttributeByName() in the non-safe way (as the function documentation doesn't warn about such issues), but maybe just perform datumCopy inside that function (and similar one(s))? Though, perhaps I just don't notice the elephant in the core... Best regards, Alexander
pgsql-bugs by date: