TupleTableSlot API problem - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | TupleTableSlot API problem |
Date | |
Msg-id | 26485.1238361697@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: TupleTableSlot API problem
|
List | pgsql-hackers |
In CVS HEAD, try this in an assert-enabled build: CREATE TEMPORARY TABLE tree( id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCEStree(id) ); INSERT INTO tree VALUES (1, NULL), (2, 1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3), (9,4), (10,4), (11,7), (12,7), (13,7), (14, 9),(15,11), (16,11); WITH RECURSIVE t(id, path) AS ( VALUES(1,ARRAY[]::integer[]) UNION ALL SELECT tree.id, t.path || tree.id FROM tree JOIN t ON (tree.parent_id = t.id) ) SELECT t1.id, t2.path, t2 FROM t AS t1 JOIN t AS t2 ON (t1.id=t2.id); I get ERROR: cache lookup failed for type 2139062143 (the error might be less predictable in a non-assert build). What is happening is that ExecProject fetches the Datum value of t2.path from a TupleTableSlot that contains a "minimal tuple" fetched from the tuplestore associated with the CTE "t". Then, it fetches the value of the whole-row variable t2. ExecEvalWholeRowVar calls ExecFetchSlotTuple, which finds that the slot doesn't contain a regular tuple, so it calls ExecMaterializeSlot, which replaces the minimal tuple with a regular tuple and frees the former. Now the already-fetched Datum for t2.path is pointing at freed storage. In principle there ought to be a whole lot of bugs around this area, because ExecFetchSlotTuple, ExecFetchSlotMinimalTuple, and ExecFetchSlotTupleDatum all are potentially destructive of the original slot contents; furthermore there ought be be visible bugs in 8.3 and maybe before. However, in an hour or so of poking at it, I've been unable to produce a failure without using CTE syntax; it's just really hard to get the planner to generate a whole-row-var reference in a context where the referenced slot might contain a minimal tuple. Still, I suspect that merely shows I'm being insufficiently creative today. I think we ought to assume there's a related bug in existing branches unless someone can prove there's not. One solution to this problem is to redefine these functions as always returning locally palloc'd tuples, so that they can be guaranteed to not modify the contents of the Slot. That's fairly unfortunate though since in the vast majority of cases it will result in a palloc/pfree cycle that is not necessary. It would also mean changing most of the callers to pfree the results, unless we can tolerate memory leaks there. Plan B is to agree that these functions should be documented as potentially destructive of the slot contents, in which case this is just a bug in ExecEvalWholeRowVar: it should be using a call that is nondestructive (eg ExecCopySlotTuple). (So far as I can determine from looking at the callers, only ExecEvalWholeRowVar and its sibling ExecEvalWholeRowSlow actually pose a risk; no other call sites look like there's any risk of having other existing references into the slot contents.) Plan C is to change the Slot logic so that a slot can store both regular and minimal tuples, not just one or the other. It would only actually do that if one format is stored into it and then the other is requested. The eventual ExecClearTuple would then free both copies. Whichever format is not the one originally stored is secondary and isn't used for fetching individual datums from the Slot. This nets out at the same performance we have now, it just postpones the pfree() that currently happens immediately when we change the slot's format. It might result in higher peak memory use but I'm not exceedingly worried about that. Plan C is probably the most complicated to code, but I'm inclined to try to fix it that way, because plan A loses performance and plan B exposes us to a continuing risk of future bugs of the same type. Comments, better ideas? regards, tom lane
pgsql-hackers by date: