Re: Detection of nested function calls - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Detection of nested function calls |
Date | |
Msg-id | 2355.1382710707@sss.pgh.pa.us |
In response to | Detection of nested function calls (Hugo Mercier <hugo.mercier@oslandia.com>) |
Responses |
Re: Detection of nested function calls
Re: Detection of nested function calls Re: Detection of nested function calls |
List | pgsql-hackers |
Hugo Mercier <hugo.mercier@oslandia.com> writes:
> PostGIS functions that manipulate geometries have to unserialize their
> input geometries from the 'flat' varlena representation to their own,
> and serialize the processed geometries back when returning.
> But in such nested call queries, this serialization-unserialization
> process is just an overhead.

This is a reasonable thing to worry about, not just for PostGIS types but for many container types such as arrays --- it'd be nice to be able to work with an in-memory representation that wasn't just a contiguous blob of data. For instance, assignment to an array element might become a constant-time operation even when working with variable-length datatypes.

> So we thought having a way for user functions to know if they are part
> of a nested call could allow them to avoid this serialization phase.

However, this seems like a completely wrong way to go at it. In the first place, it wouldn't help for situations like a complex value stored in a plpgsql variable. In the second, I don't think that what you are describing scales to any more than the most trivial situations. What about functions with more than one complex-type input, for example? And you'd need to be certain that every single function taking or returning the datatype gets updated at exactly the same time, else it'll break.

I think the right way to attack it is to create some way for a Datum value to indicate, at runtime, whether it's a flat value or an in-memory representation. Any given function returning the type could choose to return either representation. The datatype would have to provide a way to serialize the in-memory representation, when and if it came time to store it in a table.
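
[Editorial sketch, not part of the original message.] The idea of a value that says at runtime whether it is flat or in-memory can be illustrated with a small standalone C program. Everything here is an assumption made for illustration: the names (RepKind, Value, ExpandedList, FlatList, list_new, list_set_item, list_flatten, value_to_flat) are hypothetical and are not PostgreSQL APIs, and a list of C strings stands in for a variable-length container type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; none of this is PostgreSQL API. */
typedef enum { REP_FLAT, REP_EXPANDED } RepKind;

/* In-memory form: each variable-length item is allocated separately. */
typedef struct {
    size_t nitems;
    char **items;
} ExpandedList;

/* Flat form: one contiguous blob, items stored back to back, NUL-terminated. */
typedef struct {
    size_t nbytes;
    size_t nitems;
    char data[];
} FlatList;

/* The value carries a runtime tag saying which representation it holds. */
typedef struct {
    RepKind kind;
    union { FlatList *flat; ExpandedList *expanded; } u;
} Value;

ExpandedList *list_new(const char *const *src, size_t n)
{
    ExpandedList *list = malloc(sizeof *list);
    if (!list) return NULL;
    list->nitems = n;
    list->items = calloc(n ? n : 1, sizeof *list->items);
    if (!list->items) { free(list); return NULL; }
    for (size_t i = 0; i < n; i++) {
        list->items[i] = malloc(strlen(src[i]) + 1);
        if (!list->items[i]) return NULL;   /* sketch only: leaks on failure */
        strcpy(list->items[i], src[i]);
    }
    return list;
}

/* Replace one item. Cost depends only on the new item's length, not on the
 * number or size of the other items, because nothing else has to move. */
int list_set_item(ExpandedList *list, size_t idx, const char *s)
{
    if (idx >= list->nitems) return -1;
    char *copy = malloc(strlen(s) + 1);
    if (!copy) return -1;
    strcpy(copy, s);
    free(list->items[idx]);
    list->items[idx] = copy;
    return 0;
}

/* Datatype-provided serializer: in-memory form to a contiguous blob. */
FlatList *list_flatten(const ExpandedList *list)
{
    size_t nbytes = 0;
    for (size_t i = 0; i < list->nitems; i++)
        nbytes += strlen(list->items[i]) + 1;
    FlatList *flat = malloc(sizeof(FlatList) + nbytes);
    if (!flat) return NULL;
    flat->nbytes = nbytes;
    flat->nitems = list->nitems;
    char *p = flat->data;
    for (size_t i = 0; i < list->nitems; i++) {
        size_t len = strlen(list->items[i]) + 1;
        memcpy(p, list->items[i], len);
        p += len;
    }
    return flat;
}

/* Produce a flat copy only when one is needed (for example at storage time).
 * The caller owns and frees the returned blob. */
FlatList *value_to_flat(const Value *v)
{
    if (v->kind == REP_EXPANDED)
        return list_flatten(v->u.expanded);
    size_t total = sizeof(FlatList) + v->u.flat->nbytes;
    FlatList *copy = malloc(total);
    if (copy) memcpy(copy, v->u.flat, total);
    return copy;
}
```

In this sketch a chain of functions could pass the REP_EXPANDED form along and pay for list_flatten only once, at the point where a flat value is actually required.
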
To avoid breaking functions that hadn't yet been taught about the new representation, we'd probably want to redefine the existing DETOAST macros as also invoking this datatype flattening function, and then you'd need to use some new access macro if you wanted visibility of the non-flat representation. (This assumes that the whole thing is only applicable to toastable datatypes, but that seems like a reasonable restriction.)

Another thing that would have to be attacked in order to make the plpgsql-variable case work is that you'd need some design for copying such Datums in-memory, and perhaps a reference count mechanism to optimize away unnecessary copies. Your idea of tying the optimization to the nested function call scenario would avoid the need to solve this problem, but I think it's too narrow a scope to justify all the other work that'd be involved.

Some colleagues of mine at Salesforce have been playing with ideas like this, though last I heard they were nowhere near having a submittable patch.

			regards, tom lane
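
[Editorial sketch, not part of the original message.] The two mechanisms mentioned above, a backward-compatible accessor that always flattens and a reference count that avoids unnecessary copies, might look roughly like the following standalone C. This is only one possible shape under stated assumptions: the names (Expanded, Val, legacy_get_flat, get_expanded, val_copy, expanded_make_writable) are hypothetical, an int array stands in for the datatype, and the real DETOAST macros are not used or modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; none of this is PostgreSQL API. */

/* In-memory form, shared between handles through a reference count. */
typedef struct {
    int refcount;
    size_t n;
    int *items;
} Expanded;

/* Tagged handle. Flat form layout: flat[0] = item count, then the items. */
typedef struct {
    int is_expanded;
    union { int *flat; Expanded *expanded; } u;
} Val;

Expanded *expanded_new(const int *src, size_t n)
{
    Expanded *e = malloc(sizeof *e);
    if (!e) return NULL;
    e->items = malloc((n ? n : 1) * sizeof(int));
    if (!e->items) { free(e); return NULL; }
    memcpy(e->items, src, n * sizeof(int));
    e->refcount = 1;
    e->n = n;
    return e;
}

/* Old-style accessor: always returns a freshly allocated flat buffer,
 * flattening on demand, so callers unaware of the in-memory form still work. */
int *legacy_get_flat(const Val *v)
{
    size_t n = v->is_expanded ? v->u.expanded->n : (size_t) v->u.flat[0];
    const int *items = v->is_expanded ? v->u.expanded->items : v->u.flat + 1;
    int *flat = malloc((n + 1) * sizeof(int));
    if (!flat) return NULL;
    flat[0] = (int) n;
    memcpy(flat + 1, items, n * sizeof(int));
    return flat;
}

/* New-style accessor: exposes the in-memory form if present, else NULL. */
Expanded *get_expanded(const Val *v)
{
    return v->is_expanded ? v->u.expanded : NULL;
}

/* Copy a handle: share the in-memory object instead of deep-copying it. */
Val val_copy(const Val *v)
{
    Val out = *v;
    if (v->is_expanded) {
        v->u.expanded->refcount++;
    } else {
        size_t bytes = ((size_t) v->u.flat[0] + 1) * sizeof(int);
        out.u.flat = malloc(bytes);
        if (out.u.flat) memcpy(out.u.flat, v->u.flat, bytes);
    }
    return out;
}

/* Copy on write: duplicate only if another handle still shares the object. */
Expanded *expanded_make_writable(Expanded *e)
{
    if (e->refcount == 1) return e;
    Expanded *copy = expanded_new(e->items, e->n);
    if (copy) e->refcount--;
    return copy;
}
```

With this arrangement, assigning one variable to another costs a counter increment, and a deep copy happens only when a shared value is about to be modified.
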