Allowing extensions to supply operator-/function-specific info - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Allowing extensions to supply operator-/function-specific info |
Date | |
Msg-id | 15193.1548028093@sss.pgh.pa.us |
List | pgsql-hackers |
Over in the thread at [1], we realized that PostGIS has been thrashing around trying to fake its way to having "special index operators", ie a way to automatically convert WHERE clauses into lossy index quals. That's existed in a non-extensible way inside indxpath.c for twenty years come July. Since the beginning I've thought we should provide a way for extensions to do similar things, but it never got to the top of the to-do queue. Now I think it's time.

One low-effort answer is to add a hook call in indxpath.c that lets extensions manipulate the sets of index clauses extracted from a relation's qual clauses, but I don't especially like that: it dumps all the work onto extensions, resulting in lots of code duplication, plus they have a badly-documented and probably moving target for what they have to do.

Another bit of technical debt that's even older is the lack of a way to attach selectivity estimation logic to boolean-returning functions. So that motivates me to think that whatever we do here should be easily extensible to allow different sorts of function- or operator-related knowledge to be supplied by extensions. We already have oprrest, oprjoin, and protransform hooks that allow certain kinds of knowledge to be attached to operators and functions, but we need something a bit more general.

What I'm envisioning therefore is that we allow an auxiliary function to be attached to any operator or function that can provide functionality like this, and that we set things up so that the set of tasks that such functions can perform can be extended over time without SQL-level changes. For example, we could say that the function takes a single Node* argument, and that the type of Node tells it what to do, and if it doesn't recognize the type of Node it should just return NULL indicating "use default handling".
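
To make the dispatch-on-Node-type idea concrete, here is a minimal standalone sketch in C. It deliberately does not use PostgreSQL's headers: DemoNode, the T_Demo* tags, the two request structs, and demo_support are all hypothetical stand-ins invented for illustration, not names from the tree or from any patch. The only behavior taken from the description above is that the function looks at the node type and returns NULL for anything it does not recognize.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node tag; none of these names are PostgreSQL's. */
typedef enum DemoNodeTag
{
    T_DemoSelectivityRequest,   /* "estimate selectivity of this call" */
    T_DemoIndexQualRequest,     /* "derive a lossy index qual" */
    T_DemoFutureRequest         /* a request type added after this function was written */
} DemoNodeTag;

/* Every request struct embeds this header as its first member. */
typedef struct DemoNode
{
    DemoNodeTag type;
} DemoNode;

typedef struct DemoSelectivityRequest
{
    DemoNode    node;           /* must be first */
    double      selectivity;    /* output: filled in by the support function */
} DemoSelectivityRequest;

typedef struct DemoIndexQualRequest
{
    DemoNode    node;           /* must be first */
    int         lossy;          /* output: nonzero if the derived qual is lossy */
} DemoIndexQualRequest;

/*
 * Hypothetical support function: one entry point, many tasks.
 * Returns the request on success, or NULL meaning "use default handling".
 */
DemoNode *
demo_support(DemoNode *request)
{
    assert(request != NULL);

    switch (request->type)
    {
        case T_DemoSelectivityRequest:
            {
                DemoSelectivityRequest *req = (DemoSelectivityRequest *) request;

                req->selectivity = 0.25;    /* placeholder estimate */
                return request;
            }
        case T_DemoIndexQualRequest:
            {
                DemoIndexQualRequest *req = (DemoIndexQualRequest *) request;

                req->lossy = 1;             /* placeholder: always report a lossy qual */
                return request;
            }
        default:
            /* Unknown request type: decline, so the caller falls back to defaults. */
            return NULL;
    }
}
```

The point of the default branch is that a new request type can be introduced later without touching existing support functions or any SQL-level definition; old functions simply decline requests they do not know about.
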
We'd start out with two relevant Node types, one for the selectivity-estimation case and one for the extract-a-lossy-index-qual case, and we could add more over time.

What we can do to attach such a support function to a target function is to repurpose the pg_proc.protransform column to represent the support function. The existing protransform functions already have nearly the sort of API I'm thinking about, but they only accept FuncExpr* not any other node type. It'd be easy to change them though, because there's only about a dozen and they are all in core; we never invented any way for extensions to access that functionality. (So actually, the initial API spec here would include three possibilities, the third one being equivalent to the current protransform behavior.)

As for attaching support functions to operators, we could consider widening the pg_operator catalog to add a new column. But I think it might be a cleaner answer to just say "use the support function attached to the operator's implementation function, if there is one". This would require that the support functions be able to cope with either FuncExpr or OpExpr inputs, but that does not seem like much of a burden as long as it's part of the API spec from day one.

Since there isn't any SQL API for attaching support functions, we'd have to add one, but adding another clause to CREATE FUNCTION isn't all that hard. (Annoyingly, we haven't created any cheaply extensible syntax for CREATE FUNCTION, so this'd likely require adding another keyword. I'm not interested in doing more than that right now, though.)

I'd be inclined to rename pg_proc.protransform to "prosupport" to reflect its wider responsibility, and make the new CREATE FUNCTION clause be "SUPPORT FUNCTION foo" or some such. I'm not wedded to that terminology, if anyone has a better idea.

One thing that's not entirely clear to me is what permissions would be required to use that clause.
The support functions will have signature "f(internal) returns internal", so creating them at all will require superuser privilege, but it seems like we probably also need to restrict the ability to attach one to a target function --- attaching one to the wrong function could probably have bad consequences. The easy way out is to say "you must be superuser"; maybe that's enough for now, since all the plausible use-cases for this are in extensions containing C functions anyway. (A support function would have to be coded in C, although it seems possible that its target function could be something else.)

Thoughts? If we have agreement on this basic design, making it happen seems like a pretty straightforward task.

regards, tom lane

PS: there is, however, a stumbling block that I'll address in a separate message, as it seems independent of this infrastructure.

[1] https://www.postgresql.org/message-id/flat/CACowWR0TXXL0NfPMW2afCKzX++nHHBZLW3-BLusu_B0WjBB1=A@mail.gmail.com