WIP patch for "operator families" - Mailing list pgsql-patches
From | Tom Lane |
---|---|
Subject | WIP patch for "operator families" |
Date | |
Msg-id | 5378.1166735231@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: WIP patch for "operator families"
|
List | pgsql-patches |
Attached is the current state of my work on generalizing operator classes into "operator families", as per discussion last week. It's far from ready to apply, but I thought I'd circulate it for comments. What's done is the restructuring of the pg_opclass/pg_amop/pg_amproc catalogs and the removal of pg_operator's oprlsortop/oprrsortop/oprltcmpop/oprgtcmpop columns, plus enough code updates that the system still compiles and passes the core regression tests. I haven't yet added dependency.c support for dropping pg_opfamily/pg_amop/pg_amproc entries, nor have I fixed CREATE/DROP OPERATOR CLASS (hence no point in trying to run the contrib regression tests yet...). Also, most of the planned improvements that this change will enable, such as better handling of cross-type cases in predtest.c, aren't done yet --- I plan to do those as separate, later commits. I have come across a couple of things that might be worthy of comment: * varchar_ops and cidr_ops still exist as pg_opclass entries; this is mainly so that any pg_dump scripts that explicitly reference them in index definitions will still work. They are effectively just aliases for text_ops and inet_ops, however, because they have the same opcfamily and opcintype as those entries. This is needful because the relcache code loads default rd_operator and rd_support entries based on lefttype = righttype = opcintype, and there aren't any entries showing varchar as those types, only text. I had to add one unexpected wart to get it to work, though: GetDefaultOpclass was failing to identify a default operator class for varchar columns, because it found no exact match and two binary-compatible matches (text_ops and bpchar_ops). I added a new heuristic to it to choose a binary-compatible match to a "preferred type" if there is only one such. I don't particularly like the "preferred type" heuristics but this doesn't seem like it could break any cases that work now. The only alternative I can see is to extend the pg_opclass definition to allow a more explicit representation of "alias" opclasses, but that seems like it'd be a wart also. * The RowCompare parsing code in parse_expr.c wants to know which opclasses are defaults so it can prefer those operator interpretations over non-default ones. This is a big pain with the new structure because operators don't belong to operator classes directly, only to families, and there's not a notion of default-ness at the operator family level. We could possibly add one but I'd rather not --- it's not immediately clear to me that all the opclasses in a family should always have the same default-or-not status. To get it working I hacked some code that looks to see if the opfamily contains opclasses that are defaults for one or both of the operator's input datatypes, but this seems ugly and not necessarily guaranteed to give unsurprising results. What I am thinking is a better solution is to say that if a row comparison operator appears in multiple opfamilies then we can use any one of them to determine its interpretation, regardless of defaultness. This is how we've done things in the past (eg in SelectSortFunction). When I wrote the RowCompare code I was worried about cases like finding the operator as both LessThan in a standard opclass and GreaterThan in a reverse-sort opclass --- but on reflection it doesn't matter whether RowCompare thinks it's LessThan or GreaterThan, it'll get the right answer either way. We could only get in trouble if the same operator were registered as, say, LessThan in one class and LessEqual in another, but it's pretty darn hard to think of non-broken opclasses for which that could hold. Comments? regards, tom lane
Attachment
pgsql-patches by date: