Re: Command Triggers patch v18 - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Command Triggers patch v18 |
Date | |
Msg-id | CA+TgmobzaBHnBnzCcbbtpKyw+v_7n2MPU8wsReYkHLw-Lw64mQ@mail.gmail.com Whole thread Raw |
In response to | Re: Command Triggers patch v18 (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: Command Triggers patch v18
Re: Command Triggers patch v18 |
List | pgsql-hackers |
On Tue, Mar 27, 2012 at 9:37 AM, Andres Freund <andres@2ndquadrant.com> wrote: > Not necessarily. One use-case is doing something *before* something happens > like enforcing naming conventions, data types et al during development/schema > migration. That is definitely a valid use case. The only reason why we don't have something like that built into the ObjectAccessHook framework is because we haven't yet figured out a clean way to make it work. Most of KaiGai's previous attempts involved passing a bunch of garbage selected apparently at random into the hook function, and I rejected that as unmaintainable. Dimitri's code here doesn't have that problem - it passes in a consistent set of information across the board. But I still think it's unmaintainable, because there's no consistency about when triggers get invoked, or whether they get invoked at all. We need something that combines the strengths of both approaches. I actually think that, to really meet all needs here, we may need the ability to get control in more than one place. For example, one thing that KaiGai has wanted to do (and I bet Dimitri would live to be able to do it too, and I'm almost sure that Dan Farina would like to do it) is override the built-in security policy for particular commands. I think that KaiGai only wants to be able to deny things that would normally be allowed, but I suspect some of those other folks would also like to be able to allow things that would normally be denied. For those use cases, you want to get control at permissions-checking time. However, where Dimitri has the hooks right now, BEFORE triggers for ALTER commands fire after you've already looked up the object that you're manipulating. That has the advantage of allowing you to use the OID of the object, rather than its name, to make policy decisions; but by that time it's too late to cancel a denial-of-access by the first-line permissions checks. Dimitri also mentioned wanting to be able to cancel the actual operation - and presumably, do something else instead, like go execute it on a different node, and I think that's another valid use case for this kind of trigger. It's going to take some work, though, to figure out what the right set of control points is, and it's probably going to require some refactoring of the existing code, which is a mess that I have been slowly trying to clean up. >> In the >> interest of getting event triggers, you're arguing that we should >> contort the definition of the work "command" to include sub-commands, >> but not necessarily commands that turn out to be a no-op, and maybe >> things that are sort of like what commands do even though nobody >> actually ever executed a command by that name. I just don't think >> that's a good idea. We either have triggers on commands, and they >> execute once per command, period; or we have triggers on events and >> they execute every time that event happens. > I don't think thats a very helpfull definition. What on earth would you want to > do with plain passing of > CREATE SCHEMA blub CREATE TABLE foo() CREATE TABLE bar(); > > So some decomposition seems to be necessary. Getting the level right sure > ain't totally easy. > It might be helpful to pass in the context from which something like that > happened. I agree that it's not a very helpful decision, but I'm not the one who said we wanted command triggers rather than event triggers. :-) > Which would basically require including pg_dump in the backend to implement > replication and logging. I don't think librarifying pg_dump is a fair burden > at all. I don't either, but that strikes me as a largely unrelated problem. As you may recall, I've complained that too much of that functionality is in the client in the past, and I haven't changed my opinion on that. But how is that any different with Dimitri's approach? You can get a callback AFTER CREATE TABLE, and you'll get the table name. Now what? If you get the trigger in C you can get the node tree, but that's hardly any better. You're still going to need to do some pretty tricky push-ups to get reliable replication. It's not at all evident to me that the parse-tree is any better a place to start than the system catalog representation; in fact, I would argue that it's probably much worse, because you'll have to exactly replicate whatever the backend did to decide what catalog entries to create, or you'll get drift between servers. > Also I have a *very hard* time to imagining to sensibly implement replicating > CREATE TABLE or ALTER TABLE ... ADD COLUMN with just object access hooks. > There is just not enough context. How would you discern between a ADD COLUMN > and a CREATE TABLE? They look very similar or even identical on a catalog > level. That can be fixed using the optional argument to InvokeObjectAccessHook, similar to what we've done for differentiating internal drops from other drops. > I also have the strong feeling that all this would expose implementation > details *at least* as much as command triggers. A slight change in order of > catalog modifcation would be *way* harder to hide via the object hook than > something of a similar scale via command triggers. I don't see that. Right now, when you do something like CREATE TABLE foo (a serial primary key), the exact order in which the objects are created is apparent from the order of the trigger firing. The order of BEFORE vs. AFTER triggers is also drawn out of a hat. So I don't see how it's any worse. It also avoids a lot of definitional ambiguity. If you say that we're going to have a trigger on the CREATE SEQUENCE command, then what happens when the user creates a sequence via some other method? The current patch says that we should handle that by calling the CREATE SEQUENCE trigger if it happens to be convenient because we're going through the same code path that a normal CREATE SEQUENCE would have gone through, but if it uses a different code path then let's not bother. Otherwise, how do you explain the fact that a DROP .. CASCADE doesn't fire triggers for the subordinate objects dropped, but CREATE TABLE does fire triggers for the subordinate objects created? If you trigger on events, then things are a lot more cut-and-dried. You fire the trigger when the event happens. If the event doesn't happen, you don't fire the trigger. If an implementation change causes objects to be created when they weren't before, or not created when they were before, then, yes, trigger firing behavior will change, but it will be clear that the new behavior is correct. Right now, future developers modifying this code have no reasonable basis for deciding whether or not a trigger ought to fire in a given situation. There's no consistent rule, no specification, and no documentation for how that's intended to happen. > Hm. I agree that there is some work needed to make the whole handling more > consistent. But I don't think its as bad as you paint it. > > I don't want to brush of your review here - you definitely do raise valid > points. I don't particularly like the current implementation very much. But I > simply fail to see a less annoying way. > > Its also not that I am just defending a colleague - you might have noticed the > @2ndquadrant - I got interested in this patch quite a bit before that was on > the horizon. Yep, understood, and I'd have the same complaints about this patch that I do presently if it were submitted by someone who worked for EnterpriseDB, or some independent third party. I do think it's a bit of a shame that more people didn't pay attention to and help improve the ObjectAccessHook machinery as we were working on that. I mean, Dimitri is not the first or last person to want to get control during DDL operations, and KaiGai's already done a lot of work figuring out how to make it work reasonably. Pre-create hooks don't exist in that machinery not because nobody wants them, but because it's hard. This whole problem is hard. It would be wrong to paint it as a problem that is unsolvable or not valuable, but it would be equally wrong to expect that it's easy or that anyone's first attempt (mine, yours, Dimitri's, KaiGai's, or Tom Lane's) is going to fall painlessly into place without anyone needing to sweat a little blood. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: