Re: Command Triggers patch v18 - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Command Triggers patch v18 |
Date | |
Msg-id | CA+TgmoZVW1Ui+-NdOwLMfPbXLd6j0CAN8VRigVxMSpnrQkSc-A@mail.gmail.com Whole thread Raw |
In response to | Re: Command Triggers patch v18 (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Command Triggers patch v18
|
List | pgsql-hackers |
On Mon, Mar 26, 2012 at 9:11 PM, Robert Haas <robertmhaas@gmail.com> wrote: > [ various trivial issues ] OK, now I got that out of my system. Now on to bigger topics. I am extremely concerned about the way in which this patch arranges to invoke command triggers. You've got call sites spattered all over the place, and I think that's going to be, basically, an unfixable mess and a breeding ground for bugs. On a first read-through: 1. BEFORE ALTER TABLE triggers are fired in AlterTable(). However, an ALTER TABLE statement does not necessarily call AlterTable() once and only once. The real top-level logic for AlterTable is in ProcessUtility(), which runs transformAlterTableStmt() to generate a list of commands and then either calls AlterTable() or recursively invokes ProcessUtility() for each one. This means that if IF EXISTS is used and the table does not exist, then BEFORE command triggers won't get invoked at all. On the other hand, if multiple commands are specified, then I think AlterTable() may get invoked either once or more than once, depending on exactly which commands are specified; and we might manage to fire some CREATE INDEX command triggers or whatnot as well, again depending on exactly what that ALTER TABLE command is doing. 2. BEFORE CREATE TABLE triggers seem to have similar issues; see transformCreateStmt. rhaas=# create table foo (a serial primary key); NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for serial column "foo.a" NOTICE: snitch: BEFORE CREATE SEQUENCE public.foo_a_seq [<NULL>] NOTICE: snitch: AFTER CREATE SEQUENCE public.foo_a_seq [16392] NOTICE: snitch: BEFORE CREATE TABLE public.foo [<NULL>] NOTICE: snitch: BEFORE CREATE INDEX public.foo_pkey [<NULL>] NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo" NOTICE: snitch: AFTER CREATE INDEX public.foo_pkey [16398] NOTICE: snitch: BEFORE ALTER SEQUENCE public.foo_a_seq [16392] NOTICE: snitch: AFTER ALTER SEQUENCE public.foo_a_seq [16392] NOTICE: snitch: AFTER CREATE TABLE public.foo [16394] CREATE TABLE 3. RemoveRelations() and RemoveObjects() similarly take the position that when the object does not exist, command triggers need not fire. This seems entirely arbitrary. CREATE EXTENSION IF NOT EXISTS, however, takes the opposite (and probably correct) position that even if we decide not to create the extension, we should still fire command triggers. In a similar vein, AlterFunctionOwner_oid() skips firing the command triggers if the old and new owners happen to be the same, but other forms of ALTER FUNCTION (e.g. ALTER FUNCTION .. COST) fire triggers even if the old and new parameters are the same; and AlterForeignDataWrapperOwner_internal does NOT skip firing command triggers just because the old and new owners are the same. 4. RemoveRelations() and RemoveObjects() also take the position that a statement like "DROP TABLE foo, bar" should fire each relevant BEFORE command trigger twice, then drop both objects, then fire each relevant AFTER command trigger twice. I think that's wrong. It's 100% clear that the user executed one, and only one, command. The only real argument for it is that if we were to only fire command triggers once, we wouldn't be able to pass identifying information about the objects being dropped, since the argument-passing mechanism only has room to pass details about a single object. I think that means that the argument-passing mechanism needs improvement, not that we should redefine one command as two commands. Something like CREATE SCHEMA foo CREATE VIEW bar ... has the same problem: the user only entered one command, but we're firing command triggers as if they entered multiple commands. This case is possibly more defensible than the other one, but note that the two aren't consistent with each other as regards the order of trigger firing, and I actually think that they're both wrong, and that only the toplevel command should fire triggers. 5. It seems desirable for BEFORE command triggers to fire at a consistent point during command execution, but currently they don't. For example, BEFORE DROP VIEW triggers don't fire until we've verified that q exists, is a view, and that we have permission to drop it, but LOAD triggers fire much earlier, before we've really checked anything at all. And ALTER TABLE is somewhere in the middle: we fire the BEFORE trigger after checking permissions on the main table, but before all permissions checks are done, viz: rhaas=> alter table p add foreign key (a) references p2 (a); NOTICE: snitch: BEFORE ALTER TABLE public.p [16418] ERROR: permission denied for relation p2 6. Another pretty hideous aspect of the CREATE TABLE behavior is that AFTER triggers are fired from a completely different place than BEFORE triggers. It's not this patch's fault that our CREATE TABLE and ALTER TABLE code is a Rube Goldberg machine, but finding some place to jam each trigger invocation in that happens to (mostly) work as the code exists today is not sufficiently future-proof. I am not too sure the above list is exhaustive. I think that what's leading you down the garden path here is the desire to pass as much information as possible to each command trigger. But for that, we could simply put the hooks in ProcessUtility(), adding perhaps a bit of logic to deal with recursive invocations of ProcessUtility() to handle subcommands, and call it good. In fact, I think that for AFTER triggers, you could make that work anyway - every time you enter ProcessUtility(), you push something onto a global stack, and then individual commands annotate that data structure with details such as the OIDs of objects being operated on, and after processing the command we fire the after triggers from ProcessUtility(). That way, no matter how anybody changes the code in the future, we're guaranteed that AFTER triggers will always fire exactly once per command, and the worst thing that happens if someone fails to make the right changes in some other part of the code is that the AFTER trigger doesn't get all the command details, and even that seems less likely since a lot of the early exit paths won't have those details to supply anyway. BEFORE triggers are a lot harder, because there you want to gather a certain amount of information and then fire the trigger. There's more than a little bit of tension there. The longer you delay firing the command trigger, the more information you can reliably supply - but on the other hand, the more chance that things will error out before we even reach the command trigger firing site. I think this is going to cause problems in the future as people want to supply more details to the command trigger - people who want more details to, say, selectively deny commands will want to fire the command trigger later in the execution sequence, but people who want to use it for auditing will want to fire it earlier in the sequence, to log all attempts. I don't know how to resolve this tension, but it seems to me that we'd better try to pick a rule and follow it consistently; and we'd better make sure that the rule is clear, documented, and well-defined. More nitpicking: - Creating a command trigger on a nonexistent function fails with "cache lookup failed for function 0". - tablespace.c has useless leftover changes. - One of the examples in the docs says "RETURNS command trigger" instead of "RETURNS command_trigger". - The "DROP COMMAND TRIGGER" documentation contains leftovers that refer to a parameter called "command" which no longer exists. - The example in the "DROP COMMAND TRIGGER" documentation fails to execute, as the syntax is no longer recognized. - check_cmdtrigger_name is mostly duplicative of get_cmdtrigger_oid. I think it can be rewritten as if (OidIsValid(get_cmdtrigger_oid(...)) ereport(...). - I believe that all the changes to dropdb() can be reverted. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: