Thread: WIP patch for updatable security barrier views
Hi all I have updatable security barrier views working for INSERT and DELETE, so this might be a good chance to see whether the described approach is acceptable in reality, not just in theory. I've been surprised by how well it worked out. I actually landed up removing a lot of the existing updatable views code; once update knows how to operate on a subquery it becomes unnecessary to duplicate the optimiser's knowledge of how to expand and flatten a view in the rewriter. INSERT and DELETE work. I haven't tested INSERT with defaults on the base rel yet but suspect it'll need the same support as for update. UPDATE isn't yet supported because of the need to inject references to cols in the base rel that aren't selected in the view. expand_targetlist(...) in prep/preptlist.c already does most of this work so I hope to be able to use or adapt that. This patch isn't subject to the replanning and invalidation issues discussed for RLS because updatable s.b. views don't depend on the current user or some special "bypass RLS" right like RLS would. The regression tests die because they try to update an updatable view. This isn't proposed for inclusion as it stands, it's a chance to comment and say "why the heck would you do that". -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 21 November 2013 13:15, Craig Ringer <craig@2ndquadrant.com> wrote: > Hi all > > I have updatable security barrier views working for INSERT and DELETE, > so this might be a good chance to see whether the described approach is > acceptable in reality, not just in theory. > > I've been surprised by how well it worked out. I actually landed up > removing a lot of the existing updatable views code; I fear you have removed a little too much though. For example, something somewhere has to deal with re-mapping of attributes. That's pretty fundamental, otherwise it can't hope to work properly. CREATE TABLE t1(a int, b text); CREATE VIEW v1 AS SELECT b, a FROM t1; INSERT INTO v1(a, b) VALUES(1, 'B'); ERROR: table row type and query-specified row type do not match DETAIL: Table has type integer at ordinal position 1, but query expects text. Also, you have deleted the code that supports WITH CHECK OPTION. once update knows > how to operate on a subquery it becomes unnecessary to duplicate the > optimiser's knowledge of how to expand and flatten a view in the rewriter. > > INSERT and DELETE work. I haven't tested INSERT with defaults on the > base rel yet but suspect it'll need the same support as for update. > > UPDATE isn't yet supported because of the need to inject references to > cols in the base rel that aren't selected in the view. > expand_targetlist(...) in prep/preptlist.c already does most of this > work so I hope to be able to use or adapt that. > > This patch isn't subject to the replanning and invalidation issues > discussed for RLS because updatable s.b. views don't depend on the > current user or some special "bypass RLS" right like RLS would. > > The regression tests die because they try to update an updatable view. > > This isn't proposed for inclusion as it stands, it's a chance to comment > and say "why the heck would you do that". > It sounds like it could be an interesting approach in principle, but you haven't yet shown how you intend to replace some of the rewriter code that you've removed. It feels to me that some of those things pretty much have to happen in the rewriter, because the planner just doesn't have the information it needs to be able to do it. Regards, Dean
On 11/21/2013 10:55 PM, Dean Rasheed wrote: > On 21 November 2013 13:15, Craig Ringer <craig@2ndquadrant.com> wrote: >> Hi all >> >> I have updatable security barrier views working for INSERT and DELETE, >> so this might be a good chance to see whether the described approach is >> acceptable in reality, not just in theory. >> >> I've been surprised by how well it worked out. I actually landed up >> removing a lot of the existing updatable views code; > > I fear you have removed a little too much though. Absolutely. This is really a proof of concept to show that we can do DML directly on a subquery by adding a new RTE for the resultRelation to the top level of the query. WITH CHECK OPTION was a casualty of cutting to prove the concept; I know it needs to be fitted into the subquery based approach. I really haven't thought about how WITH CHECK OPTION will fit into this, which may be a mistake - I'm hoping to deal with it after I have the basics working. There's lots of work to do, some of which will be adapting the code in your updatable views code to work with this approach, moving them around where appropriate. There's also the need to inject columns for UPDATE so the whole tuple is produced, and deal with DEFAULTs for INSERT. > For example, something somewhere has to deal with re-mapping of > attributes. That's pretty fundamental, otherwise it can't hope to work > properly. > > CREATE TABLE t1(a int, b text); > CREATE VIEW v1 AS SELECT b, a FROM t1; > INSERT INTO v1(a, b) VALUES(1, 'B'); > > ERROR: table row type and query-specified row type do not match > DETAIL: Table has type integer at ordinal position 1, but query expects text. Thanks. At this point I only expect it to work solidly for DELETE, and was frankly surprised that INSERT worked at all. The example of the problem is clear and useful, so thanks. I'm still learning about the handling of attributes and target lists - that's the next step in work on this. > It sounds like it could be an interesting approach in principle, but > you haven't yet shown how you intend to replace some of the rewriter > code that you've removed. It feels to me that some of those things > pretty much have to happen in the rewriter, because the planner just > doesn't have the information it needs to be able to do it. I'm still working a lot of that out. At this point I just wanted to demonstrate that we can in fact do DML directly on a subquery without view qual pull-up and view subquery flattening. One of my main worries is that adding and re-ordering columns needs to happen from the bottom up - for nested views, it needs to start at the real base rel and then proceed back up the chain of nested subqueries. View expansion happens recursively in the rewriter, so this isn't too easy. I'm thinking of doing it when we hit a real baserel during view expansion, walking back up the query tree doing fixups then. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi all Here's an updated version of the updatable security barrier views patch that's proposed as the next stage of progressing row-security toward useful and maintainable inclusion in core. If updatable security barrier views are available then the row-security code won't have to play around with remapping vars and attrs during query planning when it injects its subquery late. We'll simply stop the planner flattening an updatable security barrier view, and for row-security we'll dynamically inject one during rewriting when we see a relation that has a row-security attribute. This patch just implements updatable security barrier views. It is a work in progress with at least two known crash bugs and limited testing. I'd greatly appreciate comments (and advice) from those who are interested in the problem domain as we proceed further into work on 9.4. The patch is attached; it's on top of 46328916eefc5f9eaf249518e96f68afcd35923b, current head. It doens't yet touch the documentation, but the only change needed should be to remove the restriction on security_barrier views from the definition of what a "simply updatable" view is. There are couple of known issues: a crash with INSERT ... RETURNING; DEFAULT handling through views isn't implemented yet; and a crash I just found on UPDATE of a view that re-orders the original table columns. As a result it doesn't survive "make check" yet. I'm still working on fixing these issues and on finding more. Suggestions/comments would be appreciated. I'll post specifics of the INSERT ... RETURNING one soon, as I'm increasingly stuck on it. Simple demo: -- The 'id' is 'integer' not 'serial' because of the limitation with -- DEFAULT mentioned below. CREATE TABLE t (id integer primary key, secret text); INSERT INTO t(id, secret) SELECT x, 'secret'||x FROM generate_series(1,100) x; CREATE VIEW t_even AS SELECT id, secret FROM t WHERE id % 2 = 0; CREATE VIEW t_even_sb WITH (security_barrier) AS SELECT id, secret FROM t WHERE id % 2 = 0; CREATE VIEW t_even_check WITH (check_option = 'cascaded') AS SELECT id, secret FROM t WHERE id % 2 = 0; CREATE VIEW t_even_check_sb WITH (check_option = 'cascaded', security_barrier) AS SELECT id, secret FROM t WHERE id % 2 = 0; You'll find that the same f_leak tests used in the original read security barrier views work here, too. -- Subsets of cols work: CREATE VIEW just_id AS SELECT id FROM t; INSERT INTO just_id(id) VALUES (101); CREATE VIEW just_id_sb WITH (security_barrier) AS SELECT id FROM t; INSERT INTO just_id_sb(id) VALUES (101); -- Reversed column-lists work: CREATE VIEW reversed AS SELECT secret, id FROM t; INSERT INTO reversed(id, secret) VALUES (102, 'fred'); CREATE VIEW reversed_sb WITH (security_barrier) AS SELECT secret, id FROM t; INSERT INTO reversed_sb(id, secret) VALUES (102, 'fred'); -- WITH CHECK OPTION is working postgres=# INSERT INTO t_even_check(id, secret) values (296, 'blah'); INSERT 0 1 postgres=# INSERT INTO t_even_check(id, secret) values (297, 'blah'); ERROR: new row violates WITH CHECK OPTION for view "t_even_check" DETAIL: Failing row contains (297, blah). postgres=# INSERT INTO t_even_check_sb(id, secret) values (298, 'blah'); INSERT 0 1 postgres=# INSERT INTO t_even_check_sb(id, secret) values (299, 'blah'); ERROR: new row violates WITH CHECK OPTION for view "t_even_check_sb" DETAIL: Failing row contains (299, blah). -- 3-col views are OK with various permutations CREATE TABLE t3 ( id integer primary key, aa text, bb text ); CREATE VIEW t3_bai AS SELECT bb, aa, id FROM t3; INSERT INTO t3_bai VALUES ('bb','aa',3); UPDATE t3_bai SET bb = 'whatever' WHERE id = 3 RETURNING *; DELETE FROM t3_bai RETURNING *; CREATE VIEW t3_bai_sb WITH (security_barrier) AS SELECT bb, aa, id FROM t3; INSERT INTO t3_bai_sb VALUES ('bb','aa',3); -- This crashes, with or without RETURNING. Bug in re-ord -- UPDATE t3_bai_sb SET bb = 'whatever' WHERE id = 3 RETURNING *; -- This is OK: DELETE FROM t3_bai_sb RETURNING *; -- Known issues: DEFAULT injection doesn't occur correctly through the view. Needs some changes in the rewriter where expand_target_list is called. Haven't investigated in detail yet. Causes inserts through views to fail if there's a default not null constraint, among other issues. Any INSERT with a RETURNING clause through a view causes a crash (simple VALUES clauses) or fails with "no relation entry for relid 1" (INSERT ... SELECT). UPDATE and DELETE is fine. Seems to be doing subquery pull-up, producing a simple result sub-plan that incorrectly has a Var reference but doesn't perform any scan. Issue traced to plan_subquery, but no deeper yet. Privilege enforcement has not yet been through a thorough test matrix. UPDATE of a subset of columns fails. E.g.: CREATE VIEW just_secret AS SELECT secret FROM t; UPDATE just_secret SET secret = 'fred'; Known failing queries. Note that it doesn't matter what you choose from RETURNING, any reference to a result relation will fail; constant expressions succeed. INSERT INTO t_even(id, secret) VALUES (218, 'fred') RETURNING *; -- **crash** INSERT INTO t_even_sb(id, secret) VALUES (218, 'fred') RETURNING *; -- **crash** INSERT INTO t_even_sb SELECT x, 'secret'||x from generate_series(220,225) x RETURNING *; -- ERROR: no relation entry for relid 1 INSERT INTO t_even SELECT x, 'secret'||x from generate_series(226,230) x RETURNING *; -- ERROR: no relation entry for relid 1 -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Hi all I'm finding it necessary to remap the varno and varattno of Var elements in RETURNING lists as part of updatable s.b. view processing. A reference to a view col in RETURNING must be rewritten to point to the new resultRelation at the top level, so that the effects of BEFORE triggers are visible in the returned result. Usually the view will have a different structure to the base table, and in this case it's necessary to change the varattno of the Var to point to the corresponding col on the base rel. So the short version: Given the RTE for a simple view with only one base rel and an attribute number for a col in that view, and assuming that the view col is a simple reference to a col in the underlying base rel, is there any sane way to get the attribute number for the corresponding col on the base rel? For example, given: CREATE TABLE t (a integer, b text, c numeric); CREATE VIEW v1 AS SELECT c, a FROM t; UPDATE v1 SET c = NUMERIC '42' RETURNING a, c; ... the Vars for a and c in the RETURNING clause need to be remapped so their varno is the rti for t once the RTE has been injected, and the varattnos need changing to the corresponding attribute numbers on the base table. So a Var for v1(c), say (1,1), must be remapped to (2,3) i.e. varno 2 (t), varattno 3 (c). I'm aware that this is somewhat like the var/attr twiddling being complained about in the RLS patch. I don't see a way around it, though. I can't push the RETURNING tlist entries down into the expanded view's subquery and add an outer Var reference - they must point directly to the resultRelation so that we see the effects of any BEFORE triggers. I'm looking for advice on how to determine, given an RTI and an attribute number for a simple view, what the attribute number of the col in the view's base relation is. It can be assumed for this purpose that the view attribute is a simple Var (otherwise we'll ERROR out, since we can't update an expression). I'm a bit stumped at this point. I could adapt the ChangeVarNodes walker to handle the actual recursive rewriting and Var changing. It assumes that the varattno is the same on both the old and new varno, as it's intended for when you have an RT index change, but could be taught to optionally remap varattnos. To do that, though, I need a way to actually do that varattno remapping cleanly. Anyone got any ideas? -- Craig Ringer
On 12/24/2013 02:35 PM, Craig Ringer wrote: > So the short version: Given the RTE for a simple view with only one base > rel and an attribute number for a col in that view, and assuming that > the view col is a simple reference to a col in the underlying base rel, > is there any sane way to get the attribute number for the corresponding > col on the base rel? So, of course, I find it as soon as I post. map_variable_attnos(...), also in src/backend/rewrite/rewriteManip.c . Just generate the mapping table and go. Sorry for the noise folks. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Dec 24, 2013 at 11:47 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
On 12/24/2013 02:35 PM, Craig Ringer wrote:So, of course, I find it as soon as I post.
> So the short version: Given the RTE for a simple view with only one base
> rel and an attribute number for a col in that view, and assuming that
> the view col is a simple reference to a col in the underlying base rel,
> is there any sane way to get the attribute number for the corresponding
> col on the base rel?
map_variable_attnos(...), also in src/backend/rewrite/rewriteManip.c .
Just generate the mapping table and go.
Could you please explain a little bit more how would you solve the posed problem using map_variable_attnos?
I was recently working on a similar problem and used the following algo to solve it.
I had to find to which column of the base table does a column in the select statement of the view query belong.
To relate a target list entry in the select query of the view to an actual column in base table here is what I did
First determine whether the var's varno refer to an RTE which is a view?
If yes then get the view query (RangeTblEntry::subquery) and see which element in the view query's target list does this var's varattno point to.
Get the varno of that target list entry. Look for that RTE in the view's query and see if that RTE is a real table then use that var making sure its varno now points to the actual table.
Thanks in advance.
Sorry for the noise folks.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Abbas
Architect
Skype ID: gabbasb
www.enterprisedb.comFollow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
On 12/24/2013 03:21 PM, Abbas Butt wrote: > Could you please explain a little bit more how would you solve the posed > problem using map_variable_attnos? > > I was recently working on a similar problem and used the following algo > to solve it. > > I had to find to which column of the base table does a column in > the select statement of the view query belong. > To relate a target list entry in the select query of the view to > an actual column in base table. Sounds similar. My problem is simplified by the constraint that the view must be a simple view (only one base relation) and must only contain simple column-references to single the base relation. No expressions, no joins. (These are the rules for simply updatable views). I'm new to the planner and rewriter, so don't take what I do as any kind of example beyond "this seems to work". I generate a varattno mapping as follows: /** Scan the passed view target list, whose members must consist solely* of Var nodes with a varno equal to the passed targetvarno.**A mapping is built from the resno (i.e. tlist index) of the view* tlist to the corresponding attribute numberof the base relation* the varattno points to.** Must not be called with a targetlist containing non-Var entries.*/ static void gen_view_base_attr_map(List *viewtlist, AttrNumber * attnomap, int targetvarno) { ListCell *lc; TargetEntry *te; Var *tev; int l_viewtlist = list_length(viewtlist); foreach(lc, viewtlist) { te = (TargetEntry*) lfirst(lc); /* Could relax this in future and map only thevar entries, * ignoring everything else, but currently pointless since we * are only interested in simpleviews. */ Assert(IsA(te->expr, Var)); tev = (Var*) te->expr; Assert(tev->varno == targetvarno); Assert(te->resno - 1 < l_viewtlist); attnomap[te->resno - 1] = tev->varattno; } } producing a forward mapping of view attno to base relation attno. I then apply the varattno remapping, in this case to the returning list of a DML query acting on a view, with something like: varattno_map = palloc( list_length(viewquery->targetList) * sizeof(AttrNumber) ); gen_view_base_attr_map(viewquery->targetList, varattno_map, rtr->rtindex); parsetree->returningList = map_variable_attnos( (Node*) parsetree->returningList, old_result_rt_index,0, varattno_map, list_length(viewquery->targetList), &found_whole_row_var ); if (found_whole_row_var) { /* TODO: Extend map_variable_attnos API to pass a mutator to handle whole-rowvars. */ elog(ERROR, "RETURNING list contains a whole-row variable, " "which is not currentlysupported for updatable " "views"); } ChangeVarNodes((Node*) parsetree->returningList, old_result_rt_index, new_result_rt_index, 0); I'd prefer to be doing the map_variable_attnos and ChangeVarNodes work in a single pass, but it looks like a clumsy and verbose process to write a new walker. So I'm going to leave it as an "opportunity for future optimisation" for now ;-) (As it happens that "found_whole_row_var" is a real pain; I'm going to have to deal with a TODO item in map_variable_attnos to provide a callback that replaces a whole-row Var with an expansion of it into a row-expression). -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/24/2013 03:21 PM, Abbas Butt wrote: > Could you please explain a little bit more how would you solve the posed > problem using map_variable_attnos? It actually turns out to be even simpler, and easy to do in one pass, when using ReplaceVarsFromTargetList . You just generate a temporary new list of TargetEntry, with resnos matching the attribute numbers of the view. Each contained Var points to the remapped varno and varattno. Then you let ReplaceVarsFromTargetList substitute Vars recursively through the expression tree with ones in the replacement tlist you supply. The more I've thought about it, the shorter this code has got. Currently: /** Scan the passed view target list, whose members must consist solely* of Var nodes with a varno equal to the passed targetvarno,and* produce a targetlist of Var nodes with the corresponding varno and* varattno of the base relation 'targetvarno'.**This tlist is used when remapping Vars that point to a view so they* point to the base relation of the viewinstead. It is entirely* newly allocated. The result tlist is not in resno order.** Must not be called with a targetlistcontaining non-Var entries.*/ static List * gen_view_base_attr_map(List *viewtlist, int targetvarno, int newvarno) { ListCell *lc; TargetEntry *te, *newte; Var *tev, *newvar; int l_viewtlist = list_length(viewtlist); List *newtlist = NIL; foreach(lc, viewtlist) { te = (TargetEntry*) lfirst(lc); /* Could relax this in future and map only thevar entries, * ignoring everything else, but currently pointless since we * are only interested in simpleviews. */ Assert(IsA(te->expr, Var)); tev = (Var*) te->expr; Assert(tev->varno == targetvarno); Assert(te->resno - 1 < l_viewtlist); /* Construct the new Var with the remapped attno and varno */ newvar = (Var*) palloc(sizeof(Var)); *newvar= *tev; newvar->varno = newvarno; newvar->varattno = tev->varattno; /* and wrap it in a new tle to cons to the list */ newte = flatCopyTargetEntry(te); newte->expr = (Expr*)newvar; newtlist = lcons(newte, newtlist); } return newtlist; } and invocation: /* * We need to adjust any RETURNING clause entries to point to the new * target RTE instead of the old one so thatwe see the effects of * BEFORE triggers. Varattnos must be remapped so that the new Var * points to the correctcol of the base rel, since the view will * usually have a different set of columns / column order. * * Aspart of this pass any whole-row references to the view are * expanded into ROW(...) expressions to ensure we don't expose * columns that are not visible through the view, and to make sure * the client gets the result type it expected. */ List * remap_tlist = gen_view_base_attr_map( viewquery->targetList, rtr->rtindex, new_result_rt_index); parsetree->returningList = ReplaceVarsFromTargetList( (Node*) parsetree->returningList, old_result_rt_index,0 /*sublevels_up*/, view_rte, remap_tlist, REPLACEVARS_REPORT_ERROR, 0/*nomatch_varno, unused */, NULL /* outer_hasSubLinks, unused */ ); -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 24 December 2013 12:12, Craig Ringer <craig@2ndquadrant.com> wrote: > On 12/24/2013 03:21 PM, Abbas Butt wrote: > >> Could you please explain a little bit more how would you solve the posed >> problem using map_variable_attnos? > > It actually turns out to be even simpler, and easy to do in one pass, > when using ReplaceVarsFromTargetList . > > You just generate a temporary new list of TargetEntry, with resnos > matching the attribute numbers of the view. Each contained Var points to > the remapped varno and varattno. > > Then you let ReplaceVarsFromTargetList substitute Vars recursively > through the expression tree with ones in the replacement tlist you supply. > > The more I've thought about it, the shorter this code has got. Currently: > > > > > /* > * Scan the passed view target list, whose members must consist solely > * of Var nodes with a varno equal to the passed targetvarno, and > * produce a targetlist of Var nodes with the corresponding varno and > * varattno of the base relation 'targetvarno'. > * > * This tlist is used when remapping Vars that point to a view so they > * point to the base relation of the view instead. It is entirely > * newly allocated. The result tlist is not in resno order. > * > * Must not be called with a targetlist containing non-Var entries. > */ > static List * > gen_view_base_attr_map(List *viewtlist, int targetvarno, int newvarno) > { > ListCell *lc; > TargetEntry *te, *newte; > Var *tev, *newvar; > int l_viewtlist = list_length(viewtlist); > List *newtlist = NIL; > > foreach(lc, viewtlist) > { > te = (TargetEntry*) lfirst(lc); > /* Could relax this in future and map only the var entries, > * ignoring everything else, but currently pointless since we > * are only interested in simple views. */ > Assert(IsA(te->expr, Var)); > tev = (Var*) te->expr; > Assert(tev->varno == targetvarno); > Assert(te->resno - 1 < l_viewtlist); > > /* Construct the new Var with the remapped attno and varno */ > newvar = (Var*) palloc(sizeof(Var)); > *newvar = *tev; > newvar->varno = newvarno; > newvar->varattno = tev->varattno; > > /* and wrap it in a new tle to cons to the list */ > newte = flatCopyTargetEntry(te); > newte->expr = (Expr*) newvar; > newtlist = lcons(newte, newtlist); > } > > return newtlist; > } > > I don't think this bit is quite right. It's not correct to assume that all the view columns are simple references to columns of the base relation --- auto-updatable views may now contain a mix of updatable and non-updatable columns, so some of the view columns may be arbitrary expressions. There is already code in rewriteTargetView() that does something very similar (to the whole parsetree, rather than just the returning list) with 2 function calls: /* * Make a copy of the view's targetlist, adjusting its Vars to reference * the new target RTE, ie make their varnosbe new_rt_index instead of * base_rt_index. There can be no Vars for other rels in the tlist, so * this is sufficientto pull up the tlist expressions for use in the * outer query. The tlist will provide the replacement expressionsused * by ReplaceVarsFromTargetList below. */ view_targetlist = copyObject(viewquery->targetList); ChangeVarNodes((Node *) view_targetlist, base_rt_index, new_rt_index, 0); and then later: /* * Now update all Vars in the outer query that reference the view to * reference the appropriate column of thebase relation instead. */ parsetree = (Query *) ReplaceVarsFromTargetList((Node *) parsetree, parsetree->resultRelation, 0, view_rte, view_targetlist, REPLACEVARS_REPORT_ERROR, 0, &parsetree->hasSubLinks); Regards, Dean > > > and invocation: > > > /* > * We need to adjust any RETURNING clause entries to point to the new > * target RTE instead of the old one so that we see the effects of > * BEFORE triggers. Varattnos must be remapped so that the new Var > * points to the correct col of the base rel, since the view will > * usually have a different set of columns / column order. > * > * As part of this pass any whole-row references to the view are > * expanded into ROW(...) expressions to ensure we don't expose > * columns that are not visible through the view, and to make sure > * the client gets the result type it expected. > */ > > List * remap_tlist = gen_view_base_attr_map( > viewquery->targetList, rtr->rtindex, new_result_rt_index); > > parsetree->returningList = ReplaceVarsFromTargetList( > (Node*) parsetree->returningList, > old_result_rt_index, 0 /*sublevels_up*/, > view_rte, > remap_tlist, > REPLACEVARS_REPORT_ERROR, 0 /*nomatch_varno, unused */, > NULL /* outer_hasSubLinks, unused */ > ); > > > > > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
On 12/24/2013 11:17 PM, Dean Rasheed wrote: > I don't think this bit is quite right. > > It's not correct to assume that all the view columns are simple > references to columns of the base relation --- auto-updatable views > may now contain a mix of updatable and non-updatable columns, so some > of the view columns may be arbitrary expressions. Ah - it looks like I'd checked against 9.3 and missed the relaxation of those requirements. > There is already code in rewriteTargetView() that does something very > similar (to the whole parsetree, rather than just the returning list) > with 2 function calls: Copying the view tlist and then adjusting it is a much smarter way to do it. I should've seen that the pull-up code could be adapted to deal with the RETURNING list, so thankyou. It's a cleaner way to do it. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 13 December 2013 13:52, Craig Ringer <craig@2ndquadrant.com> wrote: > Hi all > > Here's an updated version of the updatable security barrier views patch > that's proposed as the next stage of progressing row-security toward > useful and maintainable inclusion in core. > > If updatable security barrier views are available then the row-security > code won't have to play around with remapping vars and attrs during > query planning when it injects its subquery late. We'll simply stop the > planner flattening an updatable security barrier view, and for > row-security we'll dynamically inject one during rewriting when we see a > relation that has a row-security attribute. Hi, I've been thinking about this some more and I don't think you can avoid the need to remap vars and attrs. AIUI, your modified version of rewriteTargetView() will turn an UPDATE query with a rangetable of the form rtable: 1: relation RTE (view) <- resultRelation fromList: [1] into one of the form rtable: 1: relation RTE (view) 2: relation RTE (base table) <- resultRelation fromList: [1] which will then get transformed later in the rewriter by fireRIRrules() into rtable: 1: subquery RTE (expanded view) 2: relation RTE (base table) <- resultRelation fromList: [1] The problem is that the subquery RTE will only expose the columns of the view, which doesn't necessarily include all the columns from the base table. These are required for UPDATE, for example: create table t(a int, b int); insert into t values(1, 2); create view v with (security_barrier=true) as select a from t; update v set a=a+1; ERROR: invalid attribute number 2 The columns actually output from the subquery are: explain (verbose, costs off) update v set a=a+1; QUERY PLAN -------------------------------------------------- Update on public.t -> Subquery Scan on v Output: (v.a + 1), t.b, v.*, t.ctid, v.* -> Seq Scan on public.t t_1 Output: t_1.a which is clearly broken. So more code will be needed to add the right set of columns to that subquery RTE, and that's where it will need to mess with the mapping between columns of the view and columns of the underlying base relation. > [snip] Needs some > changes in the rewriter where expand_target_list is called. It doesn't look right to be calling expand_target_list() from the rewriter. It looks like you are calling it before the rangetable mangling, so all it will do is add targetlist entries for columns of the view, not for columns of the base relation as the preceding comment suggests. I think that explains the EXPLAIN output above. A related problem is that the base table may be the parent of an inheritance set, in which case attempting to add all the required columns here in the rewriter is never going to work, because the inheritance set hasn't been expanded yet, and so the columns of child tables will be missed. Normally expand_target_list() is only called from the planner, after expand_inherited_tables(), at which point it's working with a subplan with the appropriate parent/child relation, and so it sees the correct set of columns. The more I think about this, the more I think that the approach of my original patch was neater. The idea was to have 2 new pieces of code: 1). In rewriteTargetView() decorate the target RTE with any security barrier quals (in the new rte->securityQuals field), instead of merging them with the main query's quals. So the output of this stage of rewriting would be something like rtable: 1: relation RTE (base table) <- resultRelation - securityQuals = [view quals] fromList: [1] 2). After all rewriting is complete, scan the query and turn all relation RTEs with non-empty securityQuals into subquery RTEs (making a copy of the original RTE in the case of the result relation). A future RLS patch would then be able to make use of this simply by adding its own securityQuals to the relevant RTEs and letting (2) expand them. The problem with my earlier patch was that it was calling the subquery expansion code (2) in the final stage of the rewriter. In light of the above, that really needs to happen after expand_inherited_tables() so that it can see the correct parent/child base relation. Another ugly feature of my earlier patch was the changes it made to expand_target_list() and the need to track the query's sourceRelation. Both of those things can be avoided simply by moving the subquery expansion code (2) to after expand_target_list(), and hence also after expand_inherited_tables(). Attached is an updated version of my earlier patch with the subquery expansion code (2) moved into the planner, in a new file optimizer/prep/prepsecurity.c (it didn't seem to obviously belong in any of the existing files) and invoked after calling preprocess_targetlist(). It turns out that this also allows that new code to be tidied up a bit, and it is easy for it to work out which attributes are actually required and only include the minimum necessary set of attributes in the subquery. Also, since this is now all happening after inheritance expansion, the subplan's subquery is built with just the relevant parent/child relation, rather than the complete hierarchy. For example: create table parent_tbl(a int); insert into parent_tbl select * from generate_series(1,1000); create table child_tbl(b int) inherits (parent_tbl); insert into child_tbl select i,i*10 from generate_series(1001,2000) g(i); create index child_tbl_idx on child_tbl(a); create view v with (security_barrier=true) as select * from parent_tbl where a > 0; explain (verbose, costs off) update v set a=a+1 where a=1500; QUERY PLAN ------------------------------------------------------------------------------ Update on public.parent_tbl parent_tbl_2 -> Subquery Scan on parent_tbl Output: (parent_tbl.a + 1), parent_tbl.ctid -> Seq Scan on public.parent_tbl parent_tbl_3 Output: parent_tbl_3.a, parent_tbl_3.ctid Filter: ((parent_tbl_3.a > 0) AND (parent_tbl_3.a = 1500)) -> Subquery Scan on parent_tbl_1 Output: (parent_tbl_1.a + 1), parent_tbl_1.b, parent_tbl_1.ctid -> Bitmap Heap Scan on public.child_tbl Output: child_tbl.a, child_tbl.ctid, child_tbl.b Recheck Cond: ((child_tbl.a > 0) AND (child_tbl.a = 1500)) -> Bitmap Index Scan on child_tbl_idx Index Cond: ((child_tbl.a > 0) AND (child_tbl.a = 1500)) where the second subquery is for updating child_tbl, and has a different set of columns (and a different access path in this case). OTOH, your patch generates the following plan for this query: Update on public.parent_tbl -> Subquery Scan on v Output: (v.a + 1), v.*, parent_tbl.ctid, v.* -> Append -> Seq Scan on public.parent_tbl parent_tbl_1 Output: parent_tbl_1.a Filter: ((parent_tbl_1.a > 0) AND (parent_tbl_1.a = 1500)) -> Bitmap Heap Scan on public.child_tbl Output: child_tbl.a Recheck Cond: ((child_tbl.a > 0) AND (child_tbl.a = 1500)) -> Bitmap Index Scan on child_tbl_idx Index Cond: ((child_tbl.a > 0) AND (child_tbl.a = 1500)) -> Subquery Scan on v_1 Output: (v_1.a + 1), b, v_1.*, ctid, v_1.* -> Append -> Seq Scan on public.parent_tbl parent_tbl_2 Output: parent_tbl_2.a Filter: ((parent_tbl_2.a > 0) AND (parent_tbl_2.a = 1500)) -> Bitmap Heap Scan on public.child_tbl child_tbl_1 Output: child_tbl_1.a Recheck Cond: ((child_tbl_1.a > 0) AND (child_tbl_1.a = 1500)) -> Bitmap Index Scan on child_tbl_idx Index Cond: ((child_tbl_1.a > 0) AND (child_tbl_1.a = 1500)) which is the wrong set of rows (and columns) in each subquery and it crashes when it is run. There is still a lot more testing to be done with my patch, so there may well be unforeseen problems, but it feels like a cleaner, more straightforward approach. Thoughts? Regards, Dean
Attachment
> I've been thinking about this some more and I don't think you can > avoid the need to remap vars and attrs. I agree. I was hoping to let expand_targetlist / preprocess_targetlist do the job, but I'm no longer convinced that'll do the job without mangling them in the process. > AIUI, your modified version of rewriteTargetView() will turn an UPDATE > query with a rangetable of the form The patch I posted is clearly incorrect in several ways. Unfortunately I'm still coming up to speed with the rewriter / planner / executor at the same time as trying to make major modifications to them. This tends to produce some false starts. (It no longer attempts to use expand_targetlist in the rewriter, btw, that's long gone). Since the posted patch I've sorted out RETURNING list rewriting and a few other issues. There are some major issues remaining with the approach though: - If we have nested views, we need to pass the tuple ctid up the chain of expanded view subqueries so ExecModifyTable canconsume it. This is proving to be a lot harder than I'd hoped. - expand_targetlist / preprocess_targetlist operate on the resultRelation. With updatable s.b. views, the resultRelationis no longer the relation from which tuples are being read for input into ExecModifyTable. - In subqueries, resultRelation is only set for the outermost layer. So preprocess_targetlist doesn't usefully add tlistentries for the inner subquery layers at all. - It is necessary to expand DEFAULTs into expression tlist entries in the innermost query layer so that Vars added furtherup in the subquery chain can refer to them. In the current experimental patch DEFAULTs aren't populated correctly. So we have the following needs: - passing ctids up through layers of subqueries - target-list expansion through layers of subqueries - Differentiating between resultRelation to heapmodifytuple and the source of the tuples to feed into heapmodifytuple now that these are no longer the same thing. I was originally hoping all this could be dealt with in the rewriter, by changing resultRelation to point to the next-inner-most RTE whenever a target view is expanded. This turns out to be too simplistic: - There is no ctid col on a view, so if we have v2 over v1 over table t, when we expand v2 there's no way to create a tlist entry to point to v1's ctid. It won't have one until we've expanded v1 into t in the next pass. The same issue applies to expanding the tlist to carry cols of "t" up the subquery chain in the rewrite phase. - Rowmarks are added to point to resultrelation after rewrite, and these then add tlist entries in preprocess_targetlist. These TLEs will point to the base resultRelation, which isn't correct when we're updating nested subqueries. So we just can't do this during recursive rewrite, because the required information is only available once the target view is fully expanded into nested subqueries. It seems that tlist fixup and injection of the ctid up the subquery chain must be done after all recursive rewriting. To do these fixups later, we need to keep track of which nested series of subqueries is the "target", i.e. produces tuples including resjunk ctid cols to feed into ExecModifyTuple. Currently this information is "resultRelation" > The more I think about this, the more I think that the approach of my > original patch was neater. The idea was to have 2 new pieces of code: > > 1). In rewriteTargetView() decorate the target RTE with any security > barrier quals (in the new rte->securityQuals field), instead of > merging them with the main query's quals. So the output of this > stage of rewriting would be something like > > rtable: > 1: relation RTE (base table) <- resultRelation > - securityQuals = [view quals] > fromList: [1] So you're proposing to still flatten views during rewrite, as the current code does, but just keep track of s.b. quals separately? If so, what about multiple S.B. views may be nested and their quals must be isolated from each other, not just from the outer query? That's why I see subqueries as such a useful model. > 2). After all rewriting is complete, scan the query and turn all > relation RTEs with non-empty securityQuals into subquery RTEs > (making a copy of the original RTE in the case of the result > relation). I'm not sure I understand how this would work in the face of multiple levels of nesting s.b. views. Are you thinking of doing recursive expansion? > Another ugly > feature of my earlier patch was the changes it made to > expand_target_list() and the need to track the query's sourceRelation. I've been fighting the need to add the concept of a "sourceRelation" for this purpose too. > Both of those things can be avoided simply by moving the subquery > expansion code (2) to after expand_target_list(), and hence also after > expand_inherited_tables(). That's certainly interesting. I'm reading over the patch now. > There is still a lot more testing to be done with my patch, so there > may well be unforeseen problems, but it feels like a cleaner, more > straightforward approach. > > Thoughts? I'm reading it now, but in general, how do you see it solving the issue of getting the ctid (and any table attrs not present in the views) up two or more layers of nested view? And default expansion? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/08/2014 02:00 PM, Craig Ringer wrote: > I'm not sure I understand how this would work in the face of multiple > levels of nesting s.b. views. Are you thinking of doing recursive expansion? Never mind, that part is clearly covered in the patch. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Dean, Short version ------------- Looks amazing overall. Very clever to zip up the s.b. quals, let the rest of the rewriter and planer do their work normally, then unpack them into subqueries inserted in the planner once inheritance appendrels are expanded, etc. My main concern is that the securityQuals appear to bypass all later rewrite stages, inheritance expansion during planning, etc. I suspect this might be hard to get around (because these are disembodied quals which may have nonsense varnos), but I'm looking into it now. There's also an assertion failure whenever a correlated subquery appears as a security barrier view qual. Again, looking at it. Ideas on that issue? Much longer version: My understanding of how it works ----------------------------------------------------- My understanding from reading the patch is that this: - Flattens target views in rewriteTargetView, as in current master. If the target view is a security barrier view, the view quals are appended to a list of security barrier quals on the new RTE, instead of appended to the RTE's normal quals like for normal views. After rewrite the views are fully flattened down to a RTE_RELATION, which becomes the resultRelation. An unreferenced RTE for each view that's been rewritten is preserved in the range-table for permissions checking purposes only (same as current master). - Inheritance expansion, tlist expansion, etc then occurrs as normal. - In planning, in inheritance_planner, if any RTE has any stashed security quals in its RangeTableEntry, expand_security_qual is invoked. This iteratively wraps the base relation in a subquery with the saved security barrier quals, creating nested subqueries around the original RTE. At each pass resultRelation is changed to point to the new outer-most subquery. As a result of this approach everything looks normal to preprocess_targetlist, row-marking, etc, because they're seeing a normal RTE_RELATION as resultRelation. The security barrier quals are, at this stage, stashed aside. If there's inheritance involved, RTEs copied during appendrel expansion get copies of the security quals on in the parent RTE. Problem with inheritance, views, etc in s.b. quals -------------------------------------------------- After inheritance expansion, tlist expansion, etc, the s.b. quals are unpacked to create subqueries wrapping the original RTEs. So, with: CREATE TABLE t1 (x float, b integer, secret1 text, secret2 text); CREATE TABLE t1child (z integer) INHERITS (t1); INSERT INTO t1 (x, b, secret1, secret2) VALUES (0,0,'secret0', 'supersecret'), (1,1,'secret1', 'supersecret'), (2,2,'secret2', 'supersecret'), (3,3,'secret3', 'supersecret'), (4,4,'secret4', 'supersecret'), (5,6,'secret5', 'supersecret'); INSERT INTO t1child (x, b, secret1, secret2, z) VALUES (8,8,'secret8', 'ss', 8), (9,9,'secret8', 'ss', 9), (10,10,'secret8', 'ss', 10); CREATE VIEW v1 WITH (security_barrier) AS SELECT b AS b1, x AS x1, secret1 FROM t1 WHERE b % 2 = 0; CREATE VIEW v2 WITH (security_barrier) AS SELECT b1 AS b2, x1 AS x2 FROM v1 WHERE b1 % 4 = 0; then a statement like: UPDATE v2 SET x2 = x2 + 32; will be rewritten into something like (imaginary sql) UPDATE t1 WITH SECURITY QUALS ((b % 2 == 0), (b % 4 == 0)) SET x = x + 32 inheritance-expanded and tlist-expanded into something like (imaginary SQL) UPDATE(t1 WITH SECURITY QUALS ((b % 2 == 0), (b % 4 == 0)))UNION ALL(t1child WITH SECURITY QUALS ((b % 2 == 0), (b % 4 ==0))) SET x = x + 32; after which security qual expansion occurs, giving us something like: UPDATEt1, t1child <--- resultRelations( SELECT v2.ctid, v2.* FROM ( SELECT v1.ctid, v1.* FROM ( SELECTt1.ctid, t1.* FROM t1 WHERE b % 2 == 0 ) v1 WHERE b % 4 == 0 ) v2 UNION ALL SELECT v2.ctid, v2.* FROM ( SELECT v1.ctid, v1.* FROM ( SELECT t1child.ctid, t1child.* FROM t1child WHERE b % 2 == 0 ) v1 WHERE b % 4 == 0 ) v2 ) SET x = x + 32; Giving a plan looking like: EXPLAIN UPDATE v2 SET x2 = 32 QUERY PLAN ---------------------------------------------------------------------------Update on t1 t1_2 (cost=0.00..23.35 rows=2 width=76) -> Subquery Scan on t1 (cost=0.00..2.18 rows=1 width=74) -> Subquery Scan on t1_3 (cost=0.00..2.17 rows=1width=74) Filter: ((t1_3.b % 4) = 0) -> Seq Scan on t1 t1_4 (cost=0.00..2.16 rows=1 width=74) Filter: ((b % 2) = 0) -> Subquery Scan on t1_1 (cost=0.00..21.17 rows=1 width=78) -> Subquery Scan on t1_5 (cost=0.00..21.16 rows=1 width=78) Filter: ((t1_5.b % 4) = 0) -> SeqScan on t1child (cost=0.00..21.10 rows=4 width=78) Filter: ((b % 2) = 0) (11 rows) So far this looks like a really clever approach. My only real concern is that the security quals are currently hidden from rewrite and parsing before during the period they're hidden away in the security quals RTI field. This means they aren't processed for things like inheritance expansion. e.g. CREATE TABLE rowfilter (remainder integer, userid text); CREATE TABLE rowfilterchild () INHERITS (rowfilter); INSERT INTO rowfilterchild(remainder, userid) values (0, current_user); a view with a security qual that refers to an inherited relation won't work: CREATE VIEW sqv WITH (security_barrier) AS SELECT x, b FROM t1 WHERE ( SELECT b % 4 = remainder FROM rowfilter WHERE userid = current_user OFFSET 0 ); This is a bit contrived to force the optimiser to treat the subquery as correlated and thus make sure the ref to rowfilter gets into the securityQuals list. I expected zero results (a scan of rowfilter, but not rowfilterchild, resulting in a null subquery return) but land up with an assertion failure instead. The assertion triggers for any security qual containing a correlated subquery, so it's crashing out before we can break due to failure to expand inheritance. This isn't just about inheritance. In general, we'd need a way to process those securityQuals through any rewrite phases and through the parts of planning before they get merged back in, so they're subject to things like inheritance appendrel expansion. Same if the security qual contains a view ref: CREATE VIEW dumbview(zero) AS SELECT 0; CREATE VIEW sqv2 WITH (security_barrier) AS SELECT x, b FROM t1 WHERE (SELECT b % 2 = zero FROM dumbview OFFSET 0); -- Craig Ringer (Phew!)
On 8 January 2014 09:02, Craig Ringer <craig@2ndquadrant.com> wrote: > Dean, > > Short version > ------------- > > Looks amazing overall. Very clever to zip up the s.b. quals, let the > rest of the rewriter and planer do their work normally, then unpack them > into subqueries inserted in the planner once inheritance appendrels are > expanded, etc. > > My main concern is that the securityQuals appear to bypass all later > rewrite stages, inheritance expansion during planning, etc. I suspect > this might be hard to get around (because these are disembodied quals > which may have nonsense varnos), but I'm looking into it now. > > There's also an assertion failure whenever a correlated subquery appears > as a security barrier view qual. Again, looking at it. > > Ideas on that issue? > > > > Much longer version: My understanding of how it works > ----------------------------------------------------- > > My understanding from reading the patch is that this: > > - Flattens target views in rewriteTargetView, as in current master. If > the target view is a security barrier view, the view quals are appended > to a list of security barrier quals on the new RTE, instead of appended > to the RTE's normal quals like for normal views. > > After rewrite the views are fully flattened down to a RTE_RELATION, > which becomes the resultRelation. An unreferenced RTE for each view > that's been rewritten is preserved in the range-table for permissions > checking purposes only (same as current master). > > - Inheritance expansion, tlist expansion, etc then occurrs as normal. > > - In planning, in inheritance_planner, if any RTE has any stashed > security quals in its RangeTableEntry, expand_security_qual is invoked. > This iteratively wraps the base relation in a subquery with the saved > security barrier quals, creating nested subqueries around the original > RTE. At each pass resultRelation is changed to point to the new > outer-most subquery. > > > As a result of this approach everything looks normal to > preprocess_targetlist, row-marking, etc, because they're seeing a normal > RTE_RELATION as resultRelation. The security barrier quals are, at this > stage, stashed aside. If there's inheritance involved, RTEs copied > during appendrel expansion get copies of the security quals on in the > parent RTE. > > Problem with inheritance, views, etc in s.b. quals > -------------------------------------------------- > > After inheritance expansion, tlist expansion, etc, the s.b. quals are > unpacked to create subqueries wrapping the original RTEs. > > > So, with: > > CREATE TABLE t1 (x float, b integer, secret1 text, secret2 text); > CREATE TABLE t1child (z integer) INHERITS (t1); > > INSERT INTO t1 (x, b, secret1, secret2) > VALUES > (0,0,'secret0', 'supersecret'), > (1,1,'secret1', 'supersecret'), > (2,2,'secret2', 'supersecret'), > (3,3,'secret3', 'supersecret'), > (4,4,'secret4', 'supersecret'), > (5,6,'secret5', 'supersecret'); > > INSERT INTO t1child (x, b, secret1, secret2, z) > VALUES > (8,8,'secret8', 'ss', 8), > (9,9,'secret8', 'ss', 9), > (10,10,'secret8', 'ss', 10); > > CREATE VIEW v1 > WITH (security_barrier) > AS > SELECT b AS b1, x AS x1, secret1 > FROM t1 WHERE b % 2 = 0; > > CREATE VIEW v2 > WITH (security_barrier) > AS > SELECT b1 AS b2, x1 AS x2 > FROM v1 WHERE b1 % 4 = 0; > > > > then a statement like: > > UPDATE v2 > SET x2 = x2 + 32; > > will be rewritten into something like (imaginary sql) > > UPDATE t1 WITH SECURITY QUALS ((b % 2 == 0), (b % 4 == 0)) > SET x = x + 32 > > inheritance-expanded and tlist-expanded into something like (imaginary SQL) > > > UPDATE > (t1 WITH SECURITY QUALS ((b % 2 == 0), (b % 4 == 0))) > UNION ALL > (t1child WITH SECURITY QUALS ((b % 2 == 0), (b % 4 == 0))) > SET x = x + 32; > > > after which security qual expansion occurs, giving us something like: > > > UPDATE > t1, t1child <--- resultRelations > ( > SELECT v2.ctid, v2.* > FROM ( > SELECT v1.ctid, v1.* > FROM ( > SELECT t1.ctid, t1.* > FROM t1 > WHERE b % 2 == 0 > ) v1 > WHERE b % 4 == 0 > ) v2 > > UNION ALL > > SELECT v2.ctid, v2.* > FROM ( > SELECT v1.ctid, v1.* > FROM ( > SELECT t1child.ctid, t1child.* > FROM t1child > WHERE b % 2 == 0 > ) v1 > WHERE b % 4 == 0 > ) v2 > > ) > SET x = x + 32; > > > Giving a plan looking like: > > EXPLAIN UPDATE v2 SET x2 = 32 > > > QUERY PLAN > --------------------------------------------------------------------------- > Update on t1 t1_2 (cost=0.00..23.35 rows=2 width=76) > -> Subquery Scan on t1 (cost=0.00..2.18 rows=1 width=74) > -> Subquery Scan on t1_3 (cost=0.00..2.17 rows=1 width=74) > Filter: ((t1_3.b % 4) = 0) > -> Seq Scan on t1 t1_4 (cost=0.00..2.16 rows=1 width=74) > Filter: ((b % 2) = 0) > -> Subquery Scan on t1_1 (cost=0.00..21.17 rows=1 width=78) > -> Subquery Scan on t1_5 (cost=0.00..21.16 rows=1 width=78) > Filter: ((t1_5.b % 4) = 0) > -> Seq Scan on t1child (cost=0.00..21.10 rows=4 width=78) > Filter: ((b % 2) = 0) > (11 rows) > > > > > So far this looks like a really clever approach. My only real concern is > that the security quals are currently hidden from rewrite and parsing > before during the period they're hidden away in the security quals RTI > field. > > This means they aren't processed for things like inheritance expansion. e.g. > > CREATE TABLE rowfilter (remainder integer, userid text); > CREATE TABLE rowfilterchild () INHERITS (rowfilter); > INSERT INTO rowfilterchild(remainder, userid) values (0, current_user); > > a view with a security qual that refers to an inherited relation won't work: > > CREATE VIEW sqv > WITH (security_barrier) > AS > SELECT x, b FROM t1 WHERE ( > SELECT b % 4 = remainder > FROM rowfilter > WHERE userid = current_user > OFFSET 0 > ); > > This is a bit contrived to force the optimiser to treat the subquery as > correlated and thus make sure the ref to rowfilter gets into the > securityQuals list. > > I expected zero results (a scan of rowfilter, but not rowfilterchild, > resulting in a null subquery return) but land up with an assertion > failure instead. The assertion triggers for any security qual containing > a correlated subquery, so it's crashing out before we can break due to > failure to expand inheritance. > > > This isn't just about inheritance. In general, we'd need a way to > process those securityQuals through any rewrite phases and through the > parts of planning before they get merged back in, so they're subject to > things like inheritance appendrel expansion. > > Same if the security qual contains a view ref: > > CREATE VIEW dumbview(zero) > AS SELECT 0; > > CREATE VIEW sqv2 > WITH (security_barrier) > AS > SELECT x, b FROM t1 > WHERE (SELECT b % 2 = zero FROM dumbview OFFSET 0); > Thanks for testing, and good catch on the sublinks. There was a trivial bug in my new code in rewriteTargetView() --- it needs to check the added security barrier qual for sublinks and mark the parent query accordingly. After that the rewriter will descend into the sublinks on the security barrier quals expanding any views they contain, so the fix for that part is trivial (see the attached update). The assertion failure with inheritance and sublinks is a separate issue --- adjust_appendrel_attrs() is not expecting to find any unplanned sublinks in the query tree when it is invoked, since they would normally have all been planned by that point. However, the addition of the new security barrier subqueries after inheritance expansion can now insert new sublinks which need to be planned. I'll look into how best to make that happen. Regards, Dean
Attachment
On 8 January 2014 10:19, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > The assertion failure with inheritance and sublinks is a separate > issue --- adjust_appendrel_attrs() is not expecting to find any > unplanned sublinks in the query tree when it is invoked, since they > would normally have all been planned by that point. However, the > addition of the new security barrier subqueries after inheritance > expansion can now insert new sublinks which need to be planned. I'll > look into how best to make that happen. > Actually that wasn't quite it. The problem was that an RTE in the top-level query had a security barrier qual with a sublink in it, and it wasn't preprocessing those quals, so that sublink was still unplanned by the time it got to adjust_appendrel_attrs(), which then complained. My first thought was that it should just preprocess any security barrier quals in subquery_planner() in the same way as other quals are preprocessed. But thinking about it further, those quals are destined to become the quals of subqueries in the range table, so we don't actually want to preprocess them at that stage --- that will happen later when the new subquery is planned by recursion back into subquery_planner(). So I think the right answer is to make adjust_appendrel_attrs() handle recursion into sublink subqueries. This doesn't affect performance in the normal case, because all other sublinks in the query will have been turned into subplans by that point, so it only needs to handle unplanned sublinks in RTE security barrier quals, which can only happen for updates to s.b. views. The attached patch does that, which fixes the case you reported. > On 8 January 2014 09:02, Craig Ringer <craig@2ndquadrant.com> wrote: >> My main concern is that the securityQuals appear to bypass all later >> rewrite stages, inheritance expansion during planning, etc. I suspect >> this might be hard to get around (because these are disembodied quals >> which may have nonsense varnos), but I'm looking into it now. >> Actually that's not the case. The securityQuals are traversed in the standard walker/mutator functions, so the rewriter *will* recursively expand any view references they contain (provided the query is correctly marked with hasSubLinks, which my first patch failed to do!). Inheritance expansion of relations in subqueries in the securityQuals is handled by recursion in the planner --- subquery_planner() invokes grouping_planner(), expanding securityQuals into new subquery RTEs, then subquery_planner() is recursively invoked for the new RTE subquery, which preprocesses its quals, which recursively invokes subquery_planner() for the sublinks in those quals, which then expands the inheritance sets they contain. >> My understanding from reading the patch is that this: >> >> - Flattens target views in rewriteTargetView, as in current master. If >> the target view is a security barrier view, the view quals are appended >> to a list of security barrier quals on the new RTE, instead of appended >> to the RTE's normal quals like for normal views. >> Right. >> After rewrite the views are fully flattened down to a RTE_RELATION, >> which becomes the resultRelation. An unreferenced RTE for each view >> that's been rewritten is preserved in the range-table for permissions >> checking purposes only (same as current master). >> Right. >> - Inheritance expansion, tlist expansion, etc then occurrs as normal. >> Right. >> - In planning, in inheritance_planner, if any RTE has any stashed >> security quals in its RangeTableEntry, expand_security_qual is invoked. >> This iteratively wraps the base relation in a subquery with the saved >> security barrier quals, creating nested subqueries around the original >> RTE. At each pass resultRelation is changed to point to the new >> outer-most subquery. >> Actually the resultRelation is only changed in the first pass. Each subsequent pass that creates an additional nested subquery RTE modifies the old subquery RTE in-place. The new subquery has an "identity" targetlist, which means that no further rewriting of the outer query is necessary after the first s.b. subquery is created. This avoids having multiple levels of attribute rewriting in the case where s.b. views are nested on top of one another. >> As a result of this approach everything looks normal to >> preprocess_targetlist, row-marking, etc, because they're seeing a normal >> RTE_RELATION as resultRelation. The security barrier quals are, at this >> stage, stashed aside. If there's inheritance involved, RTEs copied >> during appendrel expansion get copies of the security quals on in the >> parent RTE. >> >> Problem with inheritance, views, etc in s.b. quals >> -------------------------------------------------- >> >> After inheritance expansion, tlist expansion, etc, the s.b. quals are >> unpacked to create subqueries wrapping the original RTEs. >> >> >> So, with: >> >> CREATE TABLE t1 (x float, b integer, secret1 text, secret2 text); >> CREATE TABLE t1child (z integer) INHERITS (t1); >> >> INSERT INTO t1 (x, b, secret1, secret2) >> VALUES >> (0,0,'secret0', 'supersecret'), >> (1,1,'secret1', 'supersecret'), >> (2,2,'secret2', 'supersecret'), >> (3,3,'secret3', 'supersecret'), >> (4,4,'secret4', 'supersecret'), >> (5,6,'secret5', 'supersecret'); >> >> INSERT INTO t1child (x, b, secret1, secret2, z) >> VALUES >> (8,8,'secret8', 'ss', 8), >> (9,9,'secret8', 'ss', 9), >> (10,10,'secret8', 'ss', 10); >> >> CREATE VIEW v1 >> WITH (security_barrier) >> AS >> SELECT b AS b1, x AS x1, secret1 >> FROM t1 WHERE b % 2 = 0; >> >> CREATE VIEW v2 >> WITH (security_barrier) >> AS >> SELECT b1 AS b2, x1 AS x2 >> FROM v1 WHERE b1 % 4 = 0; >> >> >> >> then a statement like: >> >> UPDATE v2 >> SET x2 = x2 + 32; >> >> will be rewritten into something like (imaginary sql) >> >> UPDATE t1 WITH SECURITY QUALS ((b % 2 == 0), (b % 4 == 0)) >> SET x = x + 32 >> >> inheritance-expanded and tlist-expanded into something like (imaginary SQL) >> >> >> UPDATE >> (t1 WITH SECURITY QUALS ((b % 2 == 0), (b % 4 == 0))) >> UNION ALL >> (t1child WITH SECURITY QUALS ((b % 2 == 0), (b % 4 == 0))) >> SET x = x + 32; >> >> >> after which security qual expansion occurs, giving us something like: >> >> >> UPDATE >> t1, t1child <--- resultRelations >> ( >> SELECT v2.ctid, v2.* >> FROM ( >> SELECT v1.ctid, v1.* >> FROM ( >> SELECT t1.ctid, t1.* >> FROM t1 >> WHERE b % 2 == 0 >> ) v1 >> WHERE b % 4 == 0 >> ) v2 >> >> UNION ALL >> >> SELECT v2.ctid, v2.* >> FROM ( >> SELECT v1.ctid, v1.* >> FROM ( >> SELECT t1child.ctid, t1child.* >> FROM t1child >> WHERE b % 2 == 0 >> ) v1 >> WHERE b % 4 == 0 >> ) v2 >> >> ) >> SET x = x + 32; >> >> >> Giving a plan looking like: >> >> EXPLAIN UPDATE v2 SET x2 = 32 >> >> >> QUERY PLAN >> --------------------------------------------------------------------------- >> Update on t1 t1_2 (cost=0.00..23.35 rows=2 width=76) >> -> Subquery Scan on t1 (cost=0.00..2.18 rows=1 width=74) >> -> Subquery Scan on t1_3 (cost=0.00..2.17 rows=1 width=74) >> Filter: ((t1_3.b % 4) = 0) >> -> Seq Scan on t1 t1_4 (cost=0.00..2.16 rows=1 width=74) >> Filter: ((b % 2) = 0) >> -> Subquery Scan on t1_1 (cost=0.00..21.17 rows=1 width=78) >> -> Subquery Scan on t1_5 (cost=0.00..21.16 rows=1 width=78) >> Filter: ((t1_5.b % 4) = 0) >> -> Seq Scan on t1child (cost=0.00..21.10 rows=4 width=78) >> Filter: ((b % 2) = 0) >> (11 rows) >> >> >> >> >> So far this looks like a really clever approach. My only real concern is >> that the security quals are currently hidden from rewrite and parsing >> before during the period they're hidden away in the security quals RTI >> field. >> >> This means they aren't processed for things like inheritance expansion. e.g. >> >> CREATE TABLE rowfilter (remainder integer, userid text); >> CREATE TABLE rowfilterchild () INHERITS (rowfilter); >> INSERT INTO rowfilterchild(remainder, userid) values (0, current_user); >> >> a view with a security qual that refers to an inherited relation won't work: >> >> CREATE VIEW sqv >> WITH (security_barrier) >> AS >> SELECT x, b FROM t1 WHERE ( >> SELECT b % 4 = remainder >> FROM rowfilter >> WHERE userid = current_user >> OFFSET 0 >> ); >> >> This is a bit contrived to force the optimiser to treat the subquery as >> correlated and thus make sure the ref to rowfilter gets into the >> securityQuals list. >> >> I expected zero results (a scan of rowfilter, but not rowfilterchild, >> resulting in a null subquery return) but land up with an assertion >> failure instead. The assertion triggers for any security qual containing >> a correlated subquery, so it's crashing out before we can break due to >> failure to expand inheritance. >> Having fixed the assertion failure by making adjust_appendrel_attrs() handle descent into sublink subqueries, this now works, and it does correctly expand the inheritance in the subquery in the s.b. qual, for the reasons explained above: explain (verbose, costs off) update sqv set b=b*10 where b%5=0; QUERY PLAN ------------------------------------------------------------------------------------------------------- Update on public.t1 t1_2 -> Subquery Scan on t1 Output: t1.x, (t1.b * 10), t1.secret1, t1.secret2, t1.ctid Filter: ((t1.b % 5) = 0) -> Seq Scan on public.t1 t1_3 Output: t1_3.b, t1_3.ctid, t1_3.x, t1_3.secret1, t1_3.secret2 Filter: (SubPlan 1) SubPlan 1 -> Result Output: ((t1_3.b % 4) = rowfilter.remainder) -> Append -> Seq Scan on public.rowfilter Output: rowfilter.remainder Filter: (rowfilter.userid = ("current_user"())::text) -> Seq Scan on public.rowfilterchild Output: rowfilterchild.remainder Filter: (rowfilterchild.userid = ("current_user"())::text) -> Subquery Scan on t1_1 Output: t1_1.x, (t1_1.b * 10), t1_1.secret1, t1_1.secret2, t1_1.z, t1_1.ctid Filter: ((t1_1.b % 5) = 0) -> Seq Scan on public.t1child Output: t1child.b, t1child.ctid, t1child.x, t1child.secret1, t1child.secret2, t1child.z Filter: (SubPlan 2) SubPlan 2 -> Result Output: ((t1child.b % 4) = rowfilter_1.remainder) -> Append -> Seq Scan on public.rowfilter rowfilter_1 Output: rowfilter_1.remainder Filter: (rowfilter_1.userid = ("current_user"())::text) -> Seq Scan on public.rowfilterchild rowfilterchild_1 Output: rowfilterchild_1.remainder Filter: (rowfilterchild_1.userid = ("current_user"())::text) >> This isn't just about inheritance. In general, we'd need a way to >> process those securityQuals through any rewrite phases and through the >> parts of planning before they get merged back in, so they're subject to >> things like inheritance appendrel expansion. >> I believe that should work because the rewriter traverses the query tree applying rewrite rules to everything it sees, and that will include any securityQuals. In fact one of my concerns about your approach of expanding the s.b. view in rewriteTargetView() was how that would interact with later stages of the rewriter if there were more rules on the base relation. rewriteTargetView() happens very early in the rewriter, so there is potentially a lot more that can happen to the query tree after that, which would make it difficult to keep track of the relationship between the target RTE and expanded subquery RTE. Storing the securityQuals on the RTE keeps them tied together, which makes it easier to predict what will happen, although this still does need a lot more testing. Regards, Dean doc/src/sgml/ref/create_view.sgml | 6 src/backend/commands/tablecmds.c | 6 src/backend/commands/view.c | 6 src/backend/nodes/copyfuncs.c | 1 src/backend/nodes/equalfuncs.c | 1 src/backend/nodes/nodeFuncs.c | 4 src/backend/nodes/outfuncs.c | 1 src/backend/nodes/readfuncs.c | 1 src/backend/optimizer/plan/planner.c | 37 !! src/backend/optimizer/prep/Makefile | 2 src/backend/optimizer/prep/prepsecurity.c | 423 ++++++++++++++++++++++++++ src/backend/optimizer/prep/prepunion.c | 57 !!! src/backend/rewrite/rewriteHandler.c | 53 - src/include/nodes/parsenodes.h | 1 src/include/optimizer/prep.h | 5 src/include/rewrite/rewriteHandler.h | 1 src/test/regress/expected/create_view.out | 2 src/test/regress/expected/updatable_views.out | 261 +++++++++++!!! src/test/regress/sql/updatable_views.sql | 92 ++++! 19 files changed, 738 insertions(+), 25 deletions(-), 197 modifications(!)
Attachment
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > My first thought was that it should just preprocess any security > barrier quals in subquery_planner() in the same way as other quals are > preprocessed. But thinking about it further, those quals are destined > to become the quals of subqueries in the range table, so we don't > actually want to preprocess them at that stage --- that will happen > later when the new subquery is planned by recursion back into > subquery_planner(). So I think the right answer is to make > adjust_appendrel_attrs() handle recursion into sublink subqueries. TBH, this sounds like doubling down on a wrong design choice. I see no good reason that updatable security views should require any fundamental rearrangements of the order of operations in the planner; and I doubt that this is the last bug you'll have if you insist on doing that. regards, tom lane
On 9 January 2014 15:19, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dean Rasheed <dean.a.rasheed@gmail.com> writes: >> My first thought was that it should just preprocess any security >> barrier quals in subquery_planner() in the same way as other quals are >> preprocessed. But thinking about it further, those quals are destined >> to become the quals of subqueries in the range table, so we don't >> actually want to preprocess them at that stage --- that will happen >> later when the new subquery is planned by recursion back into >> subquery_planner(). So I think the right answer is to make >> adjust_appendrel_attrs() handle recursion into sublink subqueries. > > TBH, this sounds like doubling down on a wrong design choice. Perhaps, but it's a design choice informed by all the problems that arose from the previous attempts. Right now I don't have any other ideas how to tackle this, so perhaps continued testing to find where this falls down will inform a better approach. If nothing else, we're collecting a useful set of test cases that the final patch will need to pass. Regards, Dean
On 01/09/2014 06:48 PM, Dean Rasheed wrote: > On 8 January 2014 10:19, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> The assertion failure with inheritance and sublinks is a separate >> issue --- adjust_appendrel_attrs() is not expecting to find any >> unplanned sublinks in the query tree when it is invoked, since they >> would normally have all been planned by that point. However, the >> addition of the new security barrier subqueries after inheritance >> expansion can now insert new sublinks which need to be planned. I'll >> look into how best to make that happen. > > The attached patch does that, which fixes the case you reported. Dean, any objections to adding this to the current CF, or to my doing so? I want to adjust the RLS patch to build on top of this patch, splitting the RLS patch up into a series that can be considered separately. To have any hope of doing that, I'm going to need to be able to base it on this patch. Even if -hackers collectively decides the approach you've posted for updatable s.b. views isn't the best way at some future point, we can replace it with a better one then without upsetting users. RLS only needs quite a high level interface over this, so it should adapt well to anything that lets it wrap a table into a s.b. qualified subquery. If there's no better approach forthcoming, then I think this proposal should be committed. I'll do further testing to see if I can find anything that breaks it, of course. I've been bashing my head against this for weeks without great inspiration - everything I try when doing this in the rewriter creates three problems for every one it fixes. That's not to say it can't be done, just that I haven't been able to do it while trying to learn the basics of the rewriter, planner and executor at the same time ;-) I've been consistently stuck on how to expand the tlist and inject ctid (and oid, where required) Vars back *up* the chain of expanded subqueries after views are expanded in rewrite. We only know the required tlist and can only access the ctid and oid attrs once we expand the inner-most view, but we've got no way to walk back up the tree of subqueries (Query*) to inject Var tlis entries pointing to the next-inner-most subquery. Need some way to walk back up the nested tree of queries injecting this info. (I've had another idea about this that I need to explore tonight, but every previous approach I've tried has circled back to this same problem). -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/09/2014 11:19 PM, Tom Lane wrote: > Dean Rasheed <dean.a.rasheed@gmail.com> writes: >> My first thought was that it should just preprocess any security >> barrier quals in subquery_planner() in the same way as other quals are >> preprocessed. But thinking about it further, those quals are destined >> to become the quals of subqueries in the range table, so we don't >> actually want to preprocess them at that stage --- that will happen >> later when the new subquery is planned by recursion back into >> subquery_planner(). So I think the right answer is to make >> adjust_appendrel_attrs() handle recursion into sublink subqueries. > > TBH, this sounds like doubling down on a wrong design choice. I see > no good reason that updatable security views should require any > fundamental rearrangements of the order of operations in the planner; > and I doubt that this is the last bug you'll have if you insist on > doing that. I'd be quite happy to do this entirely within the rewriter. I've found two persistent obstacles to that, and frankly I'm stuck. I'm going to be reworking the RLS patches on top of Dean's functional patch unless I can find some way to progress with a rewriter based approach. The key problems are: 1. preprocess_targetlist in the planner assumes that the resultRelation is the correct RTE to set as the varno in a new Var it adds to fetch the row ctid (with label "ctid1") as a resjunk attr for row-marking. This causes the tlist to have entries pointing to different RTE to the one being scanned by the eventual seqscan / indexscan, though the underlying Relation is the same. tlist validation checks don't like that. There may be other places that need to add tlist entries pointing to the relation we're reading rows from. They'll also need to be able to deal with the fact that this no longer the resultRelation. 2. Despite bashing my head against it for ages, I haven't figured out how to inject references to the base-rel's ctid, oid (if WITH OIDS), and any tlist entries not specified in the DML statement into the subquery tree. These are only accessible at the deepest level of rewriting, when the final view is expanded into a subquery and processed with rewriteTargetListUD(..). At this point we don't have "breadcrumbs" to use to walk back up the nested subqueries adding the required tlist entries. I keep on exploring ideas for this one, and get stuck in a dead end for every one. Without a way to move on these, I don't have much hope of adding updatable security barrier views support using work done in the rewriter. It seems inevitable that we'll have to add the separate concepts of "source relation" (tuples to feed into HeapModifyTable for ctid, and for heap_modify_table after junkfiltering) and "result relation" (target Relation of heap_modify_table to actually write tuples to, target of row level locking operations). There's also going to need to be some kind of breadcrumb chain to allow us to walk from the inner-most expanded view's RTE_RELATION back up the expanded view subquery tlists, adding next-inner-most refs to resjunk "ctid" and (if needed) "oid", injecting defaults, and expanding the target list with Vars to match non-referenced attributes of the inner-most RTE_RELATION. So far I haven't come up with a sensible form for that breadcrumb trail. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12 January 2014 10:12, Craig Ringer <craig@2ndquadrant.com> wrote: > On 01/09/2014 06:48 PM, Dean Rasheed wrote: >> On 8 January 2014 10:19, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >>> The assertion failure with inheritance and sublinks is a separate >>> issue --- adjust_appendrel_attrs() is not expecting to find any >>> unplanned sublinks in the query tree when it is invoked, since they >>> would normally have all been planned by that point. However, the >>> addition of the new security barrier subqueries after inheritance >>> expansion can now insert new sublinks which need to be planned. I'll >>> look into how best to make that happen. >> >> The attached patch does that, which fixes the case you reported. > > Dean, any objections to adding this to the current CF, or to my doing so? > OK, I'll do that. I've added a page to the wiki with a more in-depth description of how the patch works, and the test cases I've tried so far: https://wiki.postgresql.org/wiki/Making_security_barrier_views_automatically_updatable there's obviously still a lot more testing to do, but the early signs are encouraging. Regards, Dean
On 01/09/2014 11:19 PM, Tom Lane wrote: > Dean Rasheed <dean.a.rasheed@gmail.com> writes: >> My first thought was that it should just preprocess any security >> barrier quals in subquery_planner() in the same way as other quals are >> preprocessed. But thinking about it further, those quals are destined >> to become the quals of subqueries in the range table, so we don't >> actually want to preprocess them at that stage --- that will happen >> later when the new subquery is planned by recursion back into >> subquery_planner(). So I think the right answer is to make >> adjust_appendrel_attrs() handle recursion into sublink subqueries. > > TBH, this sounds like doubling down on a wrong design choice. I see > no good reason that updatable security views should require any > fundamental rearrangements of the order of operations in the planner. In that case, would you mind offerign a quick sanity check on the following alternative idea: - Add "sourceRelation" to Query. This refers to the RTE that supplies tuple projections to feed into ExecModifyTable, with appropriate resjunk "ctid" and (if requ'd) "oid" cols present. - When expanding a target view into a subquery, set "sourceRelation" on the outer view to the index of the RTE of the newly expanded subquery. - In rewriteTargetView, as now, reassign resultRelation to the target view's base rel. This is required so that do any RETURNING and WITH CHECK OPTION fixups required to adjust the RETURNING list to the new result relation, so they act on the final tuple after any BEFORE triggers act. Do not flatten the view subquery and merge the quals (as currently happens); allow it to be expanded as a subquery by the rewriter instead. Don't mess with the view tlist at this point except by removing the whole-row Var added by rewriteTargetListUD. - When doing tlist expansion in preprocess_targetlist, when we process the outer Query (the only one for which query type is not SELECT, and the only one that has a non-zero resultRelation), if resultRelation != sourceRelation recursively follow the chain of sourceRelation s to the bottom one with type RTE_RELATION. Do tlist expansion on that inner-most Query first, using sourceRelation to supply the varno for injected TLEs, including injecting "ctid", "oid" if req'd, etc. During call stack unwind, have each intermediate layer do regular tlist expansion, adding a Var pointing to each tlist entry of the inner subquery. At the outer level of preprocess_targetlist, sort the tlist, now expanded to include all required vars, into attribute order for the resultRelation. (this level is the only one that has resultRelation set). Avoid invoking preprocess_targetlist on the inner Query again when it's processed in turn, or just bail out when we see sourceRelation set since we know it's already been done. (Alternately, it might be possible to run preprocess_targetlist depth-first instead of the current outermost-first, but I haven't looked at that). The optimizer can still flatten non-security-barrier updatable views, following the chain of Vars as it collapses each layer. That's effectively what the current rewriteTargetView code is doing manually at each pass right now. I'm sure there are some holes in this outline, but it's struck me as possibly workable. The key is to set sourceRelation on every inner subquery in the target query chain, not just the outer one, so it can be easily followed from the outer query though the subqueries into the innermost query with RTE_RELATION type. The only alternative I've looked at is looking clumsier the longer I examine it: adding a back-reference in each subquery's Query struct, to the Query containing it and the RTI of the subquery within the outer Query. That way, once rewriting hits the innermost rel with RTE_RELATION type, the rewriter can walk back up the Query tree doing tlist rewriting. I'm not sure if this is workable yet, and it creates ugly pointer-based backrefs *up* the Query chain, making what was previously a tree of Query* into a graph. That could get exciting, though there'd never be any need for mutators to follow the parent query pointer so it wouldn't make tree rewrites harder. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
(2014/01/13 22:53), Craig Ringer wrote: > On 01/09/2014 11:19 PM, Tom Lane wrote: >> Dean Rasheed <dean.a.rasheed@gmail.com> writes: >>> My first thought was that it should just preprocess any security >>> barrier quals in subquery_planner() in the same way as other quals are >>> preprocessed. But thinking about it further, those quals are destined >>> to become the quals of subqueries in the range table, so we don't >>> actually want to preprocess them at that stage --- that will happen >>> later when the new subquery is planned by recursion back into >>> subquery_planner(). So I think the right answer is to make >>> adjust_appendrel_attrs() handle recursion into sublink subqueries. >> >> TBH, this sounds like doubling down on a wrong design choice. I see >> no good reason that updatable security views should require any >> fundamental rearrangements of the order of operations in the planner. > > In that case, would you mind offerign a quick sanity check on the > following alternative idea: > > - Add "sourceRelation" to Query. This refers to the RTE that supplies > tuple projections to feed into ExecModifyTable, with appropriate resjunk > "ctid" and (if requ'd) "oid" cols present. > > - When expanding a target view into a subquery, set "sourceRelation" on > the outer view to the index of the RTE of the newly expanded subquery. > I have sane opinion. Existing assumption, "resultRelation" is RTE of the table to be read not only modified, makes the implementation complicated. If we would have a separate "sourceRelation", it allows to have flexible form including sub-query with security_barrier on the reader side. Could you tell me the direction you're inclined right now? I wonder whether I should take the latest patch submitted on 09-Jan for a deep code reviewing and testing. Thanks, > - In rewriteTargetView, as now, reassign resultRelation to the target > view's base rel. This is required so that do any RETURNING and WITH > CHECK OPTION fixups required to adjust the RETURNING list to the new > result relation, so they act on the final tuple after any BEFORE > triggers act. Do not flatten the view subquery and merge the quals (as > currently happens); allow it to be expanded as a subquery by the > rewriter instead. Don't mess with the view tlist at this point except by > removing the whole-row Var added by rewriteTargetListUD. > > - When doing tlist expansion in preprocess_targetlist, when we process > the outer Query (the only one for which query type is not SELECT, and > the only one that has a non-zero resultRelation), if resultRelation != > sourceRelation recursively follow the chain of sourceRelation s to the > bottom one with type RTE_RELATION. Do tlist expansion on that inner-most > Query first, using sourceRelation to supply the varno for injected TLEs, > including injecting "ctid", "oid" if req'd, etc. During call stack > unwind, have each intermediate layer do regular tlist expansion, adding > a Var pointing to each tlist entry of the inner subquery. > > At the outer level of preprocess_targetlist, sort the tlist, now > expanded to include all required vars, into attribute order for the > resultRelation. (this level is the only one that has resultRelation set). > > Avoid invoking preprocess_targetlist on the inner Query again when it's > processed in turn, or just bail out when we see sourceRelation set since > we know it's already been done. > > (Alternately, it might be possible to run preprocess_targetlist > depth-first instead of the current outermost-first, but I haven't looked > at that). > > > The optimizer can still flatten non-security-barrier updatable views, > following the chain of Vars as it collapses each layer. That's > effectively what the current rewriteTargetView code is doing manually at > each pass right now. > > I'm sure there are some holes in this outline, but it's struck me as > possibly workable. The key is to set sourceRelation on every inner > subquery in the target query chain, not just the outer one, so it can be > easily followed from the outer query though the subqueries into the > innermost query with RTE_RELATION type. > > > > The only alternative I've looked at is looking clumsier the longer I > examine it: adding a back-reference in each subquery's Query struct, to > the Query containing it and the RTI of the subquery within the outer > Query. That way, once rewriting hits the innermost rel with RTE_RELATION > type, the rewriter can walk back up the Query tree doing tlist > rewriting. I'm not sure if this is workable yet, and it creates ugly > pointer-based backrefs *up* the Query chain, making what was previously > a tree of Query* into a graph. That could get exciting, though there'd > never be any need for mutators to follow the parent query pointer so it > wouldn't make tree rewrites harder. > > > > > > -- OSS Promotion Center / The PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
On 01/21/2014 09:09 AM, KaiGai Kohei wrote: > (2014/01/13 22:53), Craig Ringer wrote: >> On 01/09/2014 11:19 PM, Tom Lane wrote: >>> Dean Rasheed <dean.a.rasheed@gmail.com> writes: >>>> My first thought was that it should just preprocess any security >>>> barrier quals in subquery_planner() in the same way as other quals are >>>> preprocessed. But thinking about it further, those quals are destined >>>> to become the quals of subqueries in the range table, so we don't >>>> actually want to preprocess them at that stage --- that will happen >>>> later when the new subquery is planned by recursion back into >>>> subquery_planner(). So I think the right answer is to make >>>> adjust_appendrel_attrs() handle recursion into sublink subqueries. >>> >>> TBH, this sounds like doubling down on a wrong design choice. I see >>> no good reason that updatable security views should require any >>> fundamental rearrangements of the order of operations in the planner. >> >> In that case, would you mind offerign a quick sanity check on the >> following alternative idea: >> >> - Add "sourceRelation" to Query. This refers to the RTE that supplies >> tuple projections to feed into ExecModifyTable, with appropriate resjunk >> "ctid" and (if requ'd) "oid" cols present. >> >> - When expanding a target view into a subquery, set "sourceRelation" on >> the outer view to the index of the RTE of the newly expanded subquery. >> > I have sane opinion. Existing assumption, "resultRelation" is RTE of the > table to be read not only modified, makes the implementation complicated. > If we would have a separate "sourceRelation", it allows to have flexible > form including sub-query with security_barrier on the reader side. > > Could you tell me the direction you're inclined right now? My focus right now is getting your original RLS patch rebased on top of head, separated into a series of patches that can be understood as separate functional units, then rewritten on top of updatable security barrier views. ( See http://www.postgresql.org/message-id/52DCBEF1.3010004@2ndquadrant.com ) I don't really care which updatable security barrier views implementation it is. Dean's has the advantage of actually working, so I'm basing it on his. It sounds like Tom objects to Dean's approach to some degree, but we'll have to see whether that turns into concrete issues. If it does, it should be possible to replace the underlying updatable security barrier views implementation without upsetting the rewritten RLS significantly. That's part of why I've gone down this path - it separates "filtering rows according to security predicates" from the rest of RLS, into a largely functionally separate patch. > I wonder whether I should take the latest patch submitted on 09-Jan for > a deep code reviewing and testing. That would be extremely valuable. Please break it, or show that you could not figure out how to break it. If you find an unfixable flaw, we'll go back to the approach outlined in this mail - split resultRelation, insert ctids down the subquery chain, etc. If you don't find a major flaw, then that's one less thing to worry about, and we can focus on the RLS layer. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 21 January 2014 01:09, KaiGai Kohei <kaigai@ak.jp.nec.com> wrote: > (2014/01/13 22:53), Craig Ringer wrote: >> >> On 01/09/2014 11:19 PM, Tom Lane wrote: >>> >>> Dean Rasheed <dean.a.rasheed@gmail.com> writes: >>>> >>>> My first thought was that it should just preprocess any security >>>> barrier quals in subquery_planner() in the same way as other quals are >>>> preprocessed. But thinking about it further, those quals are destined >>>> to become the quals of subqueries in the range table, so we don't >>>> actually want to preprocess them at that stage --- that will happen >>>> later when the new subquery is planned by recursion back into >>>> subquery_planner(). So I think the right answer is to make >>>> adjust_appendrel_attrs() handle recursion into sublink subqueries. >>> >>> >>> TBH, this sounds like doubling down on a wrong design choice. I see >>> no good reason that updatable security views should require any >>> fundamental rearrangements of the order of operations in the planner. >> >> >> In that case, would you mind offerign a quick sanity check on the >> following alternative idea: >> >> - Add "sourceRelation" to Query. This refers to the RTE that supplies >> tuple projections to feed into ExecModifyTable, with appropriate resjunk >> "ctid" and (if requ'd) "oid" cols present. >> >> - When expanding a target view into a subquery, set "sourceRelation" on >> the outer view to the index of the RTE of the newly expanded subquery. >> > I have sane opinion. Existing assumption, "resultRelation" is RTE of the > table to be read not only modified, makes the implementation complicated. > If we would have a separate "sourceRelation", it allows to have flexible > form including sub-query with security_barrier on the reader side. > > Could you tell me the direction you're inclined right now? > I wonder whether I should take the latest patch submitted on 09-Jan for > a deep code reviewing and testing. > Yes, please review the patch from 09-Jan (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg18ToTggP8zOBq6QnQHQ@mail.gmail.com). The idea behind that patch is to avoid a lot of the complication that leads to and then arises from having a separate "sourceRelation" in the query. If you go down the route of expanding the subquery in the rewriter and making it the "sourceRelation", then you have to make extensive changes to preprocess_targetlist so that it recursively descends into the subquery to pull out variables needed by an update. Also you would probably need additional changes in the rewriter so that later stages didn't trample on the "sourceRelation", and correctly updated it in the case of views on top of other views. Also, you would need to make changes to the inheritance planner code, because you'd now have 2 RTEs referring to the target relation ("resultRelation" and "sourceRelation" wrapping it in a subquery). Referring to the comment in inheritance_planner(): * Source inheritance is expanded at the bottom of the* plan tree (see allpaths.c), but target inheritance has to be expandedat* the top. except that in the case of the "sourceRelation", it is actually effectively the target, which means it shouldn't be expanded, otherwise you'd get plans like the ones I complained about upthread (see the final explain output in http://www.postgresql.org/message-id/CAEZATCU3VcDKJpnHGFkRMrkz0axKCUH4CE_kQq3Z2HzkNhi5iA@mail.gmail.com). Perhaps all of that could be made to work, but I think that it would end up being a far more invasive patch than my 09-Jan patch. My patch avoids both those issues by not doing the subquery expansion until after inheritance expansion, and after targetlist preprocessing. Regards, Dean
(2014/01/21 18:18), Dean Rasheed wrote: > On 21 January 2014 01:09, KaiGai Kohei <kaigai@ak.jp.nec.com> wrote: >> (2014/01/13 22:53), Craig Ringer wrote: >>> >>> On 01/09/2014 11:19 PM, Tom Lane wrote: >>>> >>>> Dean Rasheed <dean.a.rasheed@gmail.com> writes: >>>>> >>>>> My first thought was that it should just preprocess any security >>>>> barrier quals in subquery_planner() in the same way as other quals are >>>>> preprocessed. But thinking about it further, those quals are destined >>>>> to become the quals of subqueries in the range table, so we don't >>>>> actually want to preprocess them at that stage --- that will happen >>>>> later when the new subquery is planned by recursion back into >>>>> subquery_planner(). So I think the right answer is to make >>>>> adjust_appendrel_attrs() handle recursion into sublink subqueries. >>>> >>>> >>>> TBH, this sounds like doubling down on a wrong design choice. I see >>>> no good reason that updatable security views should require any >>>> fundamental rearrangements of the order of operations in the planner. >>> >>> >>> In that case, would you mind offerign a quick sanity check on the >>> following alternative idea: >>> >>> - Add "sourceRelation" to Query. This refers to the RTE that supplies >>> tuple projections to feed into ExecModifyTable, with appropriate resjunk >>> "ctid" and (if requ'd) "oid" cols present. >>> >>> - When expanding a target view into a subquery, set "sourceRelation" on >>> the outer view to the index of the RTE of the newly expanded subquery. >>> >> I have sane opinion. Existing assumption, "resultRelation" is RTE of the >> table to be read not only modified, makes the implementation complicated. >> If we would have a separate "sourceRelation", it allows to have flexible >> form including sub-query with security_barrier on the reader side. >> >> Could you tell me the direction you're inclined right now? >> I wonder whether I should take the latest patch submitted on 09-Jan for >> a deep code reviewing and testing. >> > > Yes, please review the patch from 09-Jan > (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg18ToTggP8zOBq6QnQHQ@mail.gmail.com). > > The idea behind that patch is to avoid a lot of the complication that > leads to and then arises from having a separate "sourceRelation" in > the query. > > If you go down the route of expanding the subquery in the rewriter and > making it the "sourceRelation", then you have to make extensive > changes to preprocess_targetlist so that it recursively descends into > the subquery to pull out variables needed by an update. > > Also you would probably need additional changes in the rewriter so > that later stages didn't trample on the "sourceRelation", and > correctly updated it in the case of views on top of other views. > > Also, you would need to make changes to the inheritance planner code, > because you'd now have 2 RTEs referring to the target relation > ("resultRelation" and "sourceRelation" wrapping it in a subquery). > Referring to the comment in inheritance_planner(): > > * Source inheritance is expanded at the bottom of the > * plan tree (see allpaths.c), but target inheritance has to be expanded at > * the top. > > except that in the case of the "sourceRelation", it is actually > effectively the target, which means it shouldn't be expanded, > otherwise you'd get plans like the ones I complained about upthread > (see the final explain output in > http://www.postgresql.org/message-id/CAEZATCU3VcDKJpnHGFkRMrkz0axKCUH4CE_kQq3Z2HzkNhi5iA@mail.gmail.com). > > Perhaps all of that could be made to work, but I think that it would > end up being a far more invasive patch than my 09-Jan patch. My patch > avoids both those issues by not doing the subquery expansion until > after inheritance expansion, and after targetlist preprocessing. > Probably, I could get your point. Even though the sub-query being replaced from a relation with securityQuals is eventually planned according to the usual manner, usual order as a part of regular sub-query planning, however, adjust_appendrel_attrs() is called by inheritance_planner() earlier than sub-query's planning. Maybe, we originally had this problem but not appeared on the surface, because child relations don't have qualifiers on this phase. (It shall be distributed later.) But now, this assumption was broken because of a relation with securityQuals being replaced by a sub-query with SubLink. So, I'm inclined to revise the assumption side, rather than existing assertion checks. Shall we investigate what assumption should be revised if child- relation, being expanded at expand_inherited_tables(), would be a sub-query having Sub-Link? A minor comment is below: ! /* ! * Make sure that the query is marked correctly if the added qual ! * has sublinks. ! */ ! if (!parsetree->hasSubLinks) ! parsetree->hasSubLinks = checkExprHasSubLink(viewqual); Is this logic really needed? This flag says the top-level query contains SubLinks, however, the above condition checks whether a sub-query to be constructed shall contain SubLinks. Also, securityQuals is not attached to the parse tree right now. Thanks, -- OSS Promotion Center / The PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
On 21 January 2014 09:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > Yes, please review the patch from 09-Jan > (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg18ToTggP8zOBq6QnQHQ@mail.gmail.com). > After further testing I found a bug --- it involves having a security barrier view on top of a base relation that has a rule that rewrites the query to have a different result relation, and possibly also a different command type, so that the securityQuals are no longer on the result relation, which is a code path not previously tested and the rowmark handling was wrong. That's probably a pretty obscure case in the context of security barrier views, but that code path would be used much more commonly if RLS were built on top of this. Fortunately the fix is trivial --- updated patch attached. Regards, Dean
Attachment
On 23 January 2014 06:13, KaiGai Kohei <kaigai@ak.jp.nec.com> wrote: > A minor comment is below: > > ! /* > ! * Make sure that the query is marked correctly if the added > qual > ! * has sublinks. > ! */ > ! if (!parsetree->hasSubLinks) > ! parsetree->hasSubLinks = checkExprHasSubLink(viewqual); > > Is this logic really needed? This flag says the top-level query > contains SubLinks, however, the above condition checks whether > a sub-query to be constructed shall contain SubLinks. > Also, securityQuals is not attached to the parse tree right now. > Thanks for looking at this. Yes that logic is needed. Without it the top-level query wouldn't be marked as having sublinks, which could cause all sorts of things to go wrong --- for example, without it fireRIRrules() wouldn't walk the query tree looking for SELECT rules in sublinks to expand, so a security barrier qual with a sublink subquery that selected from another view wouldn't work. It is not true to say that "securityQuals is not attached to the parse tree". The securityQuals *are* part of the parse tree, they just happen to be held in a different place to keep them isolated from other quals. So the remaining rewriter code that walks or mutates the parse tree will process them just like any other quals, recursively expanding any rules they contain. Regards, Dean
Hello, I checked the latest updatable security barrier view patch. Even though I couldn't find a major design problem in this revision, here are two minor comments below. I think, it needs to be reviewed by committer to stick direction to implement this feature. Of course, even I know Tom argued the current design of this feature on the up-thread, it does not seem to me Dean's design is not reasonable. Below is minor comments of mine: @@ -932,9 +938,32 @@ inheritance_planner(PlannerInfo *root) if (final_rtable == NIL) final_rtable = subroot.parse->rtable; else - final_rtable = list_concat(final_rtable, + { + List *tmp_rtable = NIL; + ListCell *cell1, *cell2; + + /* + * Planning this new child may have turned some of the original + * RTEs into subqueries (if they had security barrier quals). If + * so, we want to use these in the final rtable. + */ + forboth(cell1, final_rtable, cell2, subroot.parse->rtable) + { + RangeTblEntry *rte1 = (RangeTblEntry *) lfirst(cell1); + RangeTblEntry *rte2 = (RangeTblEntry *) lfirst(cell2); + + if (rte1->rtekind == RTE_RELATION && + rte1->securityQuals != NIL && + rte2->rtekind == RTE_SUBQUERY) + tmp_rtable = lappend(tmp_rtable, rte2); + else + tmp_rtable = lappend(tmp_rtable, rte1); + } Do we have a case if rte1 is regular relation with securityQuals but rte2 is not a sub-query? If so, rte2->rtekind == RTE_SUBQUERY should be a condition in Assert, but the third condition in if-block. In case when a sub-query is simple enough; no qualifier and no projection towards underlying scan, is it pulled-up even if this sub-query has security-barrier attribute, isn't it? See the example below. The view v2 is defined as follows. postgres=# CREATE VIEW v2 WITH (security_barrier) AS SELECT * FROM t2 WHERE x % 10 = 5; CREATE VIEW postgres=# EXPLAIN SELECT * FROM v2 WHERE f_leak(z); QUERY PLAN ---------------------------------------------------------Subquery Scan on v2 (cost=0.00..3.76 rows=1 width=41) Filter:f_leak(v2.z) -> Seq Scan on t2 (cost=0.00..3.50 rows=1 width=41) Filter: ((x % 10) = 5) (4 rows) postgres=# EXPLAIN SELECT * FROM v2; QUERY PLAN ---------------------------------------------------Seq Scan on t2 (cost=0.00..3.50 rows=1 width=41) Filter: ((x % 10) =5) (2 rows) The second explain result shows the underlying sub-query is pulled-up even though it has security-barrier attribute. (IIRC, it was a new feature in v9.3.) On the other hand, this kind of optimization was not applied on a sub-query being extracted from a relation with securityQuals postgres=# EXPLAIN UPDATE v2 SET z = z; QUERY PLAN --------------------------------------------------------------------Update on t2 t2_1 (cost=0.00..3.51 rows=1 width=47) -> Subquery Scan on t2 (cost=0.00..3.51 rows=1 width=47) -> Seq Scan on t2 t2_2 (cost=0.00..3.50 rows=1width=47) Filter: ((x % 10) = 5) (4 rows) If it has no security_barrier option, the view reference is extracted in the rewriter stage, it was pulled up as we expected. postgres=# ALTER VIEW v2 RESET (security_barrier); ALTER VIEW postgres=# EXPLAIN UPDATE t2 SET z = z; QUERY PLAN -----------------------------------------------------------Update on t2 (cost=0.00..3.00 rows=100 width=47) -> Seq Scanon t2 (cost=0.00..3.00 rows=100 width=47) (2 rows) Probably, it misses something to be checked and applied when we extract a sub-query in prepsecurity.c. # BTW, which code does it pull up? pull_up_subqueries() is not. Thanks, > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Dean Rasheed > Sent: Thursday, January 23, 2014 7:06 PM > To: Kaigai, Kouhei(海外, 浩平) > Cc: Craig Ringer; Tom Lane; PostgreSQL Hackers; Kohei KaiGai; Robert Haas; > Simon Riggs > Subject: Re: [HACKERS] WIP patch (v2) for updatable security barrier views > > On 21 January 2014 09:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > Yes, please review the patch from 09-Jan > > > (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg > 18ToTggP8zOBq6QnQHQ@mail.gmail.com). > > > > After further testing I found a bug --- it involves having a security barrier > view on top of a base relation that has a rule that rewrites the query to > have a different result relation, and possibly also a different command > type, so that the securityQuals are no longer on the result relation, which > is a code path not previously tested and the rowmark handling was wrong. > That's probably a pretty obscure case in the context of security barrier > views, but that code path would be used much more commonly if RLS were built > on top of this. Fortunately the fix is trivial --- updated patch attached. > > Regards, > Dean
On 27 January 2014 07:54, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > Hello, > > I checked the latest updatable security barrier view patch. > Even though I couldn't find a major design problem in this revision, > here are two minor comments below. > > I think, it needs to be reviewed by committer to stick direction > to implement this feature. Of course, even I know Tom argued the > current design of this feature on the up-thread, it does not seem > to me Dean's design is not reasonable. > Thanks for looking at this. > Below is minor comments of mine: > > @@ -932,9 +938,32 @@ inheritance_planner(PlannerInfo *root) > if (final_rtable == NIL) > final_rtable = subroot.parse->rtable; > else > - final_rtable = list_concat(final_rtable, > + { > + List *tmp_rtable = NIL; > + ListCell *cell1, *cell2; > + > + /* > + * Planning this new child may have turned some of the original > + * RTEs into subqueries (if they had security barrier quals). If > + * so, we want to use these in the final rtable. > + */ > + forboth(cell1, final_rtable, cell2, subroot.parse->rtable) > + { > + RangeTblEntry *rte1 = (RangeTblEntry *) lfirst(cell1); > + RangeTblEntry *rte2 = (RangeTblEntry *) lfirst(cell2); > + > + if (rte1->rtekind == RTE_RELATION && > + rte1->securityQuals != NIL && > + rte2->rtekind == RTE_SUBQUERY) > + tmp_rtable = lappend(tmp_rtable, rte2); > + else > + tmp_rtable = lappend(tmp_rtable, rte1); > + } > > Do we have a case if rte1 is regular relation with securityQuals but > rte2 is not a sub-query? If so, rte2->rtekind == RTE_SUBQUERY should > be a condition in Assert, but the third condition in if-block. > Yes it is possible for rte1 to be a RTE_RELATION with securityQuals and rte2 to be something other than a RTE_SUBQUERY because the subquery expansion code in expand_security_quals() only expands RTEs that are actually used in the query. So for example, when planning the query to update an inheritance child, the rtable will contain an RTE for the parent, but it will not be referenced in the parse tree, and so it will not be expanded while planning the child update. > In case when a sub-query is simple enough; no qualifier and no projection > towards underlying scan, is it pulled-up even if this sub-query has > security-barrier attribute, isn't it? > See the example below. The view v2 is defined as follows. > > postgres=# CREATE VIEW v2 WITH (security_barrier) AS SELECT * FROM t2 WHERE x % 10 = 5; > CREATE VIEW > postgres=# EXPLAIN SELECT * FROM v2 WHERE f_leak(z); > QUERY PLAN > --------------------------------------------------------- > Subquery Scan on v2 (cost=0.00..3.76 rows=1 width=41) > Filter: f_leak(v2.z) > -> Seq Scan on t2 (cost=0.00..3.50 rows=1 width=41) > Filter: ((x % 10) = 5) > (4 rows) > > postgres=# EXPLAIN SELECT * FROM v2; > QUERY PLAN > --------------------------------------------------- > Seq Scan on t2 (cost=0.00..3.50 rows=1 width=41) > Filter: ((x % 10) = 5) > (2 rows) > > The second explain result shows the underlying sub-query is > pulled-up even though it has security-barrier attribute. > (IIRC, it was a new feature in v9.3.) > Actually what happens is that it is planned as a subquery scan, then at the very end of the planning process, it detects that the subquery scan is trivial (has no quals, and has a no-op targetlist) and it removes that plan node --- see set_subqueryscan_references() and trivial_subqueryscan(). That subquery scan removal code requires the targetlist to have exactly the same number of attributes, in exactly the same order as the underlying relation. As soon as you add anything non-trivial to the select list in the above queries, or even just change the order of its attributes, the subquery scan node is no longer removed. > On the other hand, this kind of optimization was not applied > on a sub-query being extracted from a relation with securityQuals > > postgres=# EXPLAIN UPDATE v2 SET z = z; > QUERY PLAN > -------------------------------------------------------------------- > Update on t2 t2_1 (cost=0.00..3.51 rows=1 width=47) > -> Subquery Scan on t2 (cost=0.00..3.51 rows=1 width=47) > -> Seq Scan on t2 t2_2 (cost=0.00..3.50 rows=1 width=47) > Filter: ((x % 10) = 5) > (4 rows) > > If it has no security_barrier option, the view reference is extracted > in the rewriter stage, it was pulled up as we expected. > > postgres=# ALTER VIEW v2 RESET (security_barrier); > ALTER VIEW > postgres=# EXPLAIN UPDATE t2 SET z = z; > QUERY PLAN > ----------------------------------------------------------- > Update on t2 (cost=0.00..3.00 rows=100 width=47) > -> Seq Scan on t2 (cost=0.00..3.00 rows=100 width=47) > (2 rows) > > Probably, it misses something to be checked and applied when we extract > a sub-query in prepsecurity.c. > # BTW, which code does it pull up? pull_up_subqueries() is not. > For UPDATE and DELETE the subquery scan node removal code will never be able to do anything because the targetlist will not be a no-op (it might just be possible to make it a no-op for an identity UPDATE, but that wouldn't be of much practical use). The main way in which it will attempt to optimise queries against security barrier views is by pushing quals down into the subquery, where it is safe to do so. Regards, Dean
On 27 January 2014 15:04, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > So for example, when planning the query to update an inheritance > child, the rtable will contain an RTE for the parent, but it will not > be referenced in the parse tree, and so it will not be expanded while > planning the child update. Am I right in thinking that we have this fully working now? If we commit this aspect soon, we stand a chance of also touching upon RLS. AFAICS the only area of objection is the handling of inherited relations, which occurs within the planner in the current patch. I can see that would be a cause for concern since the planner is pluggable and it would then be possible to bypass security checks. Obviously installing a new planner isn't trivial, but doing so shouldn't cause collateral damage. We have long had restrictions around updateable views. My suggestion from here is that we accept the restriction that we cannot yet have the 3-way combination of updateable views, security views and views on inherited tables. Most people aren't using inherited tables and people that are have special measures in place for their apps. We won't lose much by accepting that restriction for 9.4 and re-addressing the issue in a later release. We need not adopt an all or nothing approach. Perhaps we might yet find a solution for 9.4, but again, that need not delay the rest of the patch. From a review perspective, I'd want to see some greatly expanded README comments, but given the Wiki entry, I think we can do that quickly. Other than that, the code seems clear, modular and well tested, so is something I could see me committing the uncontentious parts of. Thoughts? -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes: > Am I right in thinking that we have this fully working now? I will look at this at some point during the CF, but have not yet, and probably won't as long as it's not marked "ready for committer". regards, tom lane
On 27 January 2014 16:11, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >> Am I right in thinking that we have this fully working now? > > I will look at this at some point during the CF, but have not yet, > and probably won't as long as it's not marked "ready for committer". I've marked it Ready for Committer, to indicate my personal opinion. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/28/2014 12:09 AM, Simon Riggs wrote: > On 27 January 2014 15:04, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > >> So for example, when planning the query to update an inheritance >> child, the rtable will contain an RTE for the parent, but it will not >> be referenced in the parse tree, and so it will not be expanded while >> planning the child update. > > Am I right in thinking that we have this fully working now? I haven't found any further problems, though I've been focusing more on reworking RLS on top of it. > AFAICS the only area of objection is the handling of inherited > relations, which occurs within the planner in the current patch. I can > see that would be a cause for concern since the planner is pluggable > and it would then be possible to bypass security checks. Obviously > installing a new planner isn't trivial, but doing so shouldn't cause > collateral damage. FWIW, I don't see any way _not_ to do that in RLS. If there are security quals on a child table, they must be added, and that can only happen once inheritance expansion happens. That's in the planner. I don't see it as acceptable to ignore security quals on child tables, and if we can't, we've got to do some work in the planner. (I'm starting to really loathe inheritance). > We have long had restrictions around updateable views. My suggestion > from here is that we accept the restriction that we cannot yet have > the 3-way combination of updateable views, security views and views on > inherited tables. That prevents the use of updatable security barrier views over partitioned tables, and therefore prevents row-security use on inherited tables. That seems like a very big thing to close off. I'm perfectly happy having that limitation for 9.4, we just need to make it possible to remove the limitation later. > Most people aren't using inherited tables Again, because we (ab)use them for paritioning, I'm not sure they're as little-used as I'd like. > and people that are have > special measures in place for their apps. We won't lose much by > accepting that restriction for 9.4 and re-addressing the issue in a > later release. Yep, I'd be happy with that. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
> > AFAICS the only area of objection is the handling of inherited > > relations, which occurs within the planner in the current patch. I can > > see that would be a cause for concern since the planner is pluggable > > and it would then be possible to bypass security checks. Obviously > > installing a new planner isn't trivial, but doing so shouldn't cause > > collateral damage. > > FWIW, I don't see any way _not_ to do that in RLS. If there are security > quals on a child table, they must be added, and that can only happen once > inheritance expansion happens. That's in the planner. > > I don't see it as acceptable to ignore security quals on child tables, and > if we can't, we've got to do some work in the planner. > > (I'm starting to really loathe inheritance). > Let me ask an elemental question. What is the reason why inheritance expansion logic should be located on the planner stage, not on the tail of rewriter? Reference to a relation with children is very similar to reference of multiple tables using UNION ALL. Isn't it a crappy idea to move the logic into rewriter stage (if we have no technical reason here)? Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Craig Ringer > Sent: Tuesday, January 28, 2014 12:35 PM > To: Simon Riggs; Dean Rasheed > Cc: Kaigai, Kouhei(海外, 浩平); Tom Lane; PostgreSQL Hackers; Kohei KaiGai; > Robert Haas > Subject: Re: [HACKERS] WIP patch (v2) for updatable security barrier views > > On 01/28/2014 12:09 AM, Simon Riggs wrote: > > On 27 January 2014 15:04, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > > >> So for example, when planning the query to update an inheritance > >> child, the rtable will contain an RTE for the parent, but it will not > >> be referenced in the parse tree, and so it will not be expanded while > >> planning the child update. > > > > Am I right in thinking that we have this fully working now? > > I haven't found any further problems, though I've been focusing more on > reworking RLS on top of it. > > > AFAICS the only area of objection is the handling of inherited > > relations, which occurs within the planner in the current patch. I can > > see that would be a cause for concern since the planner is pluggable > > and it would then be possible to bypass security checks. Obviously > > installing a new planner isn't trivial, but doing so shouldn't cause > > collateral damage. > > FWIW, I don't see any way _not_ to do that in RLS. If there are security > quals on a child table, they must be added, and that can only happen once > inheritance expansion happens. That's in the planner. > > I don't see it as acceptable to ignore security quals on child tables, and > if we can't, we've got to do some work in the planner. > > (I'm starting to really loathe inheritance). > > > We have long had restrictions around updateable views. My suggestion > > from here is that we accept the restriction that we cannot yet have > > the 3-way combination of updateable views, security views and views on > > inherited tables. > > That prevents the use of updatable security barrier views over partitioned > tables, and therefore prevents row-security use on inherited tables. > > That seems like a very big thing to close off. I'm perfectly happy having > that limitation for 9.4, we just need to make it possible to remove the > limitation later. > > > Most people aren't using inherited tables > > Again, because we (ab)use them for paritioning, I'm not sure they're as > little-used as I'd like. > > > and people that are have > > special measures in place for their apps. We won't lose much by > > accepting that restriction for 9.4 and re-addressing the issue in a > > later release. > > Yep, I'd be happy with that. > > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make > changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
On 28 January 2014 04:10, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: >> > AFAICS the only area of objection is the handling of inherited >> > relations, which occurs within the planner in the current patch. I can >> > see that would be a cause for concern since the planner is pluggable >> > and it would then be possible to bypass security checks. Obviously >> > installing a new planner isn't trivial, but doing so shouldn't cause >> > collateral damage. >> >> FWIW, I don't see any way _not_ to do that in RLS. If there are security >> quals on a child table, they must be added, and that can only happen once >> inheritance expansion happens. That's in the planner. >> >> I don't see it as acceptable to ignore security quals on child tables, and >> if we can't, we've got to do some work in the planner. >> >> (I'm starting to really loathe inheritance). >> > Let me ask an elemental question. What is the reason why inheritance > expansion logic should be located on the planner stage, not on the tail > of rewriter? > Reference to a relation with children is very similar to reference of > multiple tables using UNION ALL. Isn't it a crappy idea to move the > logic into rewriter stage (if we have no technical reason here)? I agree that this is being seen the wrong way around. The planner contains things it should not do, and as a result we are now discussing enhancing the code that is in the wrong place, which of course brings objections. I think we would be best served by stopping inheritance in its tracks and killing it off. It keeps getting in the way. What we need is real partitioning. Other uses are pretty obscure and we should re-examine things. In the absence of that, releasing this updateable-security views without suppport for inheritance is a good move. It gives us most of what we want now and continuing to have some form of restriction is better than having a much greater restriction of it not working at all. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Jan 28, 2014 at 5:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > I agree that this is being seen the wrong way around. The planner > contains things it should not do, and as a result we are now > discussing enhancing the code that is in the wrong place, which of > course brings objections. > > I think we would be best served by stopping inheritance in its tracks > and killing it off. It keeps getting in the way. What we need is real > partitioning. Other uses are pretty obscure and we should re-examine > things. I actually think that inheritance is a pretty good foundation for real partitioning. If we were to get rid of it, we'd likely end up needing to add most of the same functionality back when we tried to do some kind of real partitioning later, and that doesn't sound fun. I don't see any reason why some kind of partitioning syntax couldn't be added that leverages the existing inheritance mechanism but stores additional metadata allowing for better optimization. Well... I'm lying, a little bit. If our chosen implementation of "real partitioning" involved some kind of partition objects that could be created and dropped but never directly accessed via DML commands, then we might not need anything that looks like the current planner support for partitioned tables. But I think that would be a surprising choice for this community. IMV, the problem with the planner and inheritance isn't that there's too much cruft in there already, but that there are still key optimizations that are missing. Still, I'd rather try to fix that than start over. > In the absence of that, releasing this updateable-security views > without suppport for inheritance is a good move. It gives us most of > what we want now and continuing to have some form of restriction is > better than having a much greater restriction of it not working at > all. -1. Inheritance may be a crappy substitute for real partitioning, but there are plenty of people using it that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Kouhei Kaigai <kaigai@ak.jp.nec.com> writes: > Let me ask an elemental question. What is the reason why inheritance > expansion logic should be located on the planner stage, not on the tail > of rewriter? I think it's mostly historical. You would however have to think of a way to preserve the inheritance relationships in the parsetree representation. In the current code, expand_inherited_tables() adds AppendRelInfo nodes to the planner's data structures as it does the expansion; but I don't think AppendRelInfo is a suitable structure for the rewriter to work with, and in any case there's no place to put it in the Query representation. Actually though, isn't this issue mostly about inheritance of a query *target* table? Moving that expansion to the rewriter is a totally different and perhaps more tractable change. It's certainly horribly ugly as it is; hard to see how doing it at the rewriter could be worse. regards, tom lane
> Kouhei Kaigai <kaigai@ak.jp.nec.com> writes: > > Let me ask an elemental question. What is the reason why inheritance > > expansion logic should be located on the planner stage, not on the > > tail of rewriter? > > I think it's mostly historical. You would however have to think of a way > to preserve the inheritance relationships in the parsetree representation. > In the current code, expand_inherited_tables() adds AppendRelInfo nodes > to the planner's data structures as it does the expansion; but I don't think > AppendRelInfo is a suitable structure for the rewriter to work with, and > in any case there's no place to put it in the Query representation. > > Actually though, isn't this issue mostly about inheritance of a query > *target* table? Moving that expansion to the rewriter is a totally > different and perhaps more tractable change. It's certainly horribly ugly > as it is; hard to see how doing it at the rewriter could be worse. > It's just an idea, so might not be a deep consideration. Isn't ii available to describe a parse tree as if some UPDATE/DELETE statements are combined with UNION ALL? Of course, even if it is only internal form. UPDATE parent SET x = 2*x, y = y || '_update' WHERE x % 10 = 5 UNION ALL UPDATE child_1 SET x = 2*x, y = y || '_update'WHERE x % 10 = 5 : Right now, only SELECT statement is allowed being placed under set-operations. If rewriter can expand inherited relations into multiple individual selects with UNION ALL, it may be a reasonable internal representation. In similar way, both of UPDATE/DELETE takes a scan on relation once, then it modifies the target relation. Probably, here is no significant different on the earlier half; that performs as if multiple SELECTs with UNION ALL are running, except for it fetches ctid system column as junk attribute and acquires row-level locks. On the other hand, it may need to adjust planner code to construct ModifyTable node running on multiple relations. Existing code pulls resultRelations from AppendRelInfo, doesn't it? It needs to take the list of result relations using recursion of set operations, if not flatten. Once we can construct ModifyTable with multiple relations on behalf of multiple relation's scan, here is no difference from what we're doing now. How about your opinion? Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Wednesday, January 29, 2014 3:43 PM > To: Kaigai, Kouhei(海外, 浩平) > Cc: Craig Ringer; Simon Riggs; Dean Rasheed; PostgreSQL Hackers; Kohei > KaiGai; Robert Haas > Subject: Re: [HACKERS] WIP patch (v2) for updatable security barrier views > > Kouhei Kaigai <kaigai@ak.jp.nec.com> writes: > > Let me ask an elemental question. What is the reason why inheritance > > expansion logic should be located on the planner stage, not on the > > tail of rewriter? > > I think it's mostly historical. You would however have to think of a way > to preserve the inheritance relationships in the parsetree representation. > In the current code, expand_inherited_tables() adds AppendRelInfo nodes > to the planner's data structures as it does the expansion; but I don't think > AppendRelInfo is a suitable structure for the rewriter to work with, and > in any case there's no place to put it in the Query representation. > > Actually though, isn't this issue mostly about inheritance of a query > *target* table? Moving that expansion to the rewriter is a totally > different and perhaps more tractable change. It's certainly horribly ugly > as it is; hard to see how doing it at the rewriter could be worse. > > regards, tom lane
On 01/29/2014 03:34 PM, Kouhei Kaigai wrote: >> Kouhei Kaigai <kaigai@ak.jp.nec.com> writes: >>> Let me ask an elemental question. What is the reason why inheritance >>> expansion logic should be located on the planner stage, not on the >>> tail of rewriter? >> >> I think it's mostly historical. You would however have to think of a way >> to preserve the inheritance relationships in the parsetree representation. >> In the current code, expand_inherited_tables() adds AppendRelInfo nodes >> to the planner's data structures as it does the expansion; but I don't think >> AppendRelInfo is a suitable structure for the rewriter to work with, and >> in any case there's no place to put it in the Query representation. >> >> Actually though, isn't this issue mostly about inheritance of a query >> *target* table? Moving that expansion to the rewriter is a totally >> different and perhaps more tractable change. It's certainly horribly ugly >> as it is; hard to see how doing it at the rewriter could be worse. >> > It's just an idea, so might not be a deep consideration. > > Isn't ii available to describe a parse tree as if some UPDATE/DELETE statements > are combined with UNION ALL? Of course, even if it is only internal form. > > UPDATE parent SET x = 2*x, y = y || '_update' WHERE x % 10 = 5 > UNION ALL > UPDATE child_1 SET x = 2*x, y = y || '_update' WHERE x % 10 = 5 > : > > Right now, only SELECT statement is allowed being placed under set-operations. > If rewriter can expand inherited relations into multiple individual selects > with UNION ALL, it may be a reasonable internal representation. > > In similar way, both of UPDATE/DELETE takes a scan on relation once, then > it modifies the target relation. Probably, here is no significant different > on the earlier half; that performs as if multiple SELECTs with UNION ALL are > running, except for it fetches ctid system column as junk attribute and > acquires row-level locks. > > On the other hand, it may need to adjust planner code to construct > ModifyTable node running on multiple relations. Existing code pulls > resultRelations from AppendRelInfo, doesn't it? It needs to take the list > of result relations using recursion of set operations, if not flatten. > Once we can construct ModifyTable with multiple relations on behalf of > multiple relation's scan, here is no difference from what we're doing now. > > How about your opinion? My worry here is that a fair bit of work gets done before inheritance expansion. Partitioning already performs pretty poorly for anything but small numbers of partitions. If we're expanding inheritance in the rewriter, won't that further increase the already large amount of duplicate work involved in planning a query that involves inheritance? Or am I misunderstanding you? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 28 January 2014 21:28, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Jan 28, 2014 at 5:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> I agree that this is being seen the wrong way around. The planner >> contains things it should not do, and as a result we are now >> discussing enhancing the code that is in the wrong place, which of >> course brings objections. >> >> I think we would be best served by stopping inheritance in its tracks >> and killing it off. It keeps getting in the way. What we need is real >> partitioning. Other uses are pretty obscure and we should re-examine >> things. > > I actually think that inheritance is a pretty good foundation for real > partitioning. If we were to get rid of it, we'd likely end up needing > to add most of the same functionality back when we tried to do some > kind of real partitioning later, and that doesn't sound fun. I don't > see any reason why some kind of partitioning syntax couldn't be added > that leverages the existing inheritance mechanism but stores > additional metadata allowing for better optimization. > > Well... I'm lying, a little bit. If our chosen implementation of > "real partitioning" involved some kind of partition objects that could > be created and dropped but never directly accessed via DML commands, > then we might not need anything that looks like the current planner > support for partitioned tables. But I think that would be a > surprising choice for this community. IMV, the problem with the > planner and inheritance isn't that there's too much cruft in there > already, but that there are still key optimizations that are missing. > Still, I'd rather try to fix that than start over. > >> In the absence of that, releasing this updateable-security views >> without suppport for inheritance is a good move. It gives us most of >> what we want now and continuing to have some form of restriction is >> better than having a much greater restriction of it not working at >> all. > > -1. Inheritance may be a crappy substitute for real partitioning, but > there are plenty of people using it that way. When I talk about removing inheritance, of course I see some need for partitioning. Given the way generalised inheritance works, it is possible to come up with some monstrous designs that are then hard to rewrite and plan. What I propose is that we remove the user-visible generalised inheritance feature and only allow a more structured version which we call partitioning. If all target/result relations are identical it will be much easier to handle things because there'll be no special purpose code to juggle. Yes, key optimizations are missing. Overburdening ourselves with complications that slow us down from delivering more useful features is sensible. Think of it as feature-level refactoring, rather than the normal code-level refactoring we frequently discuss. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 29 January 2014 06:43, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Actually though, isn't this issue mostly about inheritance of a query > *target* table? Exactly. IMHO updateable views on inheritance sets will have multiple other performance problems, so trying to support them here will not make their usage seamless. We allowed updateable views to work with restrictions in earlier releases, so I can't see why continuing with a much reduced restriction would be a problem in this release. We don't need to remove the remaining restriction all in one release, so we? > Moving that expansion to the rewriter is a totally > different and perhaps more tractable change. It's certainly horribly ugly > as it is; hard to see how doing it at the rewriter could be worse. I see the patch adding some changes to inheritance_planner that might well get moved to rewriter. As long as the ugliness all stays in one place, does it matter where that is -- for this patch -- ? Just asking whether moving it will reduce the net ugliness of our inheritance support. @Craig: I don't think this would have much effect on partitioning performance. The main cost of that is constraint exclusion, which we wuld still perform only once. All other copying and re-jigging still gets performed even for excluded relations, so doing it earlier wouldn't hurt as much as you might think. @Dean: If we moved that to rewriter, would expand_security_quals() and preprocess_rowmarks() also move? -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/23/2014 06:06 PM, Dean Rasheed wrote: > On 21 January 2014 09:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> Yes, please review the patch from 09-Jan >> (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg18ToTggP8zOBq6QnQHQ@mail.gmail.com). >> > > After further testing I found a bug --- it involves having a security > barrier view on top of a base relation that has a rule that rewrites > the query to have a different result relation, and possibly also a > different command type, so that the securityQuals are no longer on the > result relation, which is a code path not previously tested and the > rowmark handling was wrong. That's probably a pretty obscure case in > the context of security barrier views, but that code path would be > used much more commonly if RLS were built on top of this. Fortunately > the fix is trivial --- updated patch attached. This is the most recent patch I see, and the one I've been working on top of. Are there any known tests that this patch fails? Can we construct any tests that this patch fails? If so, can we make it pass them, or error out cleanly? The discussion has gone a bit off the wall a bit - partly my fault I think - I mentioned inheritance. Lets try to refocus on the immediate patch at hand, and whether it's good to go. Right now, I'm not personally aware of tests cases that cause this code to fail. There's a good-taste complaint about handling of inheritance, but frankly, there's not much about inheritance that _is_ good taste. I don't see that this patch makes it worse, and it's functional. It might be interesting to revisit some of these broader questions in 9.5, but what can we do to get this functionality in place for 9.4? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 29 January 2014 11:27, Simon Riggs <simon@2ndquadrant.com> wrote: > On 29 January 2014 06:43, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Actually though, isn't this issue mostly about inheritance of a query >> *target* table? > > Exactly. IMHO updateable views on inheritance sets will have multiple > other performance problems, so trying to support them here will not > make their usage seamless. > > We allowed updateable views to work with restrictions in earlier > releases, so I can't see why continuing with a much reduced > restriction would be a problem in this release. We don't need to > remove the remaining restriction all in one release, so we? > >> Moving that expansion to the rewriter is a totally >> different and perhaps more tractable change. It's certainly horribly ugly >> as it is; hard to see how doing it at the rewriter could be worse. > > I see the patch adding some changes to inheritance_planner that might > well get moved to rewriter. > As long as the ugliness all stays in one place, does it matter where > that is -- for this patch -- ? Just asking whether moving it will > reduce the net ugliness of our inheritance support. > > @Craig: I don't think this would have much effect on partitioning > performance. The main cost of that is constraint exclusion, which we > wuld still perform only once. All other copying and re-jigging still > gets performed even for excluded relations, so doing it earlier > wouldn't hurt as much as you might think. > > @Dean: If we moved that to rewriter, would expand_security_quals() and > preprocess_rowmarks() also move? > Actually I tend to think that expand_security_quals() should remain where it is, regardless of where inheritance expansion happens. One of the things that simplifies the job that expand_security_quals() has to do is that it takes place after preprocess_targetlist(), so the targetlist for the security barrier subqueries that it constructs is known. Calling expand_security_quals() earlier would require additional surgery on preprocess_targetlist() and expand_targetlist(), and would probably also mean that the Query would need to record the sourceRelation subquery RTE, as discussed upthread. Regards, Dean
On 29 January 2014 11:34, Craig Ringer <craig@2ndquadrant.com> wrote: > On 01/23/2014 06:06 PM, Dean Rasheed wrote: >> On 21 January 2014 09:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >>> Yes, please review the patch from 09-Jan >>> (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg18ToTggP8zOBq6QnQHQ@mail.gmail.com). >>> >> >> After further testing I found a bug --- it involves having a security >> barrier view on top of a base relation that has a rule that rewrites >> the query to have a different result relation, and possibly also a >> different command type, so that the securityQuals are no longer on the >> result relation, which is a code path not previously tested and the >> rowmark handling was wrong. That's probably a pretty obscure case in >> the context of security barrier views, but that code path would be >> used much more commonly if RLS were built on top of this. Fortunately >> the fix is trivial --- updated patch attached. > > This is the most recent patch I see, and the one I've been working on > top of. > > Are there any known tests that this patch fails? > None that I've been able to come up with. > Can we construct any tests that this patch fails? If so, can we make it > pass them, or error out cleanly? > Sounds sensible. Feel free to add any test cases you think up to the wiki page. Even if we don't like this design, any alternative must at least pass all the tests listed there. https://wiki.postgresql.org/wiki/Making_security_barrier_views_automatically_updatable Regards, Dean
On 29 January 2014 06:43, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Kouhei Kaigai <kaigai@ak.jp.nec.com> writes: >> Let me ask an elemental question. What is the reason why inheritance >> expansion logic should be located on the planner stage, not on the tail >> of rewriter? > > I think it's mostly historical. You would however have to think of a > way to preserve the inheritance relationships in the parsetree > representation. In the current code, expand_inherited_tables() adds > AppendRelInfo nodes to the planner's data structures as it does the > expansion; but I don't think AppendRelInfo is a suitable structure > for the rewriter to work with, and in any case there's no place to > put it in the Query representation. > > Actually though, isn't this issue mostly about inheritance of a query > *target* table? Moving that expansion to the rewriter is a totally > different and perhaps more tractable change. It's certainly horribly ugly > as it is; hard to see how doing it at the rewriter could be worse. > That's interesting. Presumably then we could make rules work properly on inheritance children. I'm not sure if anyone has actually complained that that currently doesn't work. Thinking about that though, it does potentially open up a whole other can of worms --- a single update query could be turned into multiple other queries of different command types. Perhaps that's not so different from what currently happens in the rewriter, except that you'd need a way to track which of those queries counts towards the statement's final row count. And how many ModifyTable nodes would the resulting plan have? Regards, Dean
On 29 January 2014 12:20, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> @Dean: If we moved that to rewriter, would expand_security_quals() and >> preprocess_rowmarks() also move? >> > > Actually I tend to think that expand_security_quals() should remain > where it is, regardless of where inheritance expansion happens. One of > the things that simplifies the job that expand_security_quals() has to > do is that it takes place after preprocess_targetlist(), so the > targetlist for the security barrier subqueries that it constructs is > known. Calling expand_security_quals() earlier would require > additional surgery on preprocess_targetlist() and expand_targetlist(), > and would probably also mean that the Query would need to record the > sourceRelation subquery RTE, as discussed upthread. Looks to me that preprocess_targetlist() could be done earlier also, if necessary, since it appears to contain checks and additional columns, not anything related to optimization. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 01/29/2014 08:29 PM, Dean Rasheed wrote: > On 29 January 2014 11:34, Craig Ringer <craig@2ndquadrant.com> wrote: >> On 01/23/2014 06:06 PM, Dean Rasheed wrote: >>> On 21 January 2014 09:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >>>> Yes, please review the patch from 09-Jan >>>> (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg18ToTggP8zOBq6QnQHQ@mail.gmail.com). >>>> >>> >>> After further testing I found a bug --- it involves having a security >>> barrier view on top of a base relation that has a rule that rewrites >>> the query to have a different result relation, and possibly also a >>> different command type, so that the securityQuals are no longer on the >>> result relation, which is a code path not previously tested and the >>> rowmark handling was wrong. That's probably a pretty obscure case in >>> the context of security barrier views, but that code path would be >>> used much more commonly if RLS were built on top of this. Fortunately >>> the fix is trivial --- updated patch attached. >> >> This is the most recent patch I see, and the one I've been working on >> top of. >> >> Are there any known tests that this patch fails? >> > > None that I've been able to come up with. I've found an issue. I'm not sure if it can be triggered from SQL, but it affects in-code users who add their own securityQuals. expand_security_quals fails to update any rowMarks that may point to a relation being expanded. If the relation isn't the target relation, this causes a rowmark to refer to a RTE with no relid, and thus failure in the executor. Relative patch against updatable s.b. views attached (for easier reading), along with a new revision of updatable s.b. views that incorporates the patch. This isn't triggered by FOR SHARE / FOR UPDATE rowmark on a security barrier view because the flattening to securityQuals and re-expansion in the optimizer is only done for _target_ security barrier views. For target views, different rowmark handling applies, so they don't trigger it either. This is triggered by adding securityQuals to a non-target relation during row-security processing, though. > Sounds sensible. Feel free to add any test cases you think up to the > wiki page. Even if we don't like this design, any alternative must at > least pass all the tests listed there. Eh, much better to add directly to src/regress/ IMO. If they're on the wiki they can get overlooked/forgotten. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 30 January 2014 05:36, Craig Ringer <craig@2ndquadrant.com> wrote: > On 01/29/2014 08:29 PM, Dean Rasheed wrote: >> On 29 January 2014 11:34, Craig Ringer <craig@2ndquadrant.com> wrote: >>> On 01/23/2014 06:06 PM, Dean Rasheed wrote: >>>> On 21 January 2014 09:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >>>>> Yes, please review the patch from 09-Jan >>>>> (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg18ToTggP8zOBq6QnQHQ@mail.gmail.com). >>>>> >>>> >>>> After further testing I found a bug --- it involves having a security >>>> barrier view on top of a base relation that has a rule that rewrites >>>> the query to have a different result relation, and possibly also a >>>> different command type, so that the securityQuals are no longer on the >>>> result relation, which is a code path not previously tested and the >>>> rowmark handling was wrong. That's probably a pretty obscure case in >>>> the context of security barrier views, but that code path would be >>>> used much more commonly if RLS were built on top of this. Fortunately >>>> the fix is trivial --- updated patch attached. >>> >>> This is the most recent patch I see, and the one I've been working on >>> top of. >>> >>> Are there any known tests that this patch fails? >>> >> >> None that I've been able to come up with. > > I've found an issue. I'm not sure if it can be triggered from SQL, but > it affects in-code users who add their own securityQuals. > > expand_security_quals fails to update any rowMarks that may point to a > relation being expanded. If the relation isn't the target relation, this > causes a rowmark to refer to a RTE with no relid, and thus failure in > the executor. > > Relative patch against updatable s.b. views attached (for easier > reading), along with a new revision of updatable s.b. views that > incorporates the patch. > > This isn't triggered by FOR SHARE / FOR UPDATE rowmark on a security > barrier view because the flattening to securityQuals and re-expansion in > the optimizer is only done for _target_ security barrier views. For > target views, different rowmark handling applies, so they don't trigger > it either. > > This is triggered by adding securityQuals to a non-target relation > during row-security processing, though. > Hmm, looks like this is a pre-existing bug. The first thing I tried was to reproduce it using SQL, so I used a RULE to turn a DELETE into a SELECT FOR UPDATE. This is pretty dumb, but it shows the problem: CREATE TABLE foo (a int); CREATE RULE foo_del_rule AS ON DELETE TO foo DO INSTEAD SELECT * FROM foo FOR UPDATE; DELETE FROM foo; ERROR: no relation entry for relid 1 So I think this should be fixed independently of the updatable s.b. view code. Regards, Dean
On 30 January 2014 11:55, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > Hmm, looks like this is a pre-existing bug. > > The first thing I tried was to reproduce it using SQL, so I used a > RULE to turn a DELETE into a SELECT FOR UPDATE. This is pretty dumb, > but it shows the problem: > > CREATE TABLE foo (a int); > CREATE RULE foo_del_rule AS ON DELETE TO foo > DO INSTEAD SELECT * FROM foo FOR UPDATE; > DELETE FROM foo; > > ERROR: no relation entry for relid 1 > > So I think this should be fixed independently of the updatable s.b. view code. Looks to me there isn't much use case for turning DML into a SELECT - where would we expect the output to go to if the caller wasn't prepared to handle the result rows? IMHO we should simply prohibit such cases rather than attempt to fix the fact they don't work. We know for certain nobody relies on this behaviour because its broken already. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 30 January 2014 05:36, Craig Ringer <craig@2ndquadrant.com> wrote: > On 01/29/2014 08:29 PM, Dean Rasheed wrote: >> On 29 January 2014 11:34, Craig Ringer <craig@2ndquadrant.com> wrote: >>> On 01/23/2014 06:06 PM, Dean Rasheed wrote: >>>> On 21 January 2014 09:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >>>>> Yes, please review the patch from 09-Jan >>>>> (http://www.postgresql.org/message-id/CAEZATCUiKxOg=vOOvjA2S6G-sixzzxg18ToTggP8zOBq6QnQHQ@mail.gmail.com). >>>>> >>>> >>>> After further testing I found a bug --- it involves having a security >>>> barrier view on top of a base relation that has a rule that rewrites >>>> the query to have a different result relation, and possibly also a >>>> different command type, so that the securityQuals are no longer on the >>>> result relation, which is a code path not previously tested and the >>>> rowmark handling was wrong. That's probably a pretty obscure case in >>>> the context of security barrier views, but that code path would be >>>> used much more commonly if RLS were built on top of this. Fortunately >>>> the fix is trivial --- updated patch attached. >>> >>> This is the most recent patch I see, and the one I've been working on >>> top of. >>> >>> Are there any known tests that this patch fails? >>> >> >> None that I've been able to come up with. > > I've found an issue. I'm not sure if it can be triggered from SQL, but > it affects in-code users who add their own securityQuals. > It's difficult to fix this kind of issue without a reproducible test case, but... > expand_security_quals fails to update any rowMarks that may point to a > relation being expanded. If the relation isn't the target relation, this > causes a rowmark to refer to a RTE with no relid, and thus failure in > the executor. > > Relative patch against updatable s.b. views attached (for easier > reading), along with a new revision of updatable s.b. views that > incorporates the patch. > I don't like this fix --- you appear to be adding another RTE to the rangetable (one not in the FROM list) and applying the rowmarks to it, which seems wrong because you're not locking the right set of rows. This is reflected in the change to the regression test output where, in one of the tests, the ctids for the table to update are no longer coming from the same table. I think a better approach is to push down the rowmark into the subquery so that any locking required applies to the pushed down RTE --- see the attached patch. This is an alternative (and more general) fix to the problem I found upthread (http://www.postgresql.org/message-id/CAEZATCVAqJV5WTjLmyObP21n+CzhbEx2AOzH4e6qmTcueVDjdQ@mail.gmail.com) with rules on the base table, but hopefully it will solve the issue you're seeing too. It doesn't change any of the existing regression test output, but neither does it add any new tests, since I can't see a way to provoke your issue in isolation using SQL without first running into the pre-existing bug with rules that turn DML into SELECT ... FOR UPDATE. Anyway, please test if this works with your RLS code. Regards, Dean
Attachment
On 01/31/2014 05:09 PM, Dean Rasheed wrote: > I don't like this fix --- you appear to be adding another RTE to the > rangetable (one not in the FROM list) and applying the rowmarks to it, > which seems wrong because you're not locking the right set of rows. > This is reflected in the change to the regression test output where, > in one of the tests, the ctids for the table to update are no longer > coming from the same table. I think a better approach is to push down > the rowmark into the subquery so that any locking required applies to > the pushed down RTE --- see the attached patch. Thankyou. I wasn't able to detect any changes in behaviour caused by the original change other than the table alias change due to the additional RTE. The new RTE referred to the same Relation, and there didn't seem to be any problems caused by that. Nonetheless, your fix seems a lot cleaner, especially in the target-view case. I've updated the commitfest app to show this patch. > Anyway, please test if this works with your RLS code. It does. Thankyou. I'm still working on the other issues I outlined yesterday, but that's for the other thread. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
I've found an issue with updatable security barrier views. Locking is being pushed down into the subquery. Locking is thus applied before user-supplied quals are, so we potentially lock too many rows. I'm looking into the code now, but in the mean time, here's a demo of the problem: regress=> CREATE TABLE t1(x integer, y integer); CREATE TABLE regress=> INSERT INTO t1(x,y) VALUES (1,1), (2,2), (3,3), (4,4); INSERT 0 4 regress=> CREATE VIEW v1 WITH (security_barrier) AS SELECT x, y FROM t1 WHERE x % 2 = 0; CREATE VIEW regress=> EXPLAIN SELECT * FROM v1 FOR UPDATE; QUERY PLAN -----------------------------------------------------------------------LockRows (cost=0.00..42.43 rows=11 width=40) -> Subquery Scan on v1 (cost=0.00..42.32 rows=11 width=40) -> LockRows (cost=0.00..42.21 rows=11 width=14) -> Seq Scan on t1 (cost=0.00..42.10 rows=11 width=14) Filter: ((x % 2) = 0)Planning time: 0.140ms (6 rows) or, preventing pushdown with a wrapper function to demonstrate the problem: regress=> CREATE FUNCTION is_one(integer) RETURNS boolean AS $$ DECLARE result integer; BEGIN SELECT $1 = 1 regress=> EXPLAIN SELECT * FROM v1 WHERE is_one(x) FOR UPDATE; QUERY PLAN -----------------------------------------------------------------------LockRows (cost=0.00..45.11 rows=4 width=40) -> Subquery Scan on v1 (cost=0.00..45.07 rows=4 width=40) Filter: is_one(v1.x) -> LockRows (cost=0.00..42.21rows=11 width=14) -> Seq Scan on t1 (cost=0.00..42.10 rows=11 width=14) Filter: ((x % 2) = 0)Planning time: 0.147 ms (7 rows) OK, so it looks like the code: /* * Now deal with any PlanRowMark on this RTE by requesting a lock * of the same strength on the RTE copied down to the subquery. */ rc = get_plan_rowmark(root->rowMarks,rt_index); if (rc != NULL) { switch (rc->markType) { /* .... */ } root->rowMarks = list_delete(root->rowMarks, rc); } isn't actually appropriate. We should _not_ be pushing locking down into the subquery. Instead, we should be retargeting the rowmark so it points to the new subquery RTE, marking rows _after_filtering. We want a plan like: regress=> EXPLAIN SELECT * FROM v1 WHERE is_one(x) FOR UPDATE; QUERY PLAN -----------------------------------------------------------------------LockRows (cost=0.00..45.11 rows=4 width=40) -> Subquery Scan on v1 (cost=0.00..45.07 rows=4 width=40) Filter: is_one(v1.x) -> Seq Scan on t1 (cost=0.00..42.10rows=11 width=14) Filter: ((x % 2) = 0)Planning time: 0.147 ms (7 rows) I'm not too sure what the best way to do that is. Time permitting I'll see if I can work out the RowMark code and set something up. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 10 March 2014 03:36, Craig Ringer <craig@2ndquadrant.com> wrote: > I've found an issue with updatable security barrier views. Locking is > being pushed down into the subquery. Locking is thus applied before > user-supplied quals are, so we potentially lock too many rows. > > I'm looking into the code now, but in the mean time, here's a demo of > the problem: > > > > regress=> CREATE TABLE t1(x integer, y integer); > CREATE TABLE > regress=> INSERT INTO t1(x,y) VALUES (1,1), (2,2), (3,3), (4,4); > INSERT 0 4 > regress=> CREATE VIEW v1 WITH (security_barrier) AS SELECT x, y FROM t1 > WHERE x % 2 = 0; > CREATE VIEW > regress=> EXPLAIN SELECT * FROM v1 FOR UPDATE; > QUERY PLAN > ----------------------------------------------------------------------- > LockRows (cost=0.00..42.43 rows=11 width=40) > -> Subquery Scan on v1 (cost=0.00..42.32 rows=11 width=40) > -> LockRows (cost=0.00..42.21 rows=11 width=14) > -> Seq Scan on t1 (cost=0.00..42.10 rows=11 width=14) > Filter: ((x % 2) = 0) > Planning time: 0.140 ms > (6 rows) > That has nothing to do with *updatable* security barrier views, because that's not an update. In fact you get that exact same plan on HEAD, without the updatable security barrier views patch. > > or, preventing pushdown with a wrapper function to demonstrate the problem: > > regress=> CREATE FUNCTION is_one(integer) RETURNS boolean AS $$ DECLARE > result integer; BEGIN SELECT $1 = 1 > > regress=> EXPLAIN SELECT * FROM v1 WHERE is_one(x) FOR UPDATE; > QUERY PLAN > ----------------------------------------------------------------------- > LockRows (cost=0.00..45.11 rows=4 width=40) > -> Subquery Scan on v1 (cost=0.00..45.07 rows=4 width=40) > Filter: is_one(v1.x) > -> LockRows (cost=0.00..42.21 rows=11 width=14) > -> Seq Scan on t1 (cost=0.00..42.10 rows=11 width=14) > Filter: ((x % 2) = 0) > Planning time: 0.147 ms > (7 rows) > > > > > > > OK, so it looks like the code: > > > > /* > * Now deal with any PlanRowMark on this RTE by requesting a > lock > * of the same strength on the RTE copied down to the subquery. > */ > rc = get_plan_rowmark(root->rowMarks, rt_index); > if (rc != NULL) > { > switch (rc->markType) > { > /* .... */ > } > root->rowMarks = list_delete(root->rowMarks, rc); > } > > > isn't actually appropriate. We should _not_ be pushing locking down into > the subquery. > That code isn't being invoked in this test case because you're just selecting from the view, so it's the normal view expansion code, not the new securityQuals expansion code. > Instead, we should be retargeting the rowmark so it points to the new > subquery RTE, marking rows _after_filtering. We want a plan like: > > > > regress=> EXPLAIN SELECT * FROM v1 WHERE is_one(x) FOR UPDATE; > QUERY PLAN > ----------------------------------------------------------------------- > LockRows (cost=0.00..45.11 rows=4 width=40) > -> Subquery Scan on v1 (cost=0.00..45.07 rows=4 width=40) > Filter: is_one(v1.x) > -> Seq Scan on t1 (cost=0.00..42.10 rows=11 width=14) > Filter: ((x % 2) = 0) > Planning time: 0.147 ms > (7 rows) > > > I'm not too sure what the best way to do that is. Time permitting I'll > see if I can work out the RowMark code and set something up. > Agreed, that seems like it would be a saner plan, but the place to look to fix it isn't in the updatable security barriers view patch. Perhaps the updatable security barriers view patch suffers from the same problem, but first I'd like to know what that problem is in HEAD. Regards, Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > On 10 March 2014 03:36, Craig Ringer <craig@2ndquadrant.com> wrote: >> I've found an issue with updatable security barrier views. Locking is >> being pushed down into the subquery. Locking is thus applied before >> user-supplied quals are, so we potentially lock too many rows. > That has nothing to do with *updatable* security barrier views, > because that's not an update. In fact you get that exact same plan on > HEAD, without the updatable security barrier views patch. I got around to looking at this. The proximate reason for the two LockRows nodes is: (1) The parser and rewriter intentionally insert RowMarkClauses into both the outer query and the subquery when a FOR UPDATE/SHARE applies to a subquery-in-FROM. The comment in transformLockingClause explains why: * FOR UPDATE/SHARE of subquery is propagated to all of * subquery's rels, too. Wecould do this later (based on * the marking of the subquery RTE) but it is convenient * to have local knowledge in each query level about which * rels need to be opened with RowShareLock. that is, if we didn't push down the RowMarkClause, processing of the subquery would be at risk of opening relations with too weak a lock. In the example case, this pushdown is actually done by the rewriter's markQueryForLocking function, but it's just emulating what the parser would have done if the view query had been written in-line. (2) The planner doesn't consider this, and generates a LockRows plan node for each level of the query, since both of them have rowMarks. However, I'm not sure this is a bug, or at least it's not the same bug you think it is. The lower LockRows node is doing a ROW_MARK_EXCLUSIVE mark on the physical table rows, while the upper one is doing a ROW_MARK_COPY on the virtual rows emitted by the subquery. *These are not the same thing*. A ROW_MARK_COPY isn't a lock of any sort; it's just there to allow EvalPlanQual to see the row that was emitted from the subquery, in case a recheck has to be done during a Read Committed UPDATE/DELETE. Since this is a pure SELECT, the upper "locking" action is useless, and arguably the planner ought to be smart enough not to emit it. But it's just wasting some cycles, it's not changing any semantics. So if you wanted user-supplied quals to limit which rows get locked physically, they would need to be applied before the lower LockRows node. To my mind, it's not immediately apparent that that is a reasonable expectation. The entire *point* of a security_barrier view is that unsafe quals don't get pushed into it, so why would you expect that those quals get applied early enough to limit locking? regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Dean Rasheed <dean.a.rasheed@gmail.com> writes: > > On 10 March 2014 03:36, Craig Ringer <craig@2ndquadrant.com> wrote: > >> I've found an issue with updatable security barrier views. Locking is > >> being pushed down into the subquery. Locking is thus applied before > >> user-supplied quals are, so we potentially lock too many rows. > > > That has nothing to do with *updatable* security barrier views, > > because that's not an update. In fact you get that exact same plan on > > HEAD, without the updatable security barrier views patch. > > I got around to looking at this. Thanks! > So if you wanted user-supplied quals to limit which rows get locked > physically, they would need to be applied before the lower LockRows node. Right. > To my mind, it's not immediately apparent that that is a reasonable > expectation. The entire *point* of a security_barrier view is that > unsafe quals don't get pushed into it, so why would you expect that > those quals get applied early enough to limit locking? Right, we don't want unsafe quals to get pushed down below the security_barrier view. The point here is that those quals should not be able to lock rows which they don't even see. My understanding of the issue was that the unsafe quals were being able to see and lock rows that they shouldn't know exist. If that's not the case then I don't think there's a bug here..? Craig/Dean, can you provide a complete test case which shows the issue? Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> So if you wanted user-supplied quals to limit which rows get locked >> physically, they would need to be applied before the lower LockRows node. > Right. >> To my mind, it's not immediately apparent that that is a reasonable >> expectation. The entire *point* of a security_barrier view is that >> unsafe quals don't get pushed into it, so why would you expect that >> those quals get applied early enough to limit locking? > Right, we don't want unsafe quals to get pushed down below the > security_barrier view. The point here is that those quals should not be > able to lock rows which they don't even see. That sounds a bit confused. The quals aren't getting to see rows they shouldn't be allowed to see. The issue rather is whether rows that the user quals would exclude might get locked anyway because the locking happens before those quals are checked. [ thinks for awhile... ] I guess the thing that is surprising is that ordinarily, if you say SELECT ... WHERE ... FOR UPDATE, only rows passing the WHERE clause get locked. If there's a security view involved, that gets reversed (at least for unsafe clauses). So from that standpoint it does seem pretty bogus. I'm not sure how much we can do about it given the current implementation technique for security views. A physical row lock requires access to the source relation and the ctid column, neither of which are visible at all outside an un-flattened subquery. Anyway, this is definitely a pre-existing issue with security_barrier views (or really with any unflattenable subquery), and so I'm not sure if it has much to do with the patch at hand. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > That sounds a bit confused. It was- I clearly hadn't followed the thread entirely. > The quals aren't getting to see rows they > shouldn't be allowed to see. The issue rather is whether rows that the > user quals would exclude might get locked anyway because the locking > happens before those quals are checked. This, imv, moves this from a 'major' bug to more of a 'performance' or 'obnoxious' side-effect issue that we should address *eventually*, but not something to hold up the RLS implementation. We often have more locking than is strictly required and then reduce it later (see also: ALTER TABLE lock reduction thread). It will be an unfortunate "gotcha" for a release or two, but hopefully we'll be able to address it... > [ thinks for awhile... ] I guess the thing that is surprising is that > ordinarily, if you say SELECT ... WHERE ... FOR UPDATE, only rows passing > the WHERE clause get locked. If there's a security view involved, that > gets reversed (at least for unsafe clauses). So from that standpoint > it does seem pretty bogus. I'm not sure how much we can do about it > given the current implementation technique for security views. A physical > row lock requires access to the source relation and the ctid column, > neither of which are visible at all outside an un-flattened subquery. Interesting. I'm trying to reason out why we don't have it already in similar situations; we can't *always* flatten a subquery... Updatable security_barrier views and RLS may make this a more promenint issue but hopefully that'll just encourage work on fixing it (perhaps take the higher level lock and then figure out a way to drop it when the rows don't match the later quals..?). > Anyway, this is definitely a pre-existing issue with security_barrier > views (or really with any unflattenable subquery), and so I'm not sure > if it has much to do with the patch at hand. Agreed. Thanks! Stephen
Stephen Frost <sfrost@snowman.net> writes: > Interesting. I'm trying to reason out why we don't have it already in > similar situations; we can't *always* flatten a subquery... I think we do, just nobody's particularly noticed (which further reduces the urgency of finding a solution, I suppose). regards, tom lane
On 4/9/14 9:56 PM, Stephen Frost wrote: > As for docs and testing, those are things we would certainly be better > off with and may mean that this isn't able to make it into 9.4, which is > fair, but I wouldn't toss it out solely due to that. We have a git repo with multiple worked out code examples, ones that have been in active testing for months now. It's private just to reduce mistakes as things are cleared for public consumption, but I (and Mark and Jeff here) can pull anything we like from it to submit to hackers. There's also a test case set from Yeb Havinga he used for his review. I expected to turn at least one of those into a Postgres regression test. The whole thing squealed to a stop when the regression tests Craig was working on were raising multiple serious questions. I didn't see much sense in bundling more boring, passing tests when the ones we already had seemed off--and no one was sure why. Now that Tom has given some guidance on the row locking weirdness, maybe it's time for me to start updating those into make check form. The documentation has been in a similar holding pattern. I have lots of resources to help document what does and doesn't work here to the quality expected in the manual. I just need a little more confidence about which feature set is commit worthy. The documentation that makes sense is very different if only updatable security barrier views is committed. -- Greg Smith greg.smith@crunchydatasolutions.com Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/
Dean, Craig, all, * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: > This is reflected in the change to the regression test output where, > in one of the tests, the ctids for the table to update are no longer > coming from the same table. I think a better approach is to push down > the rowmark into the subquery so that any locking required applies to > the pushed down RTE --- see the attached patch. I'm working through this patch and came across a few places where I wanted to ask questions (as much for my own edification as questioning the actual implementation). Also, feel free to point out if I'm bringing up something which has already been discussed. I've been trying to follow the discussion but it's been a while and my memory may have faded. While in the planner, we need to address the case of a child RTE which has been transformed from a relation to a subquery. That all makes perfect sense, but I'm wondering if it'd be better to change this conditional: ! if (rte1->rtekind == RTE_RELATION && ! rte1->securityQuals != NIL && ! rte2->rtekind == RTE_SUBQUERY) which essentially says "if a relation was changed to a subquery *and* it has security quals then we need to update the entry" into one like this: ! if (rte1->rtekind == RTE_RELATION && ! rte2->rtekind == RTE_SUBQUERY) ! { ! Assert(rte1->securityQuals != NIL); ! ... which changes it to "if a relation was changed to a subquery, it had better be because it's got securityQuals that we're dealing with". My general thinking here is that we'd be better off with the Assert() firing during some later development which changes things in this area than skipping the change because there aren't any securityQuals and then expecting everything to be fine with the subquery actually being a relation.. I could see flipping that around too, to check if there are securityQuals and then Assert() on the change from relation to subquery- after all, if there are securityQuals then it *must* be a subquery, right? A similar case exists in prepunion.c where we're checking if we should recurse while in adjust_appendrel_attrs_mutator()- the check exists as ! if (IsA(node, Query)) (... which used to be an Assert(!IsA(node, Query)) ...) but the comment is then quite clear that we should only be doing this in the security-barrier case; perhaps we should Assert() there to that effect? It'd certainly make me feel a bit better about removing the two Asserts() which were there; presumably we had to also remove the Assert(!IsA(node, SubLink)) ? Also, it seems like there should be a check_stack_depth() call here now? That covers more-or-less everything outside of prepsecurity.c itself. I'm planning to review that tomorrow night. In general, I'm pretty happy with the shape of this. The wiki and earlier discussions were quite useful. My one complaint about this is that it feels like a few more comments and more documentation updates would be warrented; and in particular we need to make note of the locking "gotcha" in the docs. That's not a "solution", of course, but since we know about it we should probably make sure users are aware. Thanks, Stephen
On 11 April 2014 04:04, Stephen Frost <sfrost@snowman.net> wrote: > Dean, Craig, all, > > * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: >> This is reflected in the change to the regression test output where, >> in one of the tests, the ctids for the table to update are no longer >> coming from the same table. I think a better approach is to push down >> the rowmark into the subquery so that any locking required applies to >> the pushed down RTE --- see the attached patch. > > I'm working through this patch and came across a few places where I > wanted to ask questions (as much for my own edification as questioning > the actual implementation). Also, feel free to point out if I'm > bringing up something which has already been discussed. I've been > trying to follow the discussion but it's been a while and my memory may > have faded. > Thanks for looking at this. > While in the planner, we need to address the case of a child RTE which > has been transformed from a relation to a subquery. That all makes > perfect sense, but I'm wondering if it'd be better to change this > conditional: > > ! if (rte1->rtekind == RTE_RELATION && > ! rte1->securityQuals != NIL && > ! rte2->rtekind == RTE_SUBQUERY) > > which essentially says "if a relation was changed to a subquery *and* > it has security quals then we need to update the entry" into one like > this: > > ! if (rte1->rtekind == RTE_RELATION && > ! rte2->rtekind == RTE_SUBQUERY) > ! { > ! Assert(rte1->securityQuals != NIL); > ! ... > > which changes it to "if a relation was changed to a subquery, it had > better be because it's got securityQuals that we're dealing with". My > general thinking here is that we'd be better off with the Assert() > firing during some later development which changes things in this area > than skipping the change because there aren't any securityQuals and then > expecting everything to be fine with the subquery actually being a > relation.. > Yes, I think that makes sense, and it seems like a sensible check. > I could see flipping that around too, to check if there are > securityQuals and then Assert() on the change from relation to subquery- > after all, if there are securityQuals then it *must* be a subquery, > right? > No, because a relation with securityQuals is only changed to a subquery if it is actually used in the main query (see the check in expand_security_quals). During the normal course of events when working with an inheritance hierarchy, there are RTEs for other children that aren't referred to in the main query for the current inheritance child, and those RTEs are not expanded into subqueries. > A similar case exists in prepunion.c where we're checking if we should > recurse while in adjust_appendrel_attrs_mutator()- the check exists as > > ! if (IsA(node, Query)) > > (... which used to be an Assert(!IsA(node, Query)) ...) > > but the comment is then quite clear that we should only be doing this in > the security-barrier case; perhaps we should Assert() there to that > effect? It'd certainly make me feel a bit better about removing the two > Asserts() which were there; presumably we had to also remove the > Assert(!IsA(node, SubLink)) ? > When I originally wrote those comments, I thought that this change would only apply to securityQuals. Later, following a seemingly unrelated bug involving UPDATEs containing LATERAL references to the target relation (which is now disallowed), I thought that this change might help with that case too, if such UPDATEs were to be allowed again in the future (http://www.postgresql.org/message-id/CAEZATCXpOsF5wZ1XXWQur7G5M52=MwzUaqYE9b0RgqhXvw34Pw@mail.gmail.com). For now though, the comments are correct, and it can only happen with securityQuals. Adding an Assert() to that effect might be a bit fiddly though, because the information required isn't available on the local Node being processed, so I think it would have to add and maintain an additional flag on the mutator context. I'm not sure that it's worth it. > Also, it seems like there should be a check_stack_depth() call here now? > Hm, perhaps. I don't see any such checks in the rewriter though, and I wouldn't expect the call depth here to get any deeper than it had previously done there. > That covers more-or-less everything outside of prepsecurity.c itself. > I'm planning to review that tomorrow night. In general, I'm pretty > happy with the shape of this. The wiki and earlier discussions were > quite useful. My one complaint about this is that it feels like a few > more comments and more documentation updates would be warrented; and in > particular we need to make note of the locking "gotcha" in the docs. > That's not a "solution", of course, but since we know about it we should > probably make sure users are aware. > Am I right in thinking that the "locking gotcha" only happens if you create a security_barrier view conaining a "SELECT ... FOR UPDATE"? If so, that seems like rather a niche case - not that that means we shouldn't warn people about it. Thanks again for looking at this. Regards, Dean
Dean, * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: > On 11 April 2014 04:04, Stephen Frost <sfrost@snowman.net> wrote: > > which changes it to "if a relation was changed to a subquery, it had > > better be because it's got securityQuals that we're dealing with". My > > general thinking here is that we'd be better off with the Assert() > > firing during some later development which changes things in this area > > than skipping the change because there aren't any securityQuals and then > > expecting everything to be fine with the subquery actually being a > > relation.. > > > > Yes, I think that makes sense, and it seems like a sensible check. Ok, I'll change that. > > I could see flipping that around too, to check if there are > > securityQuals and then Assert() on the change from relation to subquery- > > after all, if there are securityQuals then it *must* be a subquery, > > right? > > > > No, because a relation with securityQuals is only changed to a > subquery if it is actually used in the main query (see the check in > expand_security_quals). During the normal course of events when > working with an inheritance hierarchy, there are RTEs for other > children that aren't referred to in the main query for the current > inheritance child, and those RTEs are not expanded into subqueries. Ah, yes, makes sense. > For now though, the comments are correct, and it can only happen with > securityQuals. Adding an Assert() to that effect might be a bit fiddly > though, because the information required isn't available on the local > Node being processed, so I think it would have to add and maintain an > additional flag on the mutator context. I'm not sure that it's worth > it. Yeah, I was afraid that might be the case. No problem. > > Also, it seems like there should be a check_stack_depth() call here now? > > > > Hm, perhaps. I don't see any such checks in the rewriter though, and I > wouldn't expect the call depth here to get any deeper than it had > previously done there. Hmm, ok. I'll think on that one. > > That covers more-or-less everything outside of prepsecurity.c itself. > > I'm planning to review that tomorrow night. In general, I'm pretty > > happy with the shape of this. The wiki and earlier discussions were > > quite useful. My one complaint about this is that it feels like a few > > more comments and more documentation updates would be warrented; and in > > particular we need to make note of the locking "gotcha" in the docs. > > That's not a "solution", of course, but since we know about it we should > > probably make sure users are aware. > > Am I right in thinking that the "locking gotcha" only happens if you > create a security_barrier view conaining a "SELECT ... FOR UPDATE"? If > so, that seems like rather a niche case - not that that means we > shouldn't warn people about it. Hmm, the 'gotcha' I was referring to was the issue discussed upthread around rows getting locked to be updated which didn't pass all the quals (they passed the security_barrier view's, but not the user-supplied ones), which could happen during a normal 'update' against a security_barrier view, right? I didn't think that would require the view definition to be 'FOR UPDATE'; if that's required then it would seem like we're actually doing what the user expects based on their view definition.. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: >> Am I right in thinking that the "locking gotcha" only happens if you >> create a security_barrier view conaining a "SELECT ... FOR UPDATE"? If >> so, that seems like rather a niche case - not that that means we >> shouldn't warn people about it. > Hmm, the 'gotcha' I was referring to was the issue discussed upthread > around rows getting locked to be updated which didn't pass all the quals > (they passed the security_barrier view's, but not the user-supplied > ones), which could happen during a normal 'update' against a > security_barrier view, right? I didn't think that would require the > view definition to be 'FOR UPDATE'; if that's required then it would > seem like we're actually doing what the user expects based on their view > definition.. Yeah, the point of the "gotcha" is that a FOR UPDATE specified *outside* a security-barrier view would act as though it had appeared *inside* the view, since it effectively gets pushed down even though outer quals don't. regards, tom lane
Dean, Craig, all, * Dean Rasheed (dean.a.rasheed@gmail.com) wrote: > On 11 April 2014 04:04, Stephen Frost <sfrost@snowman.net> wrote: > > which changes it to "if a relation was changed to a subquery, it had > > better be because it's got securityQuals that we're dealing with". My > > general thinking here is that we'd be better off with the Assert() > > firing during some later development which changes things in this area > > than skipping the change because there aren't any securityQuals and then > > expecting everything to be fine with the subquery actually being a > > relation.. > > > > Yes, I think that makes sense, and it seems like a sensible check. I've made this change and updated the comments a bit. > For now though, the comments are correct, and it can only happen with > securityQuals. Adding an Assert() to that effect might be a bit fiddly > though, because the information required isn't available on the local > Node being processed, so I think it would have to add and maintain an > additional flag on the mutator context. I'm not sure that it's worth > it. Added a comment to that effect. > > Also, it seems like there should be a check_stack_depth() call here now? > > Hm, perhaps. I don't see any such checks in the rewriter though, and I > wouldn't expect the call depth here to get any deeper than it had > previously done there. I didn't do this. I tend to agree with you, though it'd be interesting to see if it's possible to break things with enough levels... I suspect that if we have a problem there though that it's in existing releases. > Am I right in thinking that the "locking gotcha" only happens if you > create a security_barrier view conaining a "SELECT ... FOR UPDATE"? If > so, that seems like rather a niche case - not that that means we > shouldn't warn people about it. I've added both C-level comments and a new paragraph to the 'updatable views' area of the documentation which attempt to address the issue here- please review and comment. I also went over prepsecurity.c and the regression tests and didn't really find much to complain about there. I should be able to look at the RLS patch tomorrow. Thanks, Stephen
Attachment
All, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Yeah, the point of the "gotcha" is that a FOR UPDATE specified *outside* a > security-barrier view would act as though it had appeared *inside* the > view, since it effectively gets pushed down even though outer quals don't. Alright, I've committed this with an updated note regarding the locking, and a few additional regression tests (which appear to have upset some of the buildfarm- will look at that...). Please let me know if you see any issues. I'm planning to spend more-or-less all of tomorrow looking over the RLS patch. As it's rather large, I'm not sure I'll be able to get through it all, but I'm gonna give it a go. Thanks, Stephen
Greg, * Gregory Smith (gregsmithpgsql@gmail.com) wrote: > Now that Tom has given some guidance on the row locking weirdness, > maybe it's time for me to start updating those into make check form. If you have time, that'd certainly be helpful. > The documentation has been in a similar holding pattern. I have > lots of resources to help document what does and doesn't work here > to the quality expected in the manual. I just need a little more > confidence about which feature set is commit worthy. The > documentation that makes sense is very different if only updatable > security barrier views is committed. Guess I see that a bit differently- the two features, while they might be able to be used together, should both be documented appropriately and at the level that each can be used independently. I'm not sure that documentation about how to build RLS on top of updatable security barrier views w/o actual RLS support being in PG would be something we'd want to include in the PG documentation, which I guess is what you're getting at here. Thanks, Stephen
On 13 April 2014 05:23, Stephen Frost <sfrost@snowman.net> wrote: > Alright, I've committed this with an updated note regarding the locking, > and a few additional regression tests That's awesome. Thank you very much. Regards, Dean