Re: API change advice: Passing plan invalidation info from the rewriter into the planner? - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: API change advice: Passing plan invalidation info from the rewriter into the planner? |
Date | |
Msg-id | 20140612005917.GD2556@tamriel.snowman.net Whole thread Raw |
In response to | Re: API change advice: Passing plan invalidation info from the rewriter into the planner? (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: API change advice: Passing plan invalidation info from
the rewriter into the planner?
|
List | pgsql-hackers |
* Robert Haas (robertmhaas@gmail.com) wrote: > On Wed, Jun 11, 2014 at 12:23 PM, Stephen Frost <sfrost@snowman.net> wrote: > > This argument could have been made for column-level privileges also, no? > > Not really. First of all, we didn't have security_barrier views at > that time, let alone security barrier views that are auto-updateable. We had security definer functions, even set-returning ones, along with rules and triggers. > That's a really important piece of technology which makes filtering > access via views feasible in ways that really were not feasible in the > past. Secondly, column-level permissions - like every other > currently-existing type of permissions - are declarative. They are an > additional opportunity for the system to say "no" to something it > otherwise would have allowed, but no user-defined code is executed. We could try to avoid calling user-defined code for RLS, but it'd add a whole lot of complexity and as far as I can see and your proposed solution isn't avoiding the user-defined code anyway, so I'm not sure why this solution should be required to meet that. > Row-level security is not a chance for the system to deny access; it's > a chance for user-defined code to take control and perform arbitrary > operations. So the scope of what we're contemplating for row-level > security is really far, far more invasive than what we did for > column-level privileges. In this case the user-defined code needs to return a boolean. We don't currently do anything to prevent it from having side-effects, no, but the same is true with views which incorporate functions. I agree that it makes a difference when compared to column-level privileges, but my point was that we have provided easier ways to do things which were possible using more complicated methods before. Perhaps the risk with RLS is higher but these issues look managable to me and the level of doubt about our ability to provide this feature in a reasonable and principled way that our users will understand surprises me. > > I agree that views, or even security-definer functions, offer a great > > deal of flexibility, and that may be necessary in some use-cases, but I > > fail to see why that means we should avoid providing the mechanics to > > achieve simple and usable RLS akin to what other major RDBMS's have. > > Because we don't have a good design. We're using a design that's found in multiple other RDBMS's and used extensively by certain industries which use those RDBMS's today. I'm certainly open to improving what is found in other systems for PG but I have a hard time seeing this approach as a bad design. Perhaps you're referring to our implementation, in which case I might agree and things like running the quals as the table owner is something which should be considered (I don't know how the other RDBMS's operate in this regard offhand- it'd be good to find out). > I'm not categorically opposed to adding more RLS features to > PostgreSQL and never have been; in fact, I was deeply involved in the > original design of security barrier views and committed the original > patch to add that functionality to PostgreSQL, without which none of > what we're talking about here would be possible. But the > currently-proposed design is very unappealing to me, for the reasons > that I've explained. The right answer to "this feature doesn't > provide anything that we don't already have and will introduce major > new security exposures that haven't been adequate thought" is > debatable, but "well other people have this so we should too" is > definitely not it. How about "it's in high demand by our user base"? In particular, it's being asked for by a *highly* technical section of our user base who uses these capabilities today, with this design, in those other databases. > Craig's patch really hasn't grappled with any of > these thorny definition and security issues; it's just about making > the basic functionality work. That's fine for a POC, but it's not > enough for a feature that the project would be committing to maintain > for the indefinite future. Improving the patch is exactly what I'd like to do, but throwing out the notion that RLS can't be allowed to execute user-defined code is cutting the legs out of the feature completely- particularly with our system where users can create all manner of objects in the system with their own code being run. > >> That's mighty useful for debugging, at least IMHO. And, if you want > >> to have several row-level security policies for different classes of > >> users, just create more than one view and grant different privileges > >> on each. > > > > I'm really not impressed with the idea that RLS should be done with > > multiple different views of the same underlying table. > > Are you equally unimpressed with the idea that RLS as proposed can't > support more than one security policy right now *at all*? Because it > seems to me that either you think multiple RLS policies on a single > table is important (in which case the current patch is inadequate) or > you think it's not important (in which case we need not argue about > whether doing it with multiple views over the same underlying table is > awkward). The current approach allows a nearly unlimited level of flexibility, should the user wish it, by being able to run user-defined code. Perhaps that would be considered 'one policy', but it could certainly take under consideration the calling user, the object being queried (if a function is defined per table, or if we provide a way to get that information in the function), etc. What it wouldn't require is the same object to be queried through different object names, which is what I was principally objecting to. What would it mean to have mutliple RLS policies for a given object? There would have to be some criteria to distinguish which one would be applied, yet that can be handled with the existing design by the user already, if they wish to. Were we to preclude users from being able to have user-defined functions called, then there's quite a bit of additional complexity we'd need to replicate. Per-user policies, per-role policies, a definition of which one applies when, per-source-IP, per-connection-type (SSL vs. non-SSL), per-security-label, etc.. > >> By contrast, it seems to me that every design so far proposed for > >> something that is actually called row-level security - as opposed to > >> commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is* > >> row-level security, is extremely limited. Look back at all the things > >> listed in the previous paragraph; can you do those things easily with > >> the designs that have been proposed? As far as I can see, not really. > > > > I don't feel that RLS will, or even *should*, have the same level of > > flexibility that you can achieve with views and/or security definer > > functions. I expect that, over time, we will add more capabilities to > > it, but it's never going to be able to redefine the contents of a column > > as a view can, nor will it be able to add columns to a table as views > > can. I don't see those as reasons against having support for RLS. > > What this patch is doing is basically allowing a table to really be a > view over itself. I don't agree with this characterization. This patch specifically allows filtering the rows returned from the table, and it intentionally does not allow changing the data. > If we choose to support that, I think it is > absolutely inevitable that people are going to want all the same > options that they would have if they really made a separate view - > separate permissions, WITH CHECK OPTION, all of it. We are already looking at WITH CHECK OPTION-style support, but I disagree that separate permissions or data changing will ever be a part of RLS because then it's no longer RLS. > I find the > contrary argument - that people will only want X amount and no more - > simply not plausible. I'm not sure where you are seeing the requests for this feature from, but where I have heard them it's been to match what exists in other RDBMS's which do not have the capabilities that you're describing users will want- yet RLS is heavily used in those organizations. For the use cases that I've had in the past, RLS-as-defined would be the feature that I want for most tables, with views for joins and data-changing operations. > If it's valuable to have some of those > capabilities in an RLS framework, somebody's going to want all of > them. There's no bright line to divide the things that are valuable > in that context from those that aren't. I see the line quite clearly- RLS is about having a filtering mechanism and that's it. If it isn't filtering the rows (meaning giving back a 'true' or 'false' result for each row) then it's beyond RLS. > > I'm glad to hear your thoughts on the level of granularity which might > > be nice to have with RLS. What would be great is to spend a bit more > > time reviewing what other systems provide in this area and considering > > what makes sense for us. This will also be a feature and an area which > > we will be improving for a long time to come, but we do need this > > capability and we have to start somewhere. > > I think this definitely important. I also think that we should be > careful to study the deficiencies in those other systems and to > clearly call out what value the capabilities we're thinking of adding > to PostgreSQL 9.5 have over the status quo in PostgreSQL 9.4. I'm not > so much arguing that we shouldn't have row-level security as that, in > every way that's really meaningful, we already do. This is not the feeling that the users which I have been working with have, nor does it match my feelings about this. As mentioned in my email to Tom just now, having another object to deal with adds unnecessary complexity and will require application changes potentially to implement over existing tables. > >> There's no way for users who are RLS exempt to turn > >> off their exemption for testing purposes, let alone on a per-table > >> basis. > > > > I don't follow this argument entirely- users can't turn off the existing > > permissions system for testing either, unless an authorized user with > > the correct permissions makes the change to allow it- or the user bumps > > themselves up to superuser, or to a role which has broader permissions, > > both of which would also be possible to do with RLS. > > Sure, but in the existing system, the query either returns the same > results for everybody, or it fails outright with an error. It's > certainly possible to screw up the existing permissions, but this new > thing that's being proposed is much more complicated, because it's not > just whether it works that's at issue, but what results you actually > get. I agree that we'll need to make sure we return the correct answer. There is complexity there, but hopefully we've addressed much or all of that with what we have in 9.4 and this is just adding a simpler and often requested way to use that capability without the need to create and manage another object in the system. > >> There's no way to have multiple RLS policies on a single > >> table. All of those are things that we get "for free" in the > >> view-over-table model, and implementing formal RLS basically requires > >> us to either invent a new RLS-specific way of doing each of those > >> things, or suffer along with a subset of the functionality. Yuck. > > > > What would probably be good is to review the use-cases which the current > > patch already addresses- and we've had good responses from actual users > > who are already playing with the patch and are hearing that it is > > addressing their requirements. > > Yes. And in particular, I think we should have a much clearer > statement than we currently do about the use cases in which it falls > short. I'm happy to have that discussion with the users who are asking for this but in the conversations that I've had to date, updatable s.b. views are not RLS to them and I have to agree- having to maintain twice as many objects in the system which have to be named differently and have permissions which can be distinct from each other (which is something that could be a *problem* if it isn't intended), must both be updated when adding or removing columns, etc, makes that solution quite unappealing. > > You're suggesting that we use views instead, which clearly could run > > someone else's code. Perhaps the user will notice that they're > > selecting from a view instead of a table, but I've never seen a security > > design around making sure that what is being select'd from is a table > > vs. a view. Have you seen applications which implement such a check > > prior to running a query? > > Yes. pg_dump, to name one really important one. I wouldn't be > surprised if graphical clients did something similar - display the > table data for a table, or the view definition for a view. I'm quite sure you can select back the data from a view in every graphical client that exists- and without any warning popping up that you might be running code that someone else wrote. Yes, you can also get the definition of the view in many cases and you can tell if what you're selecting is a view or a table but that doesn't mean people are actively being paranoid about that distinction or worrying about the other cases where user-defined code might be run, even when selecting from a table, in general. > But I > admit to not having checked that. More than that, if I were a DBA, > I'd certainly be darn careful about selecting from untrusted views, > but I expect to be able to read a table, or run pg_dump, without > getting my account hacked. I'd love to hear how you decide which views are trusted and which are not. Last I checked, most serious attacks still come from internal individuals rather than external ones. Don't get me wrong- we definitely have an issue here that it'd be great to find a solution to, as has been discussed extensively, but I don't see RLS as making that problem particularly worse, and really, excluding superusers and having the option for other users to be excluded goes above what we've done to date in other areas. > I don't think restricting what can go into an RLS policy is the right > answer; that to me misses the point. What needs to be restricted is > the possibility that a user will inadvertently run code they didn't > mean to run. I'm glad that you agree that restricting the RLS policy isn't the right answer. I agree that we want to come up with a way to prevent users from running code that isn't safe or isn't intended. I still don't see RLS as making that particularly worse. The system is really nearly unusable in any interactive way if you restrict yourself to operations which can't possibly run any user-defined code today. There have been discussions about ways to possibly improve that, and those ways would need to address the RLS case in addition to the other already existing cases but I don't see that as a signifigant increase in the amount of work required to address that problem (which is already quite large..). > > I don't see this as being an insurmountable issue. I agree that having > > a way for pg_dump to run safely is important and the superuser check > > does address that, given that we don't have a "read-only (and > > everything)" capability today. Once we do (and I surely hope that will > > come sooner rather than later), such a role should also have the 'no > > RLS' bit, as it wouldn't make any sense for such a role anyway. The > > lack of that is not a strike against RLS though. > > It addresses running pg_dump *as the superuser*, but not as a database > owner or just a regular users. If unprivileged user A runs pg_dump -t > some_table_owned_by_user_b, and falls victim to a Trojan horse, that > is going to get reported as a security defect in PostgreSQL. Telling > the person who reports that issue that it's design behavior is not > going to make them happy, or result in good press coverage for > PostgreSQL. We have this problem with psql today, as has been discussed. The fact that pg_dump doesn't happen to have this problem is great but it's no true solution for the problem at hand. > >> 2. There's a danger that the functionality available in the two models > >> will diverge, so that certain things can only be done in one world or > >> the other. > > > > They will always be distinct, intentionally so. > > I think that's an absolutely terrible idea. We do not want to be in > the business of having two parallel systems with slightly different > capabilities and syntax that are providing the same fundamental > functionality. And they are: the proposal for RLS is to make it work > just like a security_barrier view, sharing a common implementation. While RLS could be viewed as providing a subset of what updatable sb views provide, I can see a clear line between the two and, for my part, we should allow users to make their own decision about if they want the complexity involved with maintaining another object in the system to provide the filtering or if they want to implement the filtering and the data manipulation, joins, etc, independently. That's really another big point to be made here- there's value in separating these concerns. Security is a big enough concern and a big enough issue that being able to address it explicitly and with a simple syntax is extremely valuable. RLS as we've been discussing it allows that, while having to include it in more complicated view definitions could make it much more difficult to reason about. I suppose one could define a view for just the filtering and then another view for the data manipulation and joining over top of the other views, but, again, that adds another level of complexity that isn't needed- and you can't be 100% sure that the only thing the supposedly filtering view is doing is *just* filtering unless you audit it regularly. > >> 4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield. > > > > While I agree that we need to consider this, I don't think it will be a > > "minefield", but rather something we need to document and educate our > > users about. If you'd like a "disable-all-RLS" GUC, I'm all for it. > > I would definitely like that. I have proposed it in the past. Great. Thanks, Stephen
pgsql-hackers by date: