Re: Making joins involving ctid work for the benefit of UPSERT - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Making joins involving ctid work for the benefit of UPSERT |
Date | |
Msg-id | CA+TgmoYNszLBSjDVf-bn7ff3dGWsgYNyLZY2y=zN=rZEBfL0vA@mail.gmail.com Whole thread Raw |
In response to | Re: Making joins involving ctid work for the benefit of UPSERT (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: Making joins involving ctid work for the benefit of UPSERT
Re: Making joins involving ctid work for the benefit of UPSERT Re: Making joins involving ctid work for the benefit of UPSERT |
List | pgsql-hackers |
On Sat, Jul 19, 2014 at 10:04 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Fri, Jul 18, 2014 at 11:23 AM, Andres Freund <andres@2ndquadrant.com> wrote: >> Meh. A understandable syntax wouldn't require the pullups with a special >> scan node and such. > > Well, in general ExecModifyTable()/ExecUpdate() trusts the tid passed > to match the qual in the query. Unless you're willing to give up on > the idea of having a qual on the update (other than something like an > ON CONFLICTS qual, obviously) I think you'll probably end up with some > kind of pseudo-scan node anyway, if only for consistency with how > things already work, to give you somewhere to show the Filter in > explain output and so on. ExecScan() probably needs to ExecQual(). > Inserts don't have scan nodes, but updates do, and so it seems pretty > odd to make the "Insert" node (a ModifyTable node) pullup from a > result node (as always), and the "Update" node (also a ModifyTable > node) treat the first ModifyTable node as its own pseudo-scan node, > when in fact we are scanning the heap in a manner not unlike a > conventional update. Or do you envision a second result node where an > update qual might be evaluated? I am not enthused about either > possibility. > > The idea of not having the ability to put a predicate on the update at > all, much like the MySQL thing isn't completely outrageous, but it > certainly isn't going to go down well those that have already > expressed concerns about how much of a foot gun this could be. This all seems completely to one side of Andres's point. I think what he's saying is: don't implement an SQL syntax of the form INSERT ON CONFLICT and let people use that to implement upsert. Instead, directly implement an SQL command called UPSERT or similar. That's long been my intuition as well. I think generality here is not your friend. I'd suggest something like: UPSERT table SET col = value [, ...] WHERE predicate; with semantics like this: 1. Search the table, using any type of scan you like, for a row matching the given predicate. 2. If you find more than one tuple that is visible to your scan, error. 3. If you find exactly one tuple that is visible to your scan, update it. Stop. 4. If you find no tuple that is visible to your scan, but you do find at least one tuple whose xmin has committed and whose xmax has not committed, then (4a) if the isolation level is REPEATABLE READ or higher, error; (4b) if there is more than one such tuple, error; else (4b) update it, in violation of normal MVCC visibility rules. 5. Construct a new tuple using the col/value pairs from the SET clause and try to insert it. If this would fail with a unique index violation, back out and go back to step 1. Having said all that, I believe the INSERT ON CONFLICT syntax is more easily comprehensible than previous proposals. But I still tend to agree with Andres that an explicit UPSERT syntax or something like it, that captures all of the MVCC games inside itself, is likely preferable from a user standpoint, whatever the implementation ends up looking like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: