Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 |
Date | |
Msg-id | CAM3SWZSc3-ujrTdKp7OMvVxDLPFfqeSh2FYhoh0VV1qZA5sbjQ@mail.gmail.com Whole thread Raw |
In response to | Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
|
List | pgsql-hackers |
On Mon, Mar 2, 2015 at 12:15 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > Hmm. I used a b-tree to estimate the effect that the locking would have in > the UPSERT case, for UPSERT into a table with a b-tree index. But you're > right that for the question of whether this is acceptable for the case of > regular insert into a table with exclusion constraints, other indexams are > more interesting. In a very quick test, the overhead with a single GiST > index seems to be about 5%. IMHO that's still a noticeable slowdown, so > let's try to find a way to eliminate it. Honestly, I don't know why you're so optimistic that this can be fixed, even with that heavyweight lock (for regular inserters + exclusion constraints). My experimental branch, which I showed you privately shows big problems when I simulate upserting with exclusion constraints (so we insert first, handle exclusion violations using the traditional upsert subxact pattern that does work with B-Trees). It's much harder to back out of a heap_update() than it is to back out of a heap_insert(), since we might well be updated a tuple visible to some other MVCC snapshot. Whereas we can super delete a tuple knowing that only a dirty snapshot might have seen it, which bounds the complexity (only wait sites for B-Trees + exclusion constraints need to worry about speculative insertion tokens and so on). My experimental branch works just fine (with a variant jjanes_upsert with subxact looping), until I need to restart an update after a "failed" heap_update() that still returned HeapTupleMayBeUpdated (having super deleted within an ExecUpdate() call). There is no good way to do that for ExecUpdate() that I can see, because an existing, visible row is affected (unlike with ExecInsert()). Even if it was possible, it would be hugely invasive to already very complicated code paths. I continue to believe that the best way forward is to incrementally commit the work by committing ON CONFLICT IGNORE first. That way, speculative tokens will remain strictly the concern of UPSERTers or sessions doing INSERT ... ON CONFLICT IGNORE. Unless, you're happy to have UPDATEs still deadlock, and only fix unprincipled deadlocks for the INSERT case. I don't know why you want to make the patch incremental by "fixing" unprincipled deadlocks in the first place, since you've said that you don't really care about it as a real world problem, and because it now appears to have significant additional difficulties that were not anticipated. Please, let's focus on getting ON CONFLICT IGNORE into a commitable state - that's the best way of incrementally committing this work. Fixing unprincipled deadlocks is not a useful way of incrementally committing this work. I've spent several days producing a prototype that shows the exact nature of the problem. If you think I'm mistaken, and that fixing unprincipled deadlocks first is the right thing to do, please explain why with reference to that prototype. AFAICT, doing things your way is going to add significant additional complexity for *no appreciable benefit*. You've already gotten exactly what you were looking for in every other regard. In particular, ON CONFLICT UPDATE could work with exclusion constraints without any of these problems, because of the way we do the auxiliary update there (we lock the row ahead of updating/qual evaluation). I've bent over backwards to make sure that is the case. Please, throw me a bone here. Thank you -- Peter Geoghegan
pgsql-hackers by date: