INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 |
Date | |
Msg-id | CAM3SWZS9DTFt1=sQT=WFdp5UFkOac2Qc1i5WE-Od4BXJZ=JsCw@mail.gmail.com Whole thread Raw |
Responses |
Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 |
List | pgsql-hackers |
Attached patch series forms what I'm calling V3.0 of the INSERT ... ON CONFLICT IGNORE/UPDATE feature. (Sorry about all the threads. I feel this development justifies a new thread, though.) This features a new way of organizing the code. Heikki and I are now in agreement that the best way of incrementally committing the work is to do ON CONFLICT IGNORE first (perhaps someone else can review ON CONFLICT UPDATE). This patch series is organized so that the first commit adds tests, documentation and code that only relates to the IGNORE variant. RLS support is still split out as a separate commit, which is not intended to actually be committed separately (from the main ON CONFLICT UPDATE commit). As before, I've just done that to highlight that part for the benefit of RLS subject matter experts. The RTE permissions split patch is also broken out, but I believe that there is consensus that that could sensibly be committed on its own. There are some minor changes to the code/documentation itself: * Ran the code through pg_indent. * Documented ON CONFLICT UPDATE as a MERGE alternative under "unsupported SQL features" (sql_features.txt). * Minor tweaks of comments here and there. * Logical decoding fixes. * Call ExecCheckPlanOutput() for an ON CONFLICT UPDATE auxiliary query, too. I should point out that I'm aware of the following open issues around the patch series (most of these apply to the base IGNORE commit, so Heikki should of course look out for them): * Andres and I are still hashing out details of whether or not certain assumptions made around handling super deletion within logical decoding are safe [1]. Consider the decoding stuff suspect for now. * There is still an unresolved semantics issue with unique index inference and non-default opclasses. I think it's sufficient that the available/defined unique indexes dictate our idea of a unique violation (that necessitates taking the alternative path). Even in a world where there exists a non-default opclass with an "equals" operator that does not match that of the default opclass (that is not really the world we live in, because the only counter-example known is made that way specifically to *discourage* its use by users), this seems okay to me. It seems okay to me because surely the relevant definition of equality is the one actually chosen for the available unique index. If there exists an ambiguity for some strange reason (i.e. there are two unique indexes, on the same attribute(s), but with different "equals" operators), then its a costing issue, so the behavior given is essentially non-predictable (it could end up being either...but hey, those are the semantics). I have a very hard time imagining how that could ever be the case, even when we have (say) case insensitive opclasses for the text type. A case insensitive opclass is stricter than a case sensitive opclass. Why would a user ever want both on the same attribute(s) of the same table? Is the user really more or less expecting to never get a unique violation on the non-arbitrating unique index, despite all this? If reviewers are absolutely insistent that this theoretical ambiguity is a problem, we can add an optional CREATE INDEX style opclass specification (I'm already using the IndexElems representation from CREATE INDEX for the inference specification, actually, so that would be easy enough). I really have a hard time believing that the ugliness is a problem for those hypothetical users that eventually consider "equals" operator ambiguity among opclasses among available unique indexes to be a problem. I haven't just gone and implemented this already because I didn't want to document that an opclass specification will be accepted. Since there is literally no reason why anyone would care today, I see no reason to add what IMV would really just be cruft. * I'm flying blind with the SEPostgres changes. Unfortunately, it's not very possible to test SEPostgres on my machine. (Does not apply to IGNORE variant.) As always, the jjanes_upsert tool [2] has proven invaluable with testing this patch (Thanks Jeff!). Reviewers should look to that tool for ideas on how the patch might be broken. It caught a number of race conditions with exclusion constraints in the past, for example. We're in good shape with those now (and have been since V2.3 - see "livelock insurance" comments within the IGNORE commit). Thoughts? [1] http://www.postgresql.org/message-id/20150303111342.GF2579@alap3.anarazel.de [2] https://github.com/petergeoghegan/jjanes_upsert -- Peter Geoghegan
Attachment
pgsql-hackers by date: