Thread: Proposal for UPDATE: do not insert new tuple on heap if update does not change data
Proposal for UPDATE: do not insert new tuple on heap if update does not change data
From
Gasper Zejn
Date:
Hi, I was wondering if PostgreSQL adds new tuple if data is not changed when using UPDATE. It turns out it does add them and I think it might be beneficial not to add a new tuple in this case, since it causes a great deal of maintenance: updating indexes, vacuuming table and index, also heap fragmentation. How to check: CREATE TABLE foo (pk serial primary key, val text); -- Starting point: two rows. INSERT INTO foo VALUES (1, 'first'); INSERT INTO foo VALUES (2, 'second'); CHECKPOINT; -- Updating row with same value. UPDATE foo SET val = 'second' where pk = 2; CHECKPOINT; -- "Upsert" is the same. INSERT INTO foo VALUES (2, 'second') ON CONFLICT (pk) DO UPDATE SET val = 'second'; CHECKPOINT; If after any checkpoint you look at page data, you can see multiple versions of same row with "second". Unfortunately, I don't believe I can come up with a patch on my own, but will happily offer any further help with testing and ideas. Attached is a script with minimal test case. Kind regards, Gasper Zejn
Attachment
Re: Proposal for UPDATE: do not insert new tuple on heap if update does not change data
From
Konstantin Knizhnik
Date:
Hi,<br /><br /> To eliminate creation of new tuple version in this case it is necessary to check that update actually doesn'tchange the record. <br /> It is not a cheapest test and it seems to be not so good idea to perform it always.<br />But if you fill that in your case there are many "identical" updates, you can always explicitly rewrite query by addingextra check:<br /><pre wrap="">UPDATE foo SET val = 'second' where pk = 2 and val <> 'second';</pre><br /><br/><br /><div class="moz-cite-prefix">On 20.01.2016 12:55, Gasper Zejn wrote:<br /></div><blockquote cite="mid:CAMxXOOEZ4p0J6nke76XSjs1RfPR0mcb0Ddh-vbmAknjSPq-LTw@mail.gmail.com"type="cite"><pre wrap="">Hi, I was wondering if PostgreSQL adds new tuple if data is not changed when using UPDATE. It turns out it does add them and I think it might be beneficial not to add a new tuple in this case, since it causes a great deal of maintenance: updating indexes, vacuuming table and index, also heap fragmentation. How to check: CREATE TABLE foo (pk serial primary key, val text); -- Starting point: two rows. INSERT INTO foo VALUES (1, 'first'); INSERT INTO foo VALUES (2, 'second'); CHECKPOINT; -- Updating row with same value. UPDATE foo SET val = 'second' where pk = 2; CHECKPOINT; -- "Upsert" is the same. INSERT INTO foo VALUES (2, 'second') ON CONFLICT (pk) DO UPDATE SET val = 'second'; CHECKPOINT; If after any checkpoint you look at page data, you can see multiple versions of same row with "second". Unfortunately, I don't believe I can come up with a patch on my own, but will happily offer any further help with testing and ideas. Attached is a script with minimal test case. Kind regards, Gasper Zejn </pre><br /><fieldset class="mimeAttachmentHeader"></fieldset><br /><pre wrap=""> </pre></blockquote><br /><pre class="moz-signature" cols="72">-- Konstantin Knizhnik Postgres Professional: <a class="moz-txt-link-freetext" href="http://www.postgrespro.com">http://www.postgrespro.com</a> The Russian Postgres Company </pre>
Re: Proposal for UPDATE: do not insert new tuple on heap if update does not change data
From
Tom Lane
Date:
Gasper Zejn <zelo.zejn@gmail.com> writes: > I was wondering if PostgreSQL adds new tuple if data is not changed > when using UPDATE. It turns out it does add them and I think it might > be beneficial not to add a new tuple in this case, since it causes a > great deal of maintenance: updating indexes, vacuuming table and > index, also heap fragmentation. This has been discussed in the past, and the conclusion was that expending cycles on every UPDATE to check for this case would be a net loss. How many real applications do no-op updates often enough that it's worth optimizing for? regards, tom lane
Re: Proposal for UPDATE: do not insert new tuple on heap if update does not change data
From
Kevin Grittner
Date:
On Wed, Jan 20, 2016 at 3:55 AM, Gasper Zejn <zelo.zejn@gmail.com> wrote: > I was wondering if PostgreSQL adds new tuple if data is not changed > when using UPDATE. It turns out it does add them and I think it might > be beneficial not to add a new tuple in this case, since it causes a > great deal of maintenance: updating indexes, vacuuming table and > index, also heap fragmentation. If you have one or more tables on which you routinely updated rows to the values they already have, you might want to attach an update trigger using the suppress_redundant_updates_trigger() function. http://www.postgresql.org/docs/current/interactive/functions-trigger.html A better solution, where possible, is to use the WHERE clause to avoid the update attempt where the new values are not distinct from the old ones. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company