Thread: Order of update

Order of update

From

"Peter J. Holzer"

Date:

20 April, 12:10:33

I've just read Laurenz' blog post about the differences between Oracle
and PostgreSQL[1].

One of the differences is that something like

    UPDATE tab SET id = id + 1;

tends to fail on PostgreSQL because the primary key constraint is
checked for every row, so it will stumble over the temporary conflicts.

The solution is to define the constraint as deferrable.
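
As a minimal sketch of that approach (the table name `tab` comes from the example above; the constraint name `tab_pkey` and the sample rows are assumptions for illustration): with a constraint that is DEFERRABLE but still INITIALLY IMMEDIATE, PostgreSQL checks uniqueness at the end of the statement instead of row by row, so the temporary conflicts no longer matter.

```sql
-- Hypothetical table; constraint name and sample rows are illustrative only.
CREATE TABLE tab (
    id integer NOT NULL,
    CONSTRAINT tab_pkey PRIMARY KEY (id) DEFERRABLE INITIALLY IMMEDIATE
);

INSERT INTO tab SELECT g FROM generate_series(1, 5) AS g;

-- Uniqueness is verified once the whole statement has finished, so the
-- intermediate duplicates (new 2 vs. not-yet-updated 2, etc.) are tolerated.
UPDATE tab SET id = id + 1;
```

One trade-off to keep in mind: PostgreSQL requires the columns referenced by a foreign key to belong to a non-deferrable unique or primary key constraint, so this is not a free choice for every table.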

But that got me to thinking about different ways ...

There won't be a conflict if the ids are updated in descending order.
Is there a way to force PostgreSQL to update the rows in a specific
order?

I came up with

    with a as (select id from t where id > 50 order by id desc)
    update t set id = a.id+1 from a where t.id = a.id;

which works in my simple test case, but it doesn't look like it's
guaranteed to work. The implicit join in «update t ... from a» could
produce rows in any order, especially for large tables.

So, is there a better way?

        hjp


[1] https://www.cybertec-postgresql.com/en/comparison-of-the-transaction-systems-of-oracle-and-postgresql/

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Order of update

From

Thiemo Kellner

Date:

20 April, 12:34:56

Very interesting. But is the sort overhead worth it? Why not make the constraint deferrable before the update and
switch back afterwards?

Re: Order of update

From

"Peter J. Holzer"

Date:

20 April, 13:59:06

On 2025-04-20 11:34:56 +0200, Thiemo Kellner wrote:
> Very interesting. But is the sort overhead worth it? Why not make the
> constraint deferrable before the update and switch back afterwards?

Mostly idle curiosity whether that's possible at all.

But there might be other reasons why you want to do updates in a
predictable order. For example to prevent deadlocks.
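
A sketch of the deadlock point, assuming a hypothetical table `t(id, val)` (the `val` column and the id range are not from the thread): if every transaction acquires its row locks in the same key order before updating, sessions touching overlapping rows queue behind each other instead of each waiting on a row the other already holds.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE t (id integer PRIMARY KEY, val integer NOT NULL DEFAULT 0);
INSERT INTO t (id) SELECT g FROM generate_series(1, 100) AS g;

BEGIN;

-- Lock the target rows in ascending id order first. ORDER BY is applied
-- before the locking step, so locks are requested in that order.
SELECT id
  FROM t
 WHERE id BETWEEN 10 AND 20
 ORDER BY id
   FOR UPDATE;

-- The rows are already locked by this transaction, so the order in which
-- the UPDATE visits them no longer affects lock acquisition.
UPDATE t
   SET val = val + 1
 WHERE id BETWEEN 10 AND 20;

COMMIT;
```

This only helps if all writers agree on the same ordering; it does nothing for the unique-key conflict discussed above.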

        hjp


Re: Order of update

From

Thiemo Kellner

Date:

20 April, 17:03:43

Hm, ordering to prevent deadlocks. Never had that problem. Then again, I mostly have Oracle experience and no need for
complicated updates. If I had, I'd rather think of chunking updates and orchestrating those before ordering within
updates.

Re: Order of update

From

Ron Johnson

Date:

20 April, 17:28:34

On Sun, Apr 20, 2025 at 5:35 AM Thiemo Kellner <thiemo@gelassene-pferde.biz> wrote:

> Very interesting. But is the sort overhead worth it? Why not make the constraint deferrable before the update and switch back afterwards?

The role which runs the UPDATE might not have the priv to ALTER TABLE ... ALTER CONSTRAINT.

Death to <Redacted>, and butter sauce.

Don't boil me, I'm still alive.

<Redacted> lobster!

Re: Order of update

From

Thiemo Kellner

Date:

20 April, 17:52:25

Might that be a feature of or a flaw in the application design? I opt for the latter. Any application that needs updates, be it only in emergency cases, should take that into account.

Re: Order of update

From

Adrian Klaver

Date:

20 April, 18:28:22

On 4/20/25 02:10, Peter J. Holzer wrote:
> I've just read Laurenz' blog post about the differences between Oracle
> and PostgreSQL[1].
> 
> One of the differences is that something like
> 
>      UPDATE tab SET id = id + 1;
> 
> tends to fail on PostgreSQL because the primary key constraint is
> checked for every row, so it will stumble over the temporary conflicts.
> 
> The solution is to define the constraint as deferrable.
> 
> But that got me to thinking about different ways ...
> 
> There won't be a conflict if the ids are updated in descending order.
> Is there a way to force PostgreSQL to update the rows in a specific
> order?
> 
> I came up with
> 
>      with a as (select id from t where id > 50 order by id desc)
>      update t set id = a.id+1 from a where t.id = a.id;
> 
> which works in my simple test case, but it doesn't look like it's
> guaranteed to work. The implicit join in «update t ... from a» could
> produce rows in any order, especially for large tables.

My read of this is that for the duration of the query a temporary table
a is created that is ordered on `id desc` and that '... from a where t.id
= a.id' will apply that order to the selection of t.id.


As an example:

create table id_update(id integer primary key);

insert into id_update select a from generate_series(1, 100000) as t(a);
INSERT 0 100000

-- id(s) are temporarily in order.

update id_update set id = id where id between 50000 and 60000;
UPDATE 10001

-- The above moves the 10001 values to the 'end' of id_update

with a as (select id from id_update where id > 100 order by id desc) 
update id_update as t set id = a.id + 1 from a  where t.id = a.id;
UPDATE 99900

-- The UPDATE works even though the t.id(s) in id_update are not ordered 
-- by id


> 
> So, is there a better way?
> 
>          hjp
> 
> 
> [1] https://www.cybertec-postgresql.com/en/comparison-of-the-transaction-systems-of-oracle-and-postgresql/
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Order of update

From

"Peter J. Holzer"

Date:

21 April, 11:47:17

On 2025-04-20 08:28:22 -0700, Adrian Klaver wrote:
> On 4/20/25 02:10, Peter J. Holzer wrote:
> > I've just read Laurenz' blog post about the differences between Oracle
> > and PostgreSQL[1].
> >
> > One of the differences is that something like
> >
> >      UPDATE tab SET id = id + 1;
> >
> > tends to fail on PostgreSQL because the primary key constraint is
> > checked for every row, so it will stumble over the temporary conflicts.
> >
> > The solution is to define the constraint as deferrable.
> >
> > But that got me to thinking about different ways ...
> >
> > There won't be a conflict if the ids are updated in descending order.
> > Is there a way to force PostgreSQL to update the rows in a specific
> > order?
> >
> > I came up with
> >
> >      with a as (select id from t where id > 50 order by id desc)
> >      update t set id = a.id+1 from a where t.id = a.id;
> >
> > which works in my simple test case, but it doesn't look like it's
> > guaranteed to work. The implicit join in «update t ... from a» could
> > produce rows in any order, especially for large tables.
>
> My read of this

Your read of the query, the PostgreSQL source or the SQL standard?

> is that for the duration of the query a temporary table a is created
> that is ordered on `id desc` and that '... from a where t.id = a.id'
> will apply that order to the selection of t.id.

Yes, that's the intention. And as I wrote, it did work in my simple tests.
But is it guaranteed to work? Is there anything in the standard that
says that the order has to be preserved? Or failing that, is that the
way it's currently implemented and there are reasons to assume that it
will never be changed?


> As an example:
>
> create table id_update(id integer primary key);
>
> insert into id_update select a from generate_series(1, 100000) as t(a);
> INSERT 0 100000
>
> -- id(s) are temporarily in order.
>
> update id_update set id = id where id between 50000 and 60000;
> UPDATE 10001
>
> -- The above moves the 10001 values to the 'end' of id_update
>
> with a as (select id from id_update where id > 100 order by id desc) update
> id_update as t set id = a.id + 1 from a  where t.id = a.id;
> UPDATE 99900

I note that this produces a hash join:

#v+
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Update on id_update t  (cost=3179.42..8662.63 rows=0 width=0)
   ->  Hash Join  (cost=3179.42..8662.63 rows=99899 width=38)
         Hash Cond: (a.id = t.id)
         ->  Subquery Scan on a  (cost=0.42..4971.64 rows=99899 width=32)
               ->  Index Only Scan Backward using id_update_pkey on id_update  (cost=0.42..3972.65 rows=99899 width=4)
                     Index Cond: (id > 100)
         ->  Hash  (cost=1929.00..1929.00 rows=100000 width=10)
               ->  Seq Scan on id_update t  (cost=0.00..1929.00 rows=100000 width=10)
#v-

If the hash was the other way around it wouldn't work.

So let's try if we can get the optimizer to flip the plan by changing
the number of updated rows.

[a few minutes later]

#v+
hjp=> explain
with a as (select id from id_update where id > 90000 order by id desc)
update id_update as t set id = a.id + 1 from a  where a.id = t.id;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Update on id_update t  (cost=732.53..2675.61 rows=0 width=0)
   ->  Hash Join  (cost=732.53..2675.61 rows=10006 width=38)
         Hash Cond: (t.id = a.id)
         ->  Seq Scan on id_update t  (cost=0.00..1443.00 rows=100000 width=10)
         ->  Hash  (cost=607.46..607.46 rows=10006 width=32)
               ->  Subquery Scan on a  (cost=0.29..607.46 rows=10006 width=32)
                     ->  Index Only Scan Backward using id_update_pkey on id_update  (cost=0.29..507.40 rows=10006 width=4)
                           Index Cond: (id > 90000)
#v-

Looks like we got it.

And indeed:

#v+
hjp=> with a as (select id from id_update where id > 90000 order by id desc)
      update id_update as t set id = a.id + 1 from a  where a.id = t.id;
ERROR:  duplicate key value violates unique constraint "id_update_pkey"
DETAIL:  Key (id)=(90002) already exists.
#v-

So, obviously that isn't guaranteed to work.
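
The flip above was provoked by changing the number of rows. As a rough sketch of another way to probe how plan-dependent the statement is, the planner's join methods can be discouraged inside a transaction and the resulting plans compared. This assumes the id_update table from the example; which plan the optimizer actually picks depends on version, statistics and settings, so no output is shown.

```sql
BEGIN;

-- enable_hashjoin and enable_mergejoin are planner settings; SET LOCAL
-- confines the change to this transaction.
SET LOCAL enable_hashjoin = off;
SET LOCAL enable_mergejoin = off;

-- EXPLAIN without ANALYZE only plans the statement; nothing is updated.
EXPLAIN
WITH a AS (SELECT id FROM id_update WHERE id > 90000 ORDER BY id DESC)
UPDATE id_update AS t SET id = a.id + 1 FROM a WHERE a.id = t.id;

ROLLBACK;
```

Any plan under which the update succeeds is still a consequence of the chosen join strategy, not a guarantee.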

        hjp


Re: Order of update

From

Adrian Klaver

Date:

21 April, 18:43:11

On 4/21/25 01:47, Peter J. Holzer wrote:

> 
> If the hash was the other way around it wouldn't work.
> 
> So let's try if we can get the optimizer to flip the plan by changing
> the number of updated rows.
> 
> [a few minutes later]
> 
> #v+
> hjp=> explain
> with a as (select id from id_update where id > 90000 order by id desc)
> update id_update as t set id = a.id + 1 from a  where a.id = t.id;
>                                                          QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------
>  Update on id_update t  (cost=732.53..2675.61 rows=0 width=0)
>    ->  Hash Join  (cost=732.53..2675.61 rows=10006 width=38)
>          Hash Cond: (t.id = a.id)
>          ->  Seq Scan on id_update t  (cost=0.00..1443.00 rows=100000 width=10)
>          ->  Hash  (cost=607.46..607.46 rows=10006 width=32)
>                ->  Subquery Scan on a  (cost=0.29..607.46 rows=10006 width=32)
>                      ->  Index Only Scan Backward using id_update_pkey on id_update  (cost=0.29..507.40 rows=10006 width=4)
>                            Index Cond: (id > 90000)
> #v-
> 
> Looks like we got it.
> 
> And indeed:
> 
> #v+
> hjp=> with a as (select id from id_update where id > 90000 order by id desc)
>        update id_update as t set id = a.id + 1 from a  where a.id = t.id;
> ERROR:  duplicate key value violates unique constraint "id_update_pkey"
> DETAIL:  Key (id)=(90002) already exists.
> #v-
> 
> So, obviously that isn't guaranteed to work.

I read from here:

https://www.postgresql.org/docs/current/sql-update.html

"Use of an ORDER BY clause allows the command to prioritize which rows 
will be updated; it can also prevent deadlock with other update 
operations if they use the same ordering."

I went back to those docs and realized I had missed the FOR UPDATE in 
the example.

explain
with a as (select id from id_update where id > 90000 order by id desc for update)
update id_update as t set id = a.id + 1 from a  where a.id = t.id;
                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Update on id_update t  (cost=3609.71..3856.94 rows=0 width=0)
   CTE a
     ->  LockRows  (cost=0.29..872.71 rows=9840 width=10)
           ->  Index Scan Backward using id_update_pkey on id_update  (cost=0.29..774.31 rows=9840 width=10)
                 Index Cond: (id > 90000)
   ->  Hash Join  (cost=2737.00..2984.23 rows=9840 width=38)
         Hash Cond: (a.id = t.id)
         ->  CTE Scan on a  (cost=0.00..196.80 rows=9840 width=32)
         ->  Hash  (cost=1487.00..1487.00 rows=100000 width=10)
               ->  Seq Scan on id_update t  (cost=0.00..1487.00 rows=100000 width=10)
(10 rows)

and then:

with a as (select id from id_update where id > 90000 order by id desc for update)
update id_update as t set id = a.id + 1 from a  where a.id = t.id;
UPDATE 10000

Though at this point I would agree with you on the no guarantee point.


> 
>          hjp
> 


Re: Order of update

From

Thiemo Kellner

Date:

21 April, 19:12:13

I wonder if that is a corner case. Updating a unique key sounds to me like a design flaw in the first place.

Re: Order of update

From

Adrian Klaver

Date:

21 April, 19:44:20

On 4/21/25 09:12, Thiemo Kellner wrote:
> I wonder if that is a corner case. Updating a unique key sounds to me like a design flaw in the first place.
> 

Check out the thread below for discussion on that topic:

https://www.postgresql.org/message-id/dkbnfi$7g5$1@sea.gmane.org


Re: Order of update

From

Thiemo Kellner

Date:

21 April, 20:58:56

Thanks for the pointer. I feel my doubts reflected. For such reasons, I prefer a UUID as surrogate key. No point in
trying to establish an order or even id arithmetic.
 

21.04.2025 18:44:27 Adrian Klaver <adrian.klaver@aklaver.com>:

> On 4/21/25 09:12, Thiemo Kellner wrote:
>> I wonder if that is a corner case. Updating a unique key sounds to me like a design flaw in the first place.
>> 
> 
> Check out the thread below for discussion on that topic:
> 
> https://www.postgresql.org/message-id/dkbnfi$7g5$1@sea.gmane.org
> 
> -- 
> Adrian Klaver
> adrian.klaver@aklaver.com

Re: Order of update

From

"Peter J. Holzer"

Date:

24 April, 09:26:09

On 2025-04-21 18:12:13 +0200, Thiemo Kellner wrote:
> I wonder if that is a corner case. Updating a unique key sounds to me like a design flaw in the first place.

I agree that changing a surrogate key is almost always a mistake.

But there might be situations where a column should be unique but isn't
an id.

For example, many years ago it was a popular[1] programming pattern to
represent trees as nested ranges (i.e. if two children of a parent had
the ranges (a, b) and (b+1, c) then the parent had (a-1, c+1)).
Insert-operations then need to update those columns. You want an index
on those columns (since you search for them a lot), and you might want
to make it a unique index, since that covers part of the invariant
(although not the complete invariant). If you do that you run into the
update problem.
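
A minimal sketch of that pattern, with hypothetical names (`tree`, `lft`, `rgt`) and deferrable unique constraints as one way around the update problem; the schema and data are assumptions for illustration, not something from the thread.

```sql
-- Each node stores a range (lft, rgt); a parent's range encloses its children's.
CREATE TABLE tree (
    name text PRIMARY KEY,
    lft  integer NOT NULL,
    rgt  integer NOT NULL,
    CONSTRAINT tree_lft_key UNIQUE (lft) DEFERRABLE INITIALLY IMMEDIATE,
    CONSTRAINT tree_rgt_key UNIQUE (rgt) DEFERRABLE INITIALLY IMMEDIATE
);

INSERT INTO tree VALUES
    ('root', 1, 6),
    ('a',    2, 3),
    ('b',    4, 5);

-- Insert a new child 'c' between 'a' and 'b' by opening a gap of width 2
-- after a.rgt = 3. With a plain (non-deferrable) UNIQUE constraint, a shift
-- like this can hit a temporary duplicate whenever an old value + 2 equals
-- another row's not-yet-updated value.
BEGIN;
UPDATE tree SET rgt = rgt + 2 WHERE rgt > 3;
UPDATE tree SET lft = lft + 2 WHERE lft > 3;
INSERT INTO tree VALUES ('c', 4, 5);
COMMIT;
```

After the transaction the ranges are root (1, 8), a (2, 3), c (4, 5) and b (6, 7), which matches the parent/child relation described above.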

There are probably other use-cases. Anything where you need a unique
order which can change, I guess?

Anyway, I don't have a pressing need for this; as I said, I was just
curious.

        hjp

[1] Mostly in MySQL I think, since it didn't have recursive queries of
    any kind.
