Re: a misbehavior of partition row movement (?) - Mailing list pgsql-hackers
From: Amit Langote
Subject: Re: a misbehavior of partition row movement (?)
Date:
Msg-id: CA+HiwqEi9S2_1qcG-v98jPEEoNhs+Goonfz+Cjb4nqwf4CkcpQ@mail.gmail.com
In response to: Re: a misbehavior of partition row movement (?) (Amit Langote <amitlangote09@gmail.com>)
Responses: Re: a misbehavior of partition row movement (?)
List: pgsql-hackers
On Thu, Oct 14, 2021 at 6:00 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Mon, Sep 20, 2021 at 3:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > The problem was that the tuplestore
> > (afterTriggers.query_stack[query_level].tuplestore) that I decided to
> > use to store the AFTER trigger tuples of a partitioned table that is
> > the target of a cross-partition update lives only for the duration of
> > a given query.  So that approach wouldn't work if the foreign key
> > pointing into that partitioned table is marked INITIALLY DEFERRED.  To
> > fix, I added a List field to AfterTriggersData that holds the
> > tuplestores storing the tuples of partitioned tables that undergo
> > cross-partition updates in a transaction and are pointed to by
> > INITIALLY DEFERRED foreign key constraints.  I couldn't understand one
> > comment about why using a tuplestore for such cases *might not* work,
> > which is as follows:
> >
> >  * Note that we need triggers on foreign tables to be fired in exactly the
> >  * order they were queued, so that the tuples come out of the tuplestore in
> >  * the right order.  To ensure that, we forbid deferrable (constraint)
> >  * triggers on foreign tables.  This also ensures that such triggers do not
> >  * get deferred into outer trigger query levels, meaning that it's okay to
> >  * destroy the tuplestore at the end of the query level.
> >
> > I tried to break the approach using various test cases (some can be
> > seen added by the patch to foreign_key.sql), but couldn't see the
> > issue alluded to in the above comment.  So I've marked the comment
> > with an XXX note as follows:
> >
> > - * Note that we need triggers on foreign tables to be fired in exactly the
> > - * order they were queued, so that the tuples come out of the tuplestore in
> > - * the right order.  To ensure that, we forbid deferrable (constraint)
> > - * triggers on foreign tables.  This also ensures that such triggers do not
> > - * get deferred into outer trigger query levels, meaning that it's okay to
> > - * destroy the tuplestore at the end of the query level.
> > + * Note that we need triggers on foreign and partitioned tables to be fired in
> > + * exactly the order they were queued, so that the tuples come out of the
> > + * tuplestore in the right order.  To ensure that, we forbid deferrable
> > + * (constraint) triggers on foreign tables.  This also ensures that such
> > + * triggers do not get deferred into outer trigger query levels, meaning that
> > + * it's okay to destroy the tuplestore at the end of the query level.
> > + * XXX - update this paragraph if the new approach, whereby tuplestores in
> > + * afterTriggers.deferred_tuplestores outlive any given query, can be proven
> > + * to not really break any assumptions mentioned here.
> >
> > If anyone reading this can think of the issue the original comment
> > seems to be talking about, please let me know.
>
> I brought this up in an off-list discussion with Robert, and he asked
> why I hadn't considered not using tuplestores to remember the tuples
> in the first place, specifically pointing out that it may lead to
> unnecessarily consuming a lot of memory when such updates move a bunch
> of rows around.  For example, we could simply remember the tuples to be
> passed to the trigger function using their CTIDs, as is done for normal
> (single-heap-relation) updates, though in this case also remember the
> OIDs of the source and destination partitions of a particular
> cross-partition update.
>
> I had considered that idea before, but I think I had overestimated the
> complexity of that approach, so didn't go with it.  I tried it, and the
> resulting patch doesn't look all that complicated, with the added
> bonus that the bulk update case shouldn't consume as much memory with
> this approach as the previous tuplestore version would have.
>
> Updated patches attached.
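[For readers following along, the kind of case being discussed might look like the sketch below. The table and constraint names here are made up for illustration, not taken from the patch's regression tests.]

```sql
-- A partitioned PK table referenced by an INITIALLY DEFERRED foreign key.
CREATE TABLE p (a int PRIMARY KEY) PARTITION BY RANGE (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES FROM (1) TO (10);
CREATE TABLE p2 PARTITION OF p FOR VALUES FROM (10) TO (20);
CREATE TABLE q (a int REFERENCES p ON UPDATE CASCADE
                DEFERRABLE INITIALLY DEFERRED);
INSERT INTO p VALUES (1);
INSERT INTO q VALUES (1);

BEGIN;
-- A cross-partition update: the row moves from p1 to p2.  Because the
-- RI check is deferred, the AFTER trigger fires only at COMMIT, after
-- the query (and any per-query tuplestore) is gone -- hence the need to
-- remember the affected tuples at the transaction level.
UPDATE p SET a = 11 WHERE a = 1;
COMMIT;
```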
Patch 0001 conflicted with some pg_dump changes that were recently
committed, so rebased.

--
Amit Langote
EDB: http://www.enterprisedb.com