Re: Problem with transition tables on partitioned tables with foreign-table partitions - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Problem with transition tables on partitioned tables with foreign-table partitions
Date
Msg-id CAPmGK16baPbJU78qpH1v7xjUdhr3S5xm_bu4AgiwU3CVRxDkZQ@mail.gmail.com
Whole thread Raw
In response to Re: Problem with transition tables on partitioned tables with foreign-table partitions  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Wed, Jul 2, 2025 at 10:05 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Wed, Jul 2, 2025 at 7:05 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> > On Tue, Jul 1, 2025 at 4:42 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > > On Tue, Jul 1, 2025 at 11:55 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> > > > While working on something else, I noticed that while we disallow
> > > > transition tables on foreign tables, we allow transition tables on
> > > > partitioned tables with foreign-table partitions, which produces
> > > > incorrect results.  Here is an example using postgres_fdw:
> > > >
> > > > create table parent (a text, b int) partition by list (a);
> > > > create table loct (a text, b int);
> > > > create foreign table child (a text, b int)
> > > >   server loopback options (table_name 'loct');
> > > > alter table parent attach partition child for values in ('AAA');
> > > >
> > > > create function dump_insert() returns trigger language plpgsql as
> > > > $$
> > > >   begin
> > > >     raise notice 'trigger = %, new table = %',
> > > >                  TG_NAME,
> > > >                  (select string_agg(new_table::text, ', ' order by a)
> > > > from new_table);
> > > >     return null;
> > > >   end;
> > > > $$;
> > > > create trigger parent_insert_trig
> > > >   after insert on parent referencing new table as new_table
> > > >   for each statement execute procedure dump_insert();
> > > >
> > > > create function intercept_insert() returns trigger language plpgsql as
> > > > $$
> > > >   begin
> > > >     new.b = new.b + 1000;
> > > >     return new;
> > > >   end;
> > > > $$;
> > > > create trigger intercept_insert_loct
> > > >   before insert on loct
> > > >   for each row execute procedure intercept_insert();
> > > >
> > > > insert into parent values ('AAA', 42);
> > > > NOTICE:  trigger = parent_insert_trig, new table = (AAA,42)
> > > > INSERT 0 1
> > > >
> > > > The trigger shows the original tuple created by the core, not the
> > > > actual tuple inserted into the foreign-table partition, as
> > > > postgres_fdw does not collect the actual tuple, of course!
> > >
> > > Maybe I'm missing something, but given that the intercept_insert()
> > > function is applied during the "remote" operation, isn't it expected
> > > that the parent table's trigger for a "local" operation shows the
> > > original tuple?
> >
> > That is the question of how we define the after image of a row
> > inserted into a foreign table, but consider the case where the
> > partition is a plain table:
> >
> > create table parent (a text, b int) partition by list (a);
> > create table child partition of parent for values in ('AAA');
> > create trigger intercept_insert_child
> >   before insert on child
> >   for each row execute procedure intercept_insert();
> > insert into parent values ('AAA', 42);
> > NOTICE:  trigger = parent_insert_trig, new table = (AAA,1042)
> > INSERT 0 1
> >
> > The trigger shows the final tuple, not the original tuple.  So from a
> > consistency perspective, I thought it would be good if the trigger
> > does so even in the case where the partition is a foreign table.
>
> Ok, but if you define the trigger on the foreign table partition
> (child) as follows, you do get what I think is the expected result?
>
> create trigger intercept_insert_foreign_child
>   before insert on child
>   for each row execute procedure intercept_insert();
>
> insert into parent values ('AAA', 42);
> NOTICE:  trigger = parent_insert_trig, new table = (AAA,1042)
>
> -- 2042, because row modified by both triggers
> table parent;
>   a  |  b
> -----+------
>  AAA | 2042
> (1 row)
>
> Or perhaps you're saying that the row returned by this line in ExecInsert():
>
>        /*
>          * insert into foreign table: let the FDW do it
>          */
>         slot = resultRelInfo->ri_FdwRoutine->ExecForeignInsert(estate,
>                                                                resultRelInfo,
>                                                                slot,
>                                                                planSlot);
>
> is not the expected "after image", and thus should not be added to the
> parent's transition table?

Yes, that is what I mean by "postgres_fdw does not collect the actual
tuple, of course!" above.  My explanation was not good, though.

> IIUC, to prevent that, we now hit the following error in:
>
> void
> ExecARInsertTriggers(EState *estate, ResultRelInfo *relinfo,
>                      TupleTableSlot *slot, List *recheckIndexes,
>                      TransitionCaptureState *transition_capture)
> {
>     TriggerDesc *trigdesc = relinfo->ri_TrigDesc;
>
>     if (relinfo->ri_FdwRoutine && transition_capture &&
>         transition_capture->tcs_insert_new_table)
>     {
>         Assert(relinfo->ri_RootResultRelInfo);
>         ereport(ERROR,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("cannot collect transition tuples from child
> foreign tables")));
>     }

Right, I added the check for that.

Thanks!

Best regards,
Etsuro Fujita



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Conflict detection for update_deleted in logical replication
Next
From: Amit Kapila
Date:
Subject: Re: Conflict detection for update_deleted in logical replication