Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers

From jian he
Subject Re: speedup COPY TO for partitioned table.
Date
Msg-id CACJufxEheV10DpjFf+J1OabMgRe6CH+4c6d8ca3Wh1v8Twh3ZA@mail.gmail.com
Whole thread Raw
In response to Re: speedup COPY TO for partitioned table.  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: speedup COPY TO for partitioned table.
Re: speedup COPY TO for partitioned table.
List pgsql-hackers
On Thu, Oct 9, 2025 at 9:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > please check the attached v15.
> >
> Thank you for working on this!  I've reviewed the v15 patch, and here
> are review comments:
>
> ---
> +           children = find_all_inheritors(RelationGetRelid(rel),
> +                                          AccessShareLock,
> +                                          NULL);
> +
> +           foreach_oid(childreloid, children)
> +           {
> +               char         relkind = get_rel_relkind(childreloid);
>
> I think it's better to write some comments summarizing what we're
> doing in the loop.

sure.

>
> ---
> +                   relation_name = get_rel_name(childreloid);
> +                   ereport(ERROR,
> +                           errcode(ERRCODE_WRONG_OBJECT_TYPE),
> +                           errmsg("cannot copy from foreign table
> \"%s\"", relation_name),
> +                           errdetail("Partition \"%s\" is a foreign
> table in the partitioned table \"%s\"",
> +                                     relation_name,
> RelationGetRelationName(rel)),
> +                           errhint("Try the COPY (SELECT ...) TO variant."));
>
> I think we don't need "the" in the error message.
>
> It's conventional to put all err*() macros in parentheses (i.e.,
> "(errcode(), ...)", it's technically omittable though.
>

https://www.postgresql.org/docs/current/error-message-reporting.html
QUOTE:
<<<<>>>>>
The extra parentheses were required before PostgreSQL version 12, but
are now optional.
Here is a more complex example:
.....
<<<<>>>>>

related commit:
https://git.postgresql.org/cgit/postgresql.git/commit/?id=e3a87b4991cc2d00b7a3082abb54c5f12baedfd1
Less parenthesis is generally more readable, I think.


> ---
> +               if (RELKIND_HAS_PARTITIONS(relkind))
> +                   continue;
> +
> +               scan_oids = lappend_oid(scan_oids, childreloid);
>
> find_all_inheritors() returns a list of OIDs of child relations. I
> think we can delete relations whose kind is RELKIND_HAS_PARTITIONS()
> from the list instead of creating a new list scan_oids. Then, we can
> set cstate->partition to the list.
>

yech, we can use foreach_delete_current to delete list elements on the fly.

> ---
>         tupDesc = RelationGetDescr(cstate->rel);
> +       cstate->partitions = list_copy(scan_oids);
>     }
>
> Why do we need to copy the list here?
>

yech, list_copy is not needed.

> ---
> With the patch we have:
>
>    /*
>     * If COPY TO source table is a partitioned table, then open each
>     * partition and process each individual partition.
>     */
>    if (cstate->rel && cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>    {
>        foreach_oid(scan_oid, cstate->partitions)
>        {
>            Relation        scan_rel;
>
>            /* We already got the needed lock in BeginCopyTo */
>            scan_rel = table_open(scan_oid, NoLock);
>            CopyRelTo(cstate, scan_rel, cstate->rel, &processed);
>            table_close(scan_rel, NoLock);
>        }
>    }
>    else if (cstate->rel)
>        CopyRelTo(cstate, cstate->rel, NULL, &processed);
>    else
>    {
>        /* run the plan --- the dest receiver will send tuples */
>
> I think we can refactor the code structure as follow:
>
> if (cstate->rel)
> {
>     if (cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>     {
>         do CopyRelTo() for each OIDs in cstate->partition here.
>     }
>     else
>         CopyRelTo(cstate, cstate->rel, NULL, &processed);
> }
> else
> {
>  ...

sure, this may increase readability.

> +   if (root_rel != NULL)
> +   {
> +       rootdesc = RelationGetDescr(root_rel);
> +       root_slot = table_slot_create(root_rel, NULL);
> +       map = build_attrmap_by_name_if_req(rootdesc, tupdesc, false);
> +   }
>
> rootdesc can be declared inside this if statement or we can directly
> pass 'RelationGetDescr(root_rel)' to build_attrmap_by_name_if_req().
>
sure. good idea.

> ---
> +       /* Deconstruct the tuple ... */
> +       if (map != NULL)
> +           copyslot = execute_attr_map_slot(map, slot, root_slot);
> +       else
> +       {
> +           slot_getallattrs(slot);
> +           copyslot = slot;
> +       }
>
> ISTM that the comment "Deconstruct the tuple" needs to move to before
> slot_getallattrs(slot).
>
ok.

> How about doing "slot = execute_attr_map_slot(map, slot, root_slot);"
> instead? (i.e., no need to have 'copyslot')
>

I tried but it seems not possible.
table_scan_getnextslot function require certain type of "slot", if we do
"slot = execute_attr_map_slot(map, slot, root_slot);"
then pointer "slot" type becomes virtual slot, then
it will fail on second time call table_scan_getnextslot


> ---
> +   if (root_slot != NULL)
> +       ExecDropSingleTupleTableSlot(root_slot);
> +   table_endscan(scandesc);
>
> We might want to pfree 'map' if we create it.
>
ok.


Please check the attached v16.

Attachment

pgsql-hackers by date:

Previous
From: Richard Guo
Date:
Subject: Re: Eager aggregation, take 3
Next
From: Chao Li
Date:
Subject: Re: Reorganize GUC structs