Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers

From Chao Li
Subject Re: speedup COPY TO for partitioned table.
Date
Msg-id F03A6BD2-6971-4C00-A282-2AEED640DC49@gmail.com
Whole thread Raw
In response to Re: speedup COPY TO for partitioned table.  (jian he <jian.universality@gmail.com>)
Responses Re: speedup COPY TO for partitioned table.
List pgsql-hackers
Hi Jian,

Thanks for the patch. After reviewing it, I got a few small comments:

On Oct 9, 2025, at 15:10, jian he <jian.universality@gmail.com> wrote:


Please check the attached v16.
<v16-0001-support-COPY-partitioned_table-TO.patch>


1
```
+ List   *partitions; /* oid list of partition oid for copy to */
```

The comment doesn’t look very good. First, it repeats “oid”; second, as “List *partitions” implies multiple partitions, the comment should use plural OIDs. Maybe change the comment to “/* list of partition OIDs for COPY TO */" 

2
```
+ /*
+ * Collect a list of partitions containing data, so that later
+ * DoCopyTo can copy the data from them.
+ */
+ children = find_all_inheritors(RelationGetRelid(rel),
+   AccessShareLock,
+   NULL);
+
+ foreach_oid(childreloid, children)
+ {
+ char relkind = get_rel_relkind(childreloid);
+
+ if (relkind == RELKIND_FOREIGN_TABLE)
+ {
+ char   *relation_name;
+
+ relation_name = get_rel_name(childreloid);
+ ereport(ERROR,
+ errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot copy from foreign table \"%s\"", relation_name),
+ errdetail("Partition \"%s\" is a foreign table in the partitioned table \"%s\"",
+  relation_name, RelationGetRelationName(rel)),
+ errhint("Try the COPY (SELECT ...) TO variant."));
+ }
+
+ if (RELKIND_HAS_PARTITIONS(relkind))
+ children = foreach_delete_current(children, childreloid);
+ }
```

Is it better to move the RELKIND_HAS_PARTIONS() check to before FOREIGH_TABLE check and continue after foreach_delete_current()? Now every childreloid goes through the both checks, if we do the movement, then HAS_PARTIONS child will go through 1 check. This is a tiny optimization.

3
```
+ if (RELKIND_HAS_PARTITIONS(relkind))
+ children = foreach_delete_current(children, childreloid);
+ }
```

I wonder if there is any specially consideration of using RELKIND_HAS_PARTITIONS() here? Because according to the function comment of find_all_inheritors(), it will only return OIDs of relations; while RELKIND_HAS_PARTITIONS checks for both relations and views. Logically using this macro works, but it may lead to some confusion to code readers.

4
```
@@ -722,6 +754,7 @@ BeginCopyTo(ParseState *pstate,
  DestReceiver *dest;
 
  cstate->rel = NULL;
+ cstate->partitions = NIL;
```

Both NULL assignment are not needed as cstate is allocated by palloc0().  

5
```
+static void
+CopyRelTo(CopyToState cstate, Relation rel, Relation root_rel,
+  uint64 *processed)
```

Instead of using a pointer to pass out processed count, I think it’s better to return the process count. I understand the current implementation allows continuous increment while calling this function in a loop. However, it’s a bit error-prone, a caller must make sure “processed” is well initialized. With returning a unit64, the caller’s code is still simple:

```
processed += CopyRelTo(cstate, …);
``` 

6. In BeginCopyTo(), “children” list is created before “cstate” is created, it is not allocated under “cstate->copycontext”, so in EndCopyTo(), we should also free memory of “cstate->partitions”.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [BUG] temporary file usage report with extended protocol and unnamed portals
Next
From: John Naylor
Date:
Subject: Re: get rid of RM_HEAP2_ID