Home > mailing lists

Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers

From	Chao Li
Subject	Re: speedup COPY TO for partitioned table.
Date	October 10 04:02:26
Msg-id	8316BFA1-D32A-4EDF-A190-A555B98D1EE4@gmail.com Whole thread Raw
In response to	Re: speedup COPY TO for partitioned table. (jian he <jian.universality@gmail.com>)
Responses	Re: speedup COPY TO for partitioned table.
List	pgsql-hackers

Tree view

On Oct 9, 2025, at 22:50, jian he <jian.universality@gmail.com> wrote:
3
```
+ if (RELKIND_HAS_PARTITIONS(relkind))
+ children = foreach_delete_current(children, childreloid);
+ }
```

I wonder if there is any specially consideration of using RELKIND_HAS_PARTITIONS() here? Because according to the function comment of find_all_inheritors(), it will only return OIDs of relations; while RELKIND_HAS_PARTITIONS checks for both relations and views. Logically using this macro works, but it may lead to some confusion to code readers.

find_all_inheritors comments says:
* Returns a list of relation OIDs including the given rel plus
* all relations that inherit from it, directly or indirectly.

CREATE TABLE pp (id int,val int) PARTITION BY RANGE (id);
CREATE TABLE pp_1 (val int, id int) PARTITION BY RANGE (id);
ALTER TABLE pp ATTACH PARTITION pp_1 FOR VALUES FROM (1) TO (5);

If we copy partitioned table "pp" data out, but partitioned table "pp_1"
don't have storage, so we have to skip it, using RELKIND_HAS_PARTITIONS
to skip it should be fine.

My point is that RELKIND_HAS_PARTITIONS is defined as:

#define RELKIND_HAS_PARTITIONS(relkind) \
 ((relkind) == RELKIND_PARTITIONED_TABLE || \
 (relkind) == RELKIND_PARTITIONED_INDEX)

It just checks relkind to be table or index. The example in your explanation seems to not address my concern. Why do we need to check against index?

4
```
@@ -722,6 +754,7 @@ BeginCopyTo(ParseState *pstate,
DestReceiver *dest;

cstate->rel = NULL;
+ cstate->partitions = NIL;
```

Both NULL assignment are not needed as cstate is allocated by palloc0().

I guess this is just a code convention. Such not necessary is quite common
within the codebase.

I don’t agree. cstate has a lot of more fields with pointer types, why don’t set NULL to them?

5
```
+static void
+CopyRelTo(CopyToState cstate, Relation rel, Relation root_rel,
+ uint64 *processed)
```

Instead of using a pointer to pass out processed count, I think it’s better to return the process count. I understand the current implementation allows continuous increment while calling this function in a loop. However, it’s a bit error-prone, a caller must make sure “processed” is well initialized. With returning a unit64, the caller’s code is still simple:

```
processed += CopyRelTo(cstate, …);
```

pgstat_progress_update_param was within CopyRelTo.
so we have to pass (uint64 *processed) to CopyRelTo.
Am I missing something?

Make sense. I didn’t notice postage_progress_update_param. So, “processed” is both input and output. In that case, I think the comment for parameter “processed” should be enhanced, for example:

```

* processed: on entry, contains the current count of processed count;

* this function increments it by the number of rows copied

* from this relation and writes back the updated total.

```

Or a short version:

```

* processed: input/output; cumulative count of tuples processed, incremented here.

```

Best regards,

Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

pgsql-hackers by date:

From: Michael Paquier
Date: 10 October, 03:52:36
Subject: Re: Replace O_EXCL with O_TRUNC for creation of state.tmp in SaveSlotToPath

From: Thomas Munro
Date: 10 October, 04:44:04
Subject: Re: Potential deadlock in pgaio_io_wait()

Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers

Previous

Next