Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers
From | jian he |
---|---|
Subject | Re: speedup COPY TO for partitioned table. |
Date | |
Msg-id | CACJufxF0DuUuOE=3ru1AzAVtt+dex6NdAFF50kjT7Paz0a4-Sg@mail.gmail.com Whole thread Raw |
In response to | Re: speedup COPY TO for partitioned table. (Chao Li <li.evan.chao@gmail.com>) |
Responses |
Re: speedup COPY TO for partitioned table.
|
List | pgsql-hackers |
On Fri, Oct 10, 2025 at 9:02 AM Chao Li <li.evan.chao@gmail.com> wrote: > > > > On Oct 9, 2025, at 22:50, jian he <jian.universality@gmail.com> wrote: > > 3 > ``` > + if (RELKIND_HAS_PARTITIONS(relkind)) > + children = foreach_delete_current(children, childreloid); > + } > ``` > > I wonder if there is any specially consideration of using RELKIND_HAS_PARTITIONS() here? Because according to the functioncomment of find_all_inheritors(), it will only return OIDs of relations; while RELKIND_HAS_PARTITIONS checks forboth relations and views. Logically using this macro works, but it may lead to some confusion to code readers. > > > find_all_inheritors comments says: > * Returns a list of relation OIDs including the given rel plus > * all relations that inherit from it, directly or indirectly. > > CREATE TABLE pp (id int,val int) PARTITION BY RANGE (id); > CREATE TABLE pp_1 (val int, id int) PARTITION BY RANGE (id); > ALTER TABLE pp ATTACH PARTITION pp_1 FOR VALUES FROM (1) TO (5); > > If we copy partitioned table "pp" data out, but partitioned table "pp_1" > don't have storage, so we have to skip it, using RELKIND_HAS_PARTITIONS > to skip it should be fine. > > My point is that RELKIND_HAS_PARTITIONS is defined as: > > #define RELKIND_HAS_PARTITIONS(relkind) \ > ((relkind) == RELKIND_PARTITIONED_TABLE || \ > (relkind) == RELKIND_PARTITIONED_INDEX) > > It just checks relkind to be table or index. The example in your explanation seems to not address my concern. Why do weneed to check against index? > the macro name RELKIND_HAS_PARTITIONS improves the readability, I think. also we don't need to worry about partitioned index here, because we are in the `` else if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) { } `` loop. sure we can change it ``if (relkind == RELKIND_PARTITIONED_TABLE)``. > 5 > ``` > +static void > +CopyRelTo(CopyToState cstate, Relation rel, Relation root_rel, > + uint64 *processed) > ``` > > Instead of using a pointer to pass out processed count, I think it’s better to return the process count. I understand thecurrent implementation allows continuous increment while calling this function in a loop. However, it’s a bit error-prone,a caller must make sure “processed” is well initialized. With returning a unit64, the caller’s code is stillsimple: > > ``` > processed += CopyRelTo(cstate, …); > ``` > > pgstat_progress_update_param was within CopyRelTo. > so we have to pass (uint64 *processed) to CopyRelTo. > Am I missing something? > > > Make sense. I didn’t notice postage_progress_update_param. So, “processed” is both input and output. In that case, I thinkthe comment for parameter “processed” should be enhanced, for example: > if your function is: static processed CopyRelationTo(CopyToState cstate, Relation rel, Relation root_rel, uint64 *processed); where function return value is also passed as function argument, I think it will lead to more confusion.
pgsql-hackers by date: