Thread: Can we rely on the ordering of paths in pathlist?
As the comment above add_path() says, 'The pathlist is kept sorted by
total_cost, with cheaper paths at the front.' And it seems that
get_cheapest_parallel_safe_total_inner() relies on this ordering
(without being mentioned in the comments, which I think we should do).
I'm wondering if this is the right thing to do, as in other places
cheapest total cost is found by compare_path_costs(), which would
consider startup cost if paths have the same total cost, and that seems
more sensible.
Attach a trivial patch to make get_cheapest_parallel_safe_total_inner
act this way.
Thanks
Richard
total_cost, with cheaper paths at the front.' And it seems that
get_cheapest_parallel_safe_total_inner() relies on this ordering
(without being mentioned in the comments, which I think we should do).
I'm wondering if this is the right thing to do, as in other places
cheapest total cost is found by compare_path_costs(), which would
consider startup cost if paths have the same total cost, and that seems
more sensible.
Attach a trivial patch to make get_cheapest_parallel_safe_total_inner
act this way.
Thanks
Richard
Attachment
On Tue, Apr 11, 2023 at 11:03 AM Richard Guo <guofenglinux@gmail.com> wrote:
As the comment above add_path() says, 'The pathlist is kept sorted by
total_cost, with cheaper paths at the front.' And it seems that
get_cheapest_parallel_safe_total_inner() relies on this ordering
(without being mentioned in the comments, which I think we should do).
add_partial_path, we have
* add_partial_path
*
* As in add_path, the partial_pathlist is kept sorted with the cheapest
* total path in front. This is depended on by multiple places, which
* just take the front entry as the cheapest path without searching.
*
.
I'm wondering if this is the right thing to do, as in other places
cheapest total cost is found by compare_path_costs(), which would
consider startup cost if paths have the same total cost, and that seems
more sensible.
Attach a trivial patch to make get_cheapest_parallel_safe_total_inner
act this way.
"consider startup cost if paths have the same total cost", we still can
rely on the ordering and stop iterating when the total_cost becomes
bigger to avoid scanning all the paths in pathlist.
But if you are complaining the function prototype, where is the pathlist
may be not presorted, I think the better way maybe changes it from:
Path *get_cheapest_parallel_safe_total_inner(List *paths) to
Path *get_cheapest_parallel_safe_total_inner(RelOptInfo *rel);
and scan the rel->pathlist in the function body.
Best Regards
Andy Fan
On Tue, Apr 11, 2023 at 11:43 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Tue, Apr 11, 2023 at 11:03 AM Richard Guo <guofenglinux@gmail.com> wrote:As the comment above add_path() says, 'The pathlist is kept sorted by
total_cost, with cheaper paths at the front.' And it seems that
get_cheapest_parallel_safe_total_inner() relies on this ordering
(without being mentioned in the comments, which I think we should do).I think the answer for ${subject} should be yes. Per the comments in
add_partial_path, we have
* add_partial_path
*
* As in add_path, the partial_pathlist is kept sorted with the cheapest
* total path in front. This is depended on by multiple places, which
* just take the front entry as the cheapest path without searching.
*
I'm not sure about this conclusion. Surely we can depend on that the
partial_pathlist is kept sorted by total_cost ASC. This is emphasized
in the comment of add_partial_path, and also leveraged in practice, such
as in many places we just use linitial(rel->partial_pathlist) as the
cheapest partial path.
But get_cheapest_parallel_safe_total_inner works on pathlist not
partial_pathlist. And for pathlist, I'm not sure if it's a good
practice to depend on its ordering. Because
1) the comment of add_path only mentions that add_path_precheck relies
on this ordering, but it does not mention that other functions also do;
2) other than add_path_precheck, I haven't observed any other functions
that rely on this ordering. The only exception as far as I notice is
get_cheapest_parallel_safe_total_inner.
On the other hand, if we declare that we can rely on the pathlist being
sorted in ascending order by total_cost, we should update the comment
for add_path to highlight this aspect. We should also include a comment
for get_cheapest_parallel_safe_total_inner to clarify why an early exit
is possible, similar to what we do for add_path_precheck. Additionally,
in several places, we can optimize our code by taking advantage of this
fact and terminate the iteration through the pathlist early when looking
for the cheapest path of a certain kind.
Thanks
Richard
partial_pathlist is kept sorted by total_cost ASC. This is emphasized
in the comment of add_partial_path, and also leveraged in practice, such
as in many places we just use linitial(rel->partial_pathlist) as the
cheapest partial path.
But get_cheapest_parallel_safe_total_inner works on pathlist not
partial_pathlist. And for pathlist, I'm not sure if it's a good
practice to depend on its ordering. Because
1) the comment of add_path only mentions that add_path_precheck relies
on this ordering, but it does not mention that other functions also do;
2) other than add_path_precheck, I haven't observed any other functions
that rely on this ordering. The only exception as far as I notice is
get_cheapest_parallel_safe_total_inner.
On the other hand, if we declare that we can rely on the pathlist being
sorted in ascending order by total_cost, we should update the comment
for add_path to highlight this aspect. We should also include a comment
for get_cheapest_parallel_safe_total_inner to clarify why an early exit
is possible, similar to what we do for add_path_precheck. Additionally,
in several places, we can optimize our code by taking advantage of this
fact and terminate the iteration through the pathlist early when looking
for the cheapest path of a certain kind.
Thanks
Richard
Hi Richard Guo
Is it necessary to add some comments here?
Is it necessary to add some comments here?
+ if (!innerpath->parallel_safe ||
+ !bms_is_empty(PATH_REQ_OUTER(innerpath)))
+ continue;
+
+ if (matched_path != NULL &&
+ compare_path_costs(matched_path, innerpath, TOTAL_COST) <= 0)
+ continue;
+
+ matched_path = innerpath;
+ !bms_is_empty(PATH_REQ_OUTER(innerpath)))
+ continue;
+
+ if (matched_path != NULL &&
+ compare_path_costs(matched_path, innerpath, TOTAL_COST) <= 0)
+ continue;
+
+ matched_path = innerpath;
Richard Guo <guofenglinux@gmail.com> 于2024年1月10日周三 15:08写道:
On Tue, Apr 11, 2023 at 11:43 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:On Tue, Apr 11, 2023 at 11:03 AM Richard Guo <guofenglinux@gmail.com> wrote:As the comment above add_path() says, 'The pathlist is kept sorted by
total_cost, with cheaper paths at the front.' And it seems that
get_cheapest_parallel_safe_total_inner() relies on this ordering
(without being mentioned in the comments, which I think we should do).I think the answer for ${subject} should be yes. Per the comments in
add_partial_path, we have
* add_partial_path
*
* As in add_path, the partial_pathlist is kept sorted with the cheapest
* total path in front. This is depended on by multiple places, which
* just take the front entry as the cheapest path without searching.
*I'm not sure about this conclusion. Surely we can depend on that the
partial_pathlist is kept sorted by total_cost ASC. This is emphasized
in the comment of add_partial_path, and also leveraged in practice, such
as in many places we just use linitial(rel->partial_pathlist) as the
cheapest partial path.
But get_cheapest_parallel_safe_total_inner works on pathlist not
partial_pathlist. And for pathlist, I'm not sure if it's a good
practice to depend on its ordering. Because
1) the comment of add_path only mentions that add_path_precheck relies
on this ordering, but it does not mention that other functions also do;
2) other than add_path_precheck, I haven't observed any other functions
that rely on this ordering. The only exception as far as I notice is
get_cheapest_parallel_safe_total_inner.
On the other hand, if we declare that we can rely on the pathlist being
sorted in ascending order by total_cost, we should update the comment
for add_path to highlight this aspect. We should also include a comment
for get_cheapest_parallel_safe_total_inner to clarify why an early exit
is possible, similar to what we do for add_path_precheck. Additionally,
in several places, we can optimize our code by taking advantage of this
fact and terminate the iteration through the pathlist early when looking
for the cheapest path of a certain kind.
Thanks
Richard
Hi Richard Guo
Today is the last day of the commitfest cycle.I think this patch should be commented ,Except for the comments, I tested it good to me
Thanks
Today is the last day of the commitfest cycle.I think this patch should be commented ,Except for the comments, I tested it good to me
Thanks
wenhui qiu <qiuwenhuifx@gmail.com> 于2024年7月25日周四 16:18写道:
Hi Richard Guo
Is it necessary to add some comments here?+ if (!innerpath->parallel_safe ||
+ !bms_is_empty(PATH_REQ_OUTER(innerpath)))
+ continue;
+
+ if (matched_path != NULL &&
+ compare_path_costs(matched_path, innerpath, TOTAL_COST) <= 0)
+ continue;
+
+ matched_path = innerpath;Richard Guo <guofenglinux@gmail.com> 于2024年1月10日周三 15:08写道:On Tue, Apr 11, 2023 at 11:43 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:On Tue, Apr 11, 2023 at 11:03 AM Richard Guo <guofenglinux@gmail.com> wrote:As the comment above add_path() says, 'The pathlist is kept sorted by
total_cost, with cheaper paths at the front.' And it seems that
get_cheapest_parallel_safe_total_inner() relies on this ordering
(without being mentioned in the comments, which I think we should do).I think the answer for ${subject} should be yes. Per the comments in
add_partial_path, we have
* add_partial_path
*
* As in add_path, the partial_pathlist is kept sorted with the cheapest
* total path in front. This is depended on by multiple places, which
* just take the front entry as the cheapest path without searching.
*I'm not sure about this conclusion. Surely we can depend on that the
partial_pathlist is kept sorted by total_cost ASC. This is emphasized
in the comment of add_partial_path, and also leveraged in practice, such
as in many places we just use linitial(rel->partial_pathlist) as the
cheapest partial path.
But get_cheapest_parallel_safe_total_inner works on pathlist not
partial_pathlist. And for pathlist, I'm not sure if it's a good
practice to depend on its ordering. Because
1) the comment of add_path only mentions that add_path_precheck relies
on this ordering, but it does not mention that other functions also do;
2) other than add_path_precheck, I haven't observed any other functions
that rely on this ordering. The only exception as far as I notice is
get_cheapest_parallel_safe_total_inner.
On the other hand, if we declare that we can rely on the pathlist being
sorted in ascending order by total_cost, we should update the comment
for add_path to highlight this aspect. We should also include a comment
for get_cheapest_parallel_safe_total_inner to clarify why an early exit
is possible, similar to what we do for add_path_precheck. Additionally,
in several places, we can optimize our code by taking advantage of this
fact and terminate the iteration through the pathlist early when looking
for the cheapest path of a certain kind.
Thanks
Richard
Hi Richard Guo
I tried to changed the comment, can you help me to check if this is correct?Many thanks.
- /*
- * get_cheapest_parallel_safe_total_inner
- * Find the unparameterized parallel-safe path with the least total cost.
- */
+ /* get_cheapest_parallel_safe_total_inner
+ * Skip paths that do not meet the criteria,find the unparameterized parallel-safe path with the least total cost and return NULL if it does not exist.
+ *
+ */
- * get_cheapest_parallel_safe_total_inner
- * Find the unparameterized parallel-safe path with the least total cost.
- */
+ /* get_cheapest_parallel_safe_total_inner
+ * Skip paths that do not meet the criteria,find the unparameterized parallel-safe path with the least total cost and return NULL if it does not exist.
+ *
+ */
Thanks
wenhui qiu <qiuwenhuifx@gmail.com> 于2024年7月31日周三 09:21写道:
Hi Richard Guo
Today is the last day of the commitfest cycle.I think this patch should be commented ,Except for the comments, I tested it good to me
Thankswenhui qiu <qiuwenhuifx@gmail.com> 于2024年7月25日周四 16:18写道:Hi Richard Guo
Is it necessary to add some comments here?+ if (!innerpath->parallel_safe ||
+ !bms_is_empty(PATH_REQ_OUTER(innerpath)))
+ continue;
+
+ if (matched_path != NULL &&
+ compare_path_costs(matched_path, innerpath, TOTAL_COST) <= 0)
+ continue;
+
+ matched_path = innerpath;Richard Guo <guofenglinux@gmail.com> 于2024年1月10日周三 15:08写道:On Tue, Apr 11, 2023 at 11:43 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:On Tue, Apr 11, 2023 at 11:03 AM Richard Guo <guofenglinux@gmail.com> wrote:As the comment above add_path() says, 'The pathlist is kept sorted by
total_cost, with cheaper paths at the front.' And it seems that
get_cheapest_parallel_safe_total_inner() relies on this ordering
(without being mentioned in the comments, which I think we should do).I think the answer for ${subject} should be yes. Per the comments in
add_partial_path, we have
* add_partial_path
*
* As in add_path, the partial_pathlist is kept sorted with the cheapest
* total path in front. This is depended on by multiple places, which
* just take the front entry as the cheapest path without searching.
*I'm not sure about this conclusion. Surely we can depend on that the
partial_pathlist is kept sorted by total_cost ASC. This is emphasized
in the comment of add_partial_path, and also leveraged in practice, such
as in many places we just use linitial(rel->partial_pathlist) as the
cheapest partial path.
But get_cheapest_parallel_safe_total_inner works on pathlist not
partial_pathlist. And for pathlist, I'm not sure if it's a good
practice to depend on its ordering. Because
1) the comment of add_path only mentions that add_path_precheck relies
on this ordering, but it does not mention that other functions also do;
2) other than add_path_precheck, I haven't observed any other functions
that rely on this ordering. The only exception as far as I notice is
get_cheapest_parallel_safe_total_inner.
On the other hand, if we declare that we can rely on the pathlist being
sorted in ascending order by total_cost, we should update the comment
for add_path to highlight this aspect. We should also include a comment
for get_cheapest_parallel_safe_total_inner to clarify why an early exit
is possible, similar to what we do for add_path_precheck. Additionally,
in several places, we can optimize our code by taking advantage of this
fact and terminate the iteration through the pathlist early when looking
for the cheapest path of a certain kind.
Thanks
Richard