Home > mailing lists

Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning - Mailing list pgsql-hackers

From	Ashutosh Bapat
Subject	Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Date	February 25 14:04:28
Msg-id	CAExHW5scMxyFRqOFE6ODmBiW2rnVBEmeEcA-p4W_CyuEikURdA@mail.gmail.com Whole thread Raw
In response to	Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses	Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
List	pgsql-hackers

Tree view

Hi,

On Thu, Feb 20, 2025 at 5:28 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, Feb 4, 2025 at 4:07 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > If we are not interested in saving memory, there is a simpler way to
> > improve planning time by adding a hash table per equivalence class to
> > store the derived clauses, instead of a linked list, when the number
> > of derived clauses is higher than a threshold (say 32 same as the
> > threshold for join_rel_list. Maybe that approach will yield stable
> > planning time.
> >

I implemented the above idea in attached patches. I also added the
following query, inspired from Alvaro's query, to summarise the
results.
with master_avgs as
(select code_tag, num_parts, num_joins, pwj, avg(planning_time_ms)
avg_pt, stddev(planning_time_ms) stddev_pt
from msmts where code_tag = 'master'
group by code_tag, num_parts, num_joins, pwj),
patched_avgs as
(select code_tag, num_parts, num_joins, pwj, avg(planning_time_ms)
avg_pt, stddev(planning_time_ms) stddev_pt
from msmts where code_tag = 'patched'
group by code_tag, num_parts, num_joins, pwj)
select num_parts,
num_joins,
format('s=%s%% md=%s%% pd=%s%%',
((m.avg_pt - p.avg_pt)/m.avg_pt * 100)::numeric(6, 2),
(m.stddev_pt/m.avg_pt * 100)::numeric(6, 2),
(p.stddev_pt/p.avg_pt * 100)::numeric(6, 2))
from master_avgs m join patched_avgs p using (num_parts, num_joins,
pwj) where not pwj order by 1, 2, 3;
\crosstabview 1 2 3

not pwj in the last line should be changed to pwj to get results with
enable_partitionwise_join = true.

With the attached patches, I observe following results

With PWJ disabled
 num_parts |               2               |              3
   |             4              |              5

-----------+-------------------------------+------------------------------+----------------------------+-----------------------------
         0 | s=-4.44% md=17.91% pd=23.05%  | s=-0.83% md=11.10%
pd=19.58% | s=0.87% md=4.04% pd=7.91%  | s=-35.24% md=7.63% pd=9.69%
        10 | s=30.13% md=118.18% pd=37.44% | s=-3.49% md=0.58%
pd=0.49%   | s=-0.83% md=0.29% pd=0.35% | s=-0.24% md=0.35% pd=0.32%
       100 | s=1.94% md=13.19% pd=4.08%    | s=-0.27% md=0.18%
pd=0.44%   | s=7.04% md=3.05% pd=3.11%  | s=12.75% md=1.69% pd=0.81%
       500 | s=4.39% md=1.71% pd=1.33%     | s=10.17% md=1.28%
pd=1.90%   | s=23.04% md=0.24% pd=0.58% | s=30.87% md=0.30% pd=1.11%
      1000 | s=4.27% md=1.21% pd=1.97%     | s=13.97% md=0.44%
pd=0.79%   | s=24.05% md=0.63% pd=1.02% | s=30.77% md=0.77% pd=0.17%

Each cell is a triple (s, md, pd) where s is improvement in planning
time using the patches in % as compared to the master (higher the
better), md = standard deviation as % of the average planning time on
master, pd = is standard deviation as % of the average planning time
with patches.

With PWJ enabled
 num_parts |              2               |              3
  |              4              |              5

-----------+------------------------------+------------------------------+-----------------------------+------------------------------
         0 | s=-94.25% md=6.98% pd=56.03% | s=44.10% md=141.13%
pd=9.32% | s=42.71% md=46.00% pd=6.55% | s=-26.12% md=6.72% pd=15.20%
        10 | s=-25.89% md=4.29% pd=63.75% | s=-1.34% md=3.15% pd=3.26%
  | s=0.31% md=4.13% pd=4.34%   | s=-1.34% md=3.10% pd=6.73%
       100 | s=-2.83% md=0.94% pd=1.31%   | s=-2.17% md=4.57% pd=4.41%
  | s=0.98% md=1.59% pd=1.81%   | s=1.87% md=1.10% pd=0.79%
       500 | s=1.57% md=3.01% pd=1.70%    | s=6.99% md=1.58% pd=1.68%
  | s=11.11% md=0.24% pd=0.62%  | s=11.65% md=0.18% pd=0.90%
      1000 | s=3.59% md=0.98% pd=1.78%    | s=10.83% md=0.88% pd=0.46%
  | s=15.62% md=0.46% pd=0.13%  | s=16.38% md=0.63% pd=0.29%

Same numbers measured for previous set of patches [1], which improves
both memory consumption as well as planning time.

With PWJ disabled
 num_parts |               2               |              3
   |              4               |               5

-----------+-------------------------------+------------------------------+------------------------------+--------------------------------
         0 | s=4.68% md=18.17% pd=22.09%   | s=-2.54% md=12.00%
pd=13.81% | s=-2.02% md=3.84% pd=4.43%   | s=-69.14% md=11.06%
pd=126.87%
        10 | s=-24.85% md=20.42% pd=35.69% | s=-4.31% md=0.73%
pd=1.53%   | s=-14.97% md=0.32% pd=31.90% | s=-0.57% md=0.79% pd=0.50%
       100 | s=0.27% md=4.69% pd=1.55%     | s=4.16% md=0.29% pd=0.18%
   | s=11.76% md=0.85% pd=0.49%   | s=15.76% md=1.64% pd=2.32%
       500 | s=0.54% md=1.88% pd=1.81%     | s=9.36% md=1.17% pd=0.87%
   | s=21.45% md=0.74% pd=0.88%   | s=30.47% md=0.17% pd=1.17%
      1000 | s=3.22% md=1.36% pd=0.99%     | s=14.74% md=0.86%
pd=0.44%   | s=24.50% md=0.36% pd=0.31%   | s=27.97% md=0.27% pd=0.25%

With PWJ enabled
 num_parts |              2              |             3
|             4              |              5

-----------+-----------------------------+----------------------------+----------------------------+-----------------------------
         0 | s=11.07% md=19.28% pd=8.70% | s=-1.18% md=5.88% pd=4.31%
| s=-2.25% md=8.42% pd=3.77% | s=25.07% md=11.48% pd=3.87%
        10 | s=-9.07% md=2.65% pd=14.58% | s=0.55% md=3.10% pd=3.41%
| s=3.89% md=3.94% pd=3.79%  | s=7.25% md=2.87% pd=3.03%
       100 | s=-4.53% md=0.49% pd=8.53%  | s=2.24% md=4.24% pd=3.96%
| s=6.70% md=1.30% pd=2.08%  | s=9.09% md=1.39% pd=1.50%
       500 | s=-1.65% md=1.59% pd=1.44%  | s=6.31% md=0.89% pd=1.11%
| s=12.72% md=0.20% pd=0.29% | s=15.02% md=0.28% pd=0.83%
      1000 | s=1.53% md=1.01% pd=1.66%   | s=11.80% md=0.66% pd=0.71%
| s=16.23% md=0.58% pd=0.18% | s=17.16% md=0.67% pd=0.68%

There are a few things to notice
1. There is not much difference in the planning time improvement by
both the patchsets. But the patchset attached to [1] improves memory
consumption as well. So it looks more attractive.
2. The performance regressions usually coincide with higher standard
deviation. This indicates that both the performance gains or losses
seen at lower numbers of partitions and joins are not real and
possibly ignored. I have run the script multiple times but some or
other combination of lower number of partitions and lower number of
joins shows higher deviation and thus unstable results. I have not be
able to find a way where all the combinations show a stable result.

I think the first patch in the attached set is worth committing, just
to tighten things up.

[1] https://www.postgresql.org/message-id/CAExHW5t6NpGaif6ZO_P+L1cPZk27+Ye2LxfqRVXf0OiSsW9WSg@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat

Attachment

pgsql-hackers by date:

From: Amit Kapila
Date: 25 February, 13:42:10
Subject: Re: long-standing data loss bug in initial sync of logical replication

From: John Naylor
Date: 25 February, 14:05:17
Subject: Re: Improve CRC32C performance on SSE4.2

Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning - Mailing list pgsql-hackers

Attachment

Previous

Next