Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning |
Date | |
Msg-id | CAExHW5scMxyFRqOFE6ODmBiW2rnVBEmeEcA-p4W_CyuEikURdA@mail.gmail.com Whole thread Raw |
In response to | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
Responses |
Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
|
List | pgsql-hackers |
Hi, On Thu, Feb 20, 2025 at 5:28 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Tue, Feb 4, 2025 at 4:07 PM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > > > > If we are not interested in saving memory, there is a simpler way to > > improve planning time by adding a hash table per equivalence class to > > store the derived clauses, instead of a linked list, when the number > > of derived clauses is higher than a threshold (say 32 same as the > > threshold for join_rel_list. Maybe that approach will yield stable > > planning time. > > I implemented the above idea in attached patches. I also added the following query, inspired from Alvaro's query, to summarise the results. with master_avgs as (select code_tag, num_parts, num_joins, pwj, avg(planning_time_ms) avg_pt, stddev(planning_time_ms) stddev_pt from msmts where code_tag = 'master' group by code_tag, num_parts, num_joins, pwj), patched_avgs as (select code_tag, num_parts, num_joins, pwj, avg(planning_time_ms) avg_pt, stddev(planning_time_ms) stddev_pt from msmts where code_tag = 'patched' group by code_tag, num_parts, num_joins, pwj) select num_parts, num_joins, format('s=%s%% md=%s%% pd=%s%%', ((m.avg_pt - p.avg_pt)/m.avg_pt * 100)::numeric(6, 2), (m.stddev_pt/m.avg_pt * 100)::numeric(6, 2), (p.stddev_pt/p.avg_pt * 100)::numeric(6, 2)) from master_avgs m join patched_avgs p using (num_parts, num_joins, pwj) where not pwj order by 1, 2, 3; \crosstabview 1 2 3 not pwj in the last line should be changed to pwj to get results with enable_partitionwise_join = true. With the attached patches, I observe following results With PWJ disabled num_parts | 2 | 3 | 4 | 5 -----------+-------------------------------+------------------------------+----------------------------+----------------------------- 0 | s=-4.44% md=17.91% pd=23.05% | s=-0.83% md=11.10% pd=19.58% | s=0.87% md=4.04% pd=7.91% | s=-35.24% md=7.63% pd=9.69% 10 | s=30.13% md=118.18% pd=37.44% | s=-3.49% md=0.58% pd=0.49% | s=-0.83% md=0.29% pd=0.35% | s=-0.24% md=0.35% pd=0.32% 100 | s=1.94% md=13.19% pd=4.08% | s=-0.27% md=0.18% pd=0.44% | s=7.04% md=3.05% pd=3.11% | s=12.75% md=1.69% pd=0.81% 500 | s=4.39% md=1.71% pd=1.33% | s=10.17% md=1.28% pd=1.90% | s=23.04% md=0.24% pd=0.58% | s=30.87% md=0.30% pd=1.11% 1000 | s=4.27% md=1.21% pd=1.97% | s=13.97% md=0.44% pd=0.79% | s=24.05% md=0.63% pd=1.02% | s=30.77% md=0.77% pd=0.17% Each cell is a triple (s, md, pd) where s is improvement in planning time using the patches in % as compared to the master (higher the better), md = standard deviation as % of the average planning time on master, pd = is standard deviation as % of the average planning time with patches. With PWJ enabled num_parts | 2 | 3 | 4 | 5 -----------+------------------------------+------------------------------+-----------------------------+------------------------------ 0 | s=-94.25% md=6.98% pd=56.03% | s=44.10% md=141.13% pd=9.32% | s=42.71% md=46.00% pd=6.55% | s=-26.12% md=6.72% pd=15.20% 10 | s=-25.89% md=4.29% pd=63.75% | s=-1.34% md=3.15% pd=3.26% | s=0.31% md=4.13% pd=4.34% | s=-1.34% md=3.10% pd=6.73% 100 | s=-2.83% md=0.94% pd=1.31% | s=-2.17% md=4.57% pd=4.41% | s=0.98% md=1.59% pd=1.81% | s=1.87% md=1.10% pd=0.79% 500 | s=1.57% md=3.01% pd=1.70% | s=6.99% md=1.58% pd=1.68% | s=11.11% md=0.24% pd=0.62% | s=11.65% md=0.18% pd=0.90% 1000 | s=3.59% md=0.98% pd=1.78% | s=10.83% md=0.88% pd=0.46% | s=15.62% md=0.46% pd=0.13% | s=16.38% md=0.63% pd=0.29% Same numbers measured for previous set of patches [1], which improves both memory consumption as well as planning time. With PWJ disabled num_parts | 2 | 3 | 4 | 5 -----------+-------------------------------+------------------------------+------------------------------+-------------------------------- 0 | s=4.68% md=18.17% pd=22.09% | s=-2.54% md=12.00% pd=13.81% | s=-2.02% md=3.84% pd=4.43% | s=-69.14% md=11.06% pd=126.87% 10 | s=-24.85% md=20.42% pd=35.69% | s=-4.31% md=0.73% pd=1.53% | s=-14.97% md=0.32% pd=31.90% | s=-0.57% md=0.79% pd=0.50% 100 | s=0.27% md=4.69% pd=1.55% | s=4.16% md=0.29% pd=0.18% | s=11.76% md=0.85% pd=0.49% | s=15.76% md=1.64% pd=2.32% 500 | s=0.54% md=1.88% pd=1.81% | s=9.36% md=1.17% pd=0.87% | s=21.45% md=0.74% pd=0.88% | s=30.47% md=0.17% pd=1.17% 1000 | s=3.22% md=1.36% pd=0.99% | s=14.74% md=0.86% pd=0.44% | s=24.50% md=0.36% pd=0.31% | s=27.97% md=0.27% pd=0.25% With PWJ enabled num_parts | 2 | 3 | 4 | 5 -----------+-----------------------------+----------------------------+----------------------------+----------------------------- 0 | s=11.07% md=19.28% pd=8.70% | s=-1.18% md=5.88% pd=4.31% | s=-2.25% md=8.42% pd=3.77% | s=25.07% md=11.48% pd=3.87% 10 | s=-9.07% md=2.65% pd=14.58% | s=0.55% md=3.10% pd=3.41% | s=3.89% md=3.94% pd=3.79% | s=7.25% md=2.87% pd=3.03% 100 | s=-4.53% md=0.49% pd=8.53% | s=2.24% md=4.24% pd=3.96% | s=6.70% md=1.30% pd=2.08% | s=9.09% md=1.39% pd=1.50% 500 | s=-1.65% md=1.59% pd=1.44% | s=6.31% md=0.89% pd=1.11% | s=12.72% md=0.20% pd=0.29% | s=15.02% md=0.28% pd=0.83% 1000 | s=1.53% md=1.01% pd=1.66% | s=11.80% md=0.66% pd=0.71% | s=16.23% md=0.58% pd=0.18% | s=17.16% md=0.67% pd=0.68% There are a few things to notice 1. There is not much difference in the planning time improvement by both the patchsets. But the patchset attached to [1] improves memory consumption as well. So it looks more attractive. 2. The performance regressions usually coincide with higher standard deviation. This indicates that both the performance gains or losses seen at lower numbers of partitions and joins are not real and possibly ignored. I have run the script multiple times but some or other combination of lower number of partitions and lower number of joins shows higher deviation and thus unstable results. I have not be able to find a way where all the combinations show a stable result. I think the first patch in the attached set is worth committing, just to tighten things up. [1] https://www.postgresql.org/message-id/CAExHW5t6NpGaif6ZO_P+L1cPZk27+Ye2LxfqRVXf0OiSsW9WSg@mail.gmail.com -- Best Wishes, Ashutosh Bapat
Attachment
pgsql-hackers by date: