Re: Use merge-based matching for MCVs in eqjoinsel - Mailing list pgsql-hackers
From | Ilia Evdokimov |
---|---|
Subject | Re: Use merge-based matching for MCVs in eqjoinsel |
Date | |
Msg-id | 16ce0876-cd6b-4136-9ce9-21381de9c916@tantorlabs.com Whole thread Raw |
In response to | Re: Use merge-based matching for MCVs in eqjoinsel (David Geier <geidav.pg@gmail.com>) |
Responses |
Re: Use merge-based matching for MCVs in eqjoinsel
|
List | pgsql-hackers |
To evaluate it, I ran benchmarks on JOB with three variants: $ ./benchmark.sh master $ ./benchmark.sh merge $ ./benchmark.sh hash I compared total planning time across all 113 queries.Was this running with optimizations? How did you extract the planning time?
I save all query plans using EXPLAIN SUMMARY, then go through all the plans, read the 'Planning Time' for each, and sum them up.
I would have expected the delta between the "merge" and "hash" variant to be bigger, especially for default_statistics_target=10000. My small test also showed that. Any idea why this is not showing in your results?
So would I. With default_statistics_target = 10000 and the selectivity in the JOB queries being close to zero, the difference should be noticeable. I can only explain the previous results by cache-related effects on my machine.
I reran the benchmark on a clean cluster and collected the top slowest JOB queries — now the effect is clearly visible.
Merge (sum of all JOB queries)
==================
default_statistics_target | Planner Speedup (×) | Planner Before (ms) | Planner After (ms)
--------------------------------------------------------------------------------
100 | 1.00 | 1888.105 | 1879.431
1000 | 1.14 | 2282.239 | 2009.114
2500 | 2.10 | 5595.030 | 2668.530
5000 | 5.56 | 18544.933 | 3333.252
7500 | 9.17 | 37390.956 | 4076.390
10000 | 16.10 | 69319.479 | 4306.417
HashMap (sum of all JOB queries)
==================
default_statistics_target | Planner Speedup (×) | Planner Before (ms) | Planner After (ms)
--------------------------------------------------------------------------------
100 | 1.03 | 1888.105 | 1828.088
1000 | 1.18 | 2282.239 | 1939.884
2500 | 2.64 | 5595.030 | 2117.872
5000 | 7.80 | 18544.933 | 2377.206
7500 | 13.80 | 37390.956 | 2709.973
10000 | 23.32 | 69319.479 | 2973.073
Top 10 slowest JOB queries (default_statistics_target = 10000)
Query | master (ms) | merge (ms) | Hash (ms)
------+-------------+------------+-----------
29c | 1904.586 | 144.135 | 100.473
29b | 1881.392 | 117.891 | 89.028
29a | 1868.805 | 112.242 | 83.913
31c | 1867.234 | 76.498 | 56.140
30c | 1646.630 | 88.494 | 62.549
30b | 1608.820 | 84.821 | 64.603
31a | 1573.964 | 75.978 | 56.140
28a | 1457.738 | 95.939 | 77.309
28b | 1455.052 | 99.383 | 73.065
30a | 1416.699 | 91.057 | 62.549
BTW, the hashmap from your patch could also be applied to eqjoinsel_semi() function.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com
pgsql-hackers by date: