Re: apply_scanjoin_target_to_paths and partitionwise join - Mailing list pgsql-hackers
From | Jakub Wartak |
---|---|
Subject | Re: apply_scanjoin_target_to_paths and partitionwise join |
Date | |
Msg-id | CAKZiRmyaFFvxyEYGG_hu0F-EVEcqcnveH23MULhW6UY_jwykGw@mail.gmail.com Whole thread Raw |
In response to | Re: apply_scanjoin_target_to_paths and partitionwise join (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
Responses |
Re: apply_scanjoin_target_to_paths and partitionwise join
|
List | pgsql-hackers |
Hi Ashutosh & hackers, On Mon, Apr 15, 2024 at 9:00 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > Here's patch with > [..] > Adding to the next commitfest but better to consider this for the next set of minor releases. 1. The patch does not pass cfbot - https://cirrus-ci.com/task/5486258451906560 on master due to test failure "not ok 206 + partition_join" 2. Without the patch applied, the result of the meson test on master was clean (no failures , so master is fine). After applying patch there were expected some hunk failures (as the patch was created for 15_STABLE): patching file src/backend/optimizer/plan/planner.c Hunk #1 succeeded at 7567 (offset 468 lines). Hunk #2 succeeded at 7593 (offset 468 lines). patching file src/test/regress/expected/partition_join.out Hunk #1 succeeded at 4777 (offset 56 lines). Hunk #2 succeeded at 4867 (offset 56 lines). patching file src/test/regress/sql/partition_join.sql Hunk #1 succeeded at 1136 (offset 1 line). 3. Without patch there is performance regression/bug on master (cost is higher with enable_partitionwise_join=on that without it): data preparation: -- Test the process_outer_partition() code path CREATE TABLE plt1_adv (a int, b int, c text) PARTITION BY LIST (c); CREATE TABLE plt1_adv_p1 PARTITION OF plt1_adv FOR VALUES IN ('0000', '0001', '0002'); CREATE TABLE plt1_adv_p2 PARTITION OF plt1_adv FOR VALUES IN ('0003', '0004'); INSERT INTO plt1_adv SELECT i, i, to_char(i % 5, 'FM0000') FROM generate_series(0, 24) i; ANALYZE plt1_adv; CREATE TABLE plt2_adv (a int, b int, c text) PARTITION BY LIST (c); CREATE TABLE plt2_adv_p1 PARTITION OF plt2_adv FOR VALUES IN ('0002'); CREATE TABLE plt2_adv_p2 PARTITION OF plt2_adv FOR VALUES IN ('0003', '0004'); INSERT INTO plt2_adv SELECT i, i, to_char(i % 5, 'FM0000') FROM generate_series(0, 24) i WHERE i % 5 IN (2, 3, 4); ANALYZE plt2_adv; CREATE TABLE plt3_adv (a int, b int, c text) PARTITION BY LIST (c); CREATE TABLE plt3_adv_p1 PARTITION OF plt3_adv FOR VALUES IN ('0001'); CREATE TABLE plt3_adv_p2 PARTITION OF plt3_adv FOR VALUES IN ('0003', '0004'); INSERT INTO plt3_adv SELECT i, i, to_char(i % 5, 'FM0000') FROM generate_series(0, 24) i WHERE i % 5 IN (1, 3, 4); ANALYZE plt3_adv; off: EXPLAIN SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a; QUERY PLAN ----------------------------------------------------------------------------------------------- Sort (cost=22.02..22.58 rows=223 width=27) Sort Key: t1.c, t1.a, t2.a, t3.a -> Hash Full Join (cost=4.83..13.33 rows=223 width=27) [..] with enable_partitionwise_join=ON (see the jump from cost 22.02 -> 27.65): EXPLAIN SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a; QUERY PLAN ----------------------------------------------------------------------------------------------- Sort (cost=27.65..28.37 rows=289 width=27) Sort Key: t1.c, t1.a, t2.a, t3.a -> Append (cost=2.23..15.83 rows=289 width=27) -> Hash Full Join (cost=2.23..4.81 rows=41 width=27) [..] -> Hash Full Join (cost=2.45..9.57 rows=248 width=27) [..] However with the patch applied the plan with minimal cost is always chosen ("22"): explain SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE coalesce(t1.a, 0 ) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a; QUERY PLAN ----------------------------------------------------------------------------------------------- Sort (cost=22.02..22.58 rows=223 width=27) Sort Key: t1.c, t1.a, t2.a, t3.a -> Hash Full Join (cost=4.83..13.33 rows=223 width=27) [..] set enable_partitionwise_join to on; explain SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE coalesce(t1.a, 0 ) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a; QUERY PLAN ----------------------------------------------------------------------------------------------- Sort (cost=22.02..22.58 rows=223 width=27) Sort Key: t1.c, t1.a, t2.a, t3.a -> Hash Full Join (cost=4.83..13.33 rows=223 width=27) [..] with the patch applied, the minimal cost (with toggle on or off) the cost always stays the minimal from the available ones. We cannot provide a reproducer for real performance regression, but for the affected customer it took 530+s (with enable_partitionwise_join=on) and without that GUC it it was ~23s. 4. meson test ends up with failures like below: 4/290 postgresql:regress / regress/regress ERROR 32.67s 6/290 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade ERROR 56.96s 35/290 postgresql:recovery / recovery/027_stream_regress ERROR 40.20s (all due to "regression tests pass" failures) the partition_join.sql is failing for test 206, so for this: -- partitionwise join with fractional paths CREATE TABLE fract_t (id BIGINT, PRIMARY KEY (id)) PARTITION BY RANGE (id); CREATE TABLE fract_t0 PARTITION OF fract_t FOR VALUES FROM ('0') TO ('1000'); CREATE TABLE fract_t1 PARTITION OF fract_t FOR VALUES FROM ('1000') TO ('2000'); -- insert data INSERT INTO fract_t (id) (SELECT generate_series(0, 1999)); ANALYZE fract_t; -- verify plan; nested index only scans SET max_parallel_workers_per_gather = 0; SET enable_partitionwise_join = on; the testsuite was expecting the below with enable_partitionwise_join = on; EXPLAIN (COSTS OFF) SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10; QUERY PLAN ----------------------------------------------------------------------- Limit -> Merge Append Sort Key: x.id -> Merge Left Join Merge Cond: (x_1.id = y_1.id) -> Index Only Scan using fract_t0_pkey on fract_t0 x_1 -> Index Only Scan using fract_t0_pkey on fract_t0 y_1 -> Merge Left Join Merge Cond: (x_2.id = y_2.id) -> Index Only Scan using fract_t1_pkey on fract_t1 x_2 -> Index Only Scan using fract_t1_pkey on fract_t1 y_2 but actually with patch it gets this (here with costs): EXPLAIN (COSTS) SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Limit (cost=1.10..2.21 rows=10 width=16) -> Merge Left Join (cost=1.10..223.10 rows=2000 width=16) Merge Cond: (x.id = y.id) -> Append (cost=0.55..96.55 rows=2000 width=8) [..] -> Append (cost=0.55..96.55 rows=2000 width=8) [..] if you run it without patch and again with enable_partitionwise_join=on: EXPLAIN SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Limit (cost=1.11..2.22 rows=10 width=16) -> Merge Append (cost=1.11..223.11 rows=2000 width=16) Sort Key: x.id -> Merge Left Join (cost=0.55..101.55 rows=1000 width=16) [..] -> Merge Left Join (cost=0.55..101.55 rows=1000 width=16) [..] So with the patch that SQL does not use partitionwise join as it finds it more optimal to stick to a plan with cost of "1.10..2.21" instead of "1.11..2.22" (w/ partition_join), nitpicking but still a failure technically. Perhaps it could be even removed? (it's pretty close to noise?). -J.
pgsql-hackers by date: