Re: [HACKERS] advanced partition matching algorithm forpartition-wise join - Mailing list pgsql-hackers
From | Mark Dilger |
---|---|
Subject | Re: [HACKERS] advanced partition matching algorithm forpartition-wise join |
Date | |
Msg-id | 4F6838E1-0FE8-4B92-B126-809866A064DE@enterprisedb.com Whole thread Raw |
In response to | Re: [HACKERS] advanced partition matching algorithm forpartition-wise join (Etsuro Fujita <etsuro.fujita@gmail.com>) |
Responses |
Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
|
List | pgsql-hackers |
> On Tue, Dec 10, 2019 at 7:30 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote: >> I don't see any issues in the latest version, but I think we >> need to polish the patch, so I'll do that. > > I noticed some issues. :-( I think we should address it before > polishing the patch. One thing I noticed is: the patch heavily > modifies the existing test cases in partition_join.sql to test the new > partition-matching algorithm, but I think we should leave those test > cases alone because we would handle the exiting test cases (except one > negative test case) as before (see the try_partitionwise_join() > change in the patch), so those test cases would be still needed to > test that. Attached is a proposed patch for that > (v30-0001-Improve-partition-matching-for-partitionwise-join.patch) > that 1) avoids modifying the existing test cases and 2) adds a > slightly modified version of the test cases proposed in the previous > patch to test the new algorithm. Though I omitted some test cases > that seem redundant to me and added a bit more test cases involving > NULL partitions and/or default partitions. The elapsed time to run > the partition_join.sql regression test increased from 741 ms (HEAD) to > 1086 ms in my environment, but I think that would be acceptable. I > fixed one white space issue, but other than that, no code/comment > changes. > > Another thing I noticed while working on the above is: the patch fails > to apply PWJ to this case: > > CREATE TABLE plt1_ad (a int, b int, c text) PARTITION BY LIST (c); > CREATE TABLE plt1_ad_p1 PARTITION OF plt1_ad FOR VALUES IN ('0001', '0003'); > CREATE TABLE plt1_ad_p2 PARTITION OF plt1_ad FOR VALUES IN ('0004', '0006'); > CREATE TABLE plt1_ad_p3 PARTITION OF plt1_ad FOR VALUES IN ('0008', '0009'); > CREATE TABLE plt1_ad_extra PARTITION OF plt1_ad FOR VALUES IN (NULL); > INSERT INTO plt1_ad SELECT i, i, to_char(i % 10, 'FM0000') FROM > generate_series(1, 299) i WHERE i % 10 NOT IN (0, 2, 5, 7); > INSERT INTO plt1_ad VALUES (-1, -1, NULL); > ANALYZE plt1_ad; > CREATE TABLE plt2_ad (a int, b int, c text) PARTITION BY LIST (c); > CREATE TABLE plt2_ad_p1 PARTITION OF plt2_ad FOR VALUES IN ('0002', '0003'); > CREATE TABLE plt2_ad_p2 PARTITION OF plt2_ad FOR VALUES IN ('0004', '0006'); > CREATE TABLE plt2_ad_p3 PARTITION OF plt2_ad FOR VALUES IN ('0007', '0009'); > CREATE TABLE plt2_ad_extra PARTITION OF plt2_ad FOR VALUES IN (NULL); > INSERT INTO plt2_ad SELECT i, i, to_char(i % 10, 'FM0000') FROM > generate_series(1, 299) i WHERE i % 10 NOT IN (0, 1, 5, 8); > INSERT INTO plt2_ad VALUES (-1, -1, NULL); > ANALYZE plt2_ad; > > EXPLAIN (COSTS OFF) > SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_ad t1 LEFT JOIN plt2_ad t2 ON > (t1.a = t2.a AND t1.c = t2.c) WHERE t1.b < 10 ORDER BY t1.a; > QUERY PLAN > -------------------------------------------------------- > Sort > Sort Key: t1.a > -> Hash Right Join > Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c)) > -> Append > -> Seq Scan on plt2_ad_p1 t2_1 > -> Seq Scan on plt2_ad_p2 t2_2 > -> Seq Scan on plt2_ad_p3 t2_3 > -> Seq Scan on plt2_ad_extra t2_4 > -> Hash > -> Append > -> Seq Scan on plt1_ad_p1 t1_1 > Filter: (b < 10) > -> Seq Scan on plt1_ad_p2 t1_2 > Filter: (b < 10) > -> Seq Scan on plt1_ad_p3 t1_3 > Filter: (b < 10) > -> Seq Scan on plt1_ad_extra t1_4 > Filter: (b < 10) > (19 rows) > > because merge_null_partitions() does not consider matching the NULL > partitions from both sides, but matches the NULL partition on the > plt1_ad side and a dummy partition, resulting in a non-PWJ plan (see > [1]). I overlooked this case when modifying that function. :-( > Another patch attached to fix this issue > (v30-0002-Fix-handling-of-NULL-partitions.patch). (We would not need > to fix this, if we could handle the case where a dummy partition is on > the nullable side of an outer join [1], but we can't, so I think it > would be a good idea at least for now to match the NULL partitions > from both sides to do PWJ.) > > Best regards, > Etsuro Fujita > > [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7ad6498fd5a654de6e743814c36cf619a3b5ddb6 > > <v30-0001-Improve-partition-matching-for-partitionwise-join.patch><v30-0002-Fix-handling-of-NULL-partitions.patch> Fujita-san, With respect to these two patches: They apply, compile, and pass all the regression tests. The code looks reasonable. There is stray whitespace in v30-0002: src/backend/partitioning/partbounds.c:4557: space before tab in indent. + outer_null_unmerged = true; I have added tests checking correctness and showing some partition pruning limitations. Find three patches, attached. The v31-0001-… patch merely applies your patches as a starting point for the next two patches. It is your work, not mine. The v31-0002-… patch adds more regression tests for range partitioning. The commit message contains my comments about that. The v31-0003-… patch adds more regression tests for list partitioning, and again, the commit message contains my commentsabout that. I hope this review is helpful. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: