Clamping result row number of joins. - Mailing list pgsql-hackers

From:      Kyotaro HORIGUCHI
Subject:   Clamping result row number of joins.
Msg-id:    20150306.173341.188111041.horiguchi.kyotaro@lab.ntt.co.jp
Responses: Re: Clamping result row number of joins.
List:      pgsql-hackers
Hello, I had a report that a query gets weird estimated row numbers, which makes the subsequent plans wrong. A detailed explanation is given later in this mail, but the problem is that a query like the following produces a path with an apparently wrong row number.

EXPLAIN SELECT a FROM (SELECT a FROM t1 UNION ALL SELECT 0) t
  JOIN (VALUES (1)) tt(x) ON tt.x = t.a;

0> Nested Loop  (cost=0.43..8.51 rows=68732 width=4)
1>   ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
2>   ->  Append  (cost=0.43..8.48 rows=2 width=4)
3>         ->  Index Only Scan using i_t1_a on t1
4>               (cost=0.43..8.46 rows=1 width=4)
5>               Index Cond: (a = "*VALUES*".column1)
6>         ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=4)
7>               Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?")
8>               ->  Result  (cost=0.00..0.01 rows=1 width=0)

The Nested Loop at line 0 says that it emits 68732 rows even though the underlying nodes, the Values at line 1 and the Append at line 2, give 1 and 2 rows respectively as their results. The query actually returns only 1 row. The estimate is plainly wrong and causes the plans above it to go wrong.

This is caused by the Append node, which doesn't have statistics, so eqjoinsel takes 0.5 as the default selectivity for the nested loop. Even though it is difficult to get statistics for an Append node, the nested loop could get a saner row number if it didn't ignore the parameterized path selected for the scan on t1.

I suppose that a join node can safely clamp the number of its result rows to the product of the row numbers of its two underlying paths. If that is correct, the attached patch corrects the estimation like this:

> Nested Loop  (cost=0.43..16.51 rows=2 width=4)
>   ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
>   ->  Append  (cost=0.43..16.48 rows=2 width=4)
>         ->  Index Only Scan using i_t1_a on t1
>               (cost=0.43..16.45 rows=1 width=4)
>               Index Cond: (a = "*VALUES*".column1)
>         ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=4)
>               Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?")
>               ->  Result  (cost=0.00..0.01 rows=1 width=0)

Is this correct? Does it have any bad side effects? And is it applicable to 9.5?

The detailed test environment is described below. Before this fix, the top-level join hashes its inner relation because the underlying join claims it will yield 68732 rows (though it actually yields only 1). After this fix, the outer path declares that it yields 2 rows, so the top-level join becomes a nested loop and its inner path becomes an index scan.
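To make the proposal concrete, here is a minimal sketch of the clamping idea in C. It is only an illustration under my own naming, not the attached patch: clamp_join_rows() is a hypothetical helper, while Path->rows and clamp_row_est() are the existing planner field and helper it relies on.

#include "postgres.h"
#include "nodes/relation.h"		/* Path (9.x header name) */
#include "optimizer/cost.h"		/* clamp_row_est() */

/*
 * Illustrative sketch only -- not the attached patch.  A join can never
 * emit more rows than the cross product of its two input paths, so the
 * estimate coming out of the selectivity machinery can be capped at
 * outer_rows * inner_rows.
 */
static double
clamp_join_rows(double estimated_rows, Path *outer_path, Path *inner_path)
{
	double		cross_product = outer_path->rows * inner_path->rows;

	/* Cap the estimate at the cross product of the inputs. */
	if (estimated_rows > cross_product)
		estimated_rows = cross_product;

	/* Apply the usual sanity floor of at least one row. */
	return clamp_row_est(estimated_rows);
}

Applied to the plan above, the cap is 1 (Values) * 2 (Append) = 2 rows, which is what the corrected plan reports instead of 68732.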
=======
CREATE TABLE t1 (a int);
INSERT INTO t1 (SELECT (a%2)*500000 + a / 2 FROM generate_series(0, 1000000) a);
CREATE INDEX i_t1_a ON t1 (a);
CREATE TABLE t2 AS SELECT * from t1;
ANALYZE t1;
ANALYZE t2;

-- emulating the problematic situation
SET join_collapse_limit TO 1;
SET random_page_cost TO 8.0;
SET seq_page_cost TO 0.5;

EXPLAIN SELECT tt2.a FROM
  (SELECT a FROM
     (SELECT a FROM t1 UNION ALL SELECT 0) t
     JOIN (VALUES (1)) tt(x) ON tt.x = t.a) tt2
  JOIN t2 ON (tt2.a = t2.a);
========
The plan before fix

Hash Join  (cost=30832.46..36230.60 rows=68732 width=4) (actual time=1258.880..1260.519 rows=1 loops=1)
  Hash Cond: ("*VALUES*".column1 = t2.a)
  ->  Nested Loop  (cost=0.43..8.51 rows=68732 width=8) (actual time=0.071..0.104 rows=1 loops=1)
        ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=1)
        ->  Append  (cost=0.43..8.48 rows=2 width=4) (actual time=0.063..0.093 rows=1 loops=1)
              ->  Index Only Scan using i_t1_a on t1  (cost=0.43..8.46 rows=1 width=4) (actual time=0.059..0.075 rows=1 loops=1)
                    Index Cond: (a = "*VALUES*".column1)
                    Heap Fetches: 2
              ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=4) (actual time=0.010..0.010 rows=0 loops=1)
                    Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?")
                    Rows Removed by Filter: 1
                    ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
  ->  Hash  (cost=14425.01..14425.01 rows=1000001 width=4) (actual time=1250.274..1250.274 rows=1000001 loops=1)
        Buckets: 131072  Batches: 16  Memory Usage: 3227kB
        ->  Seq Scan on t2  (cost=0.00..14425.01 rows=1000001 width=4) (actual time=0.021..608.245 rows=1000001 loops=1)
Planning time: 0.319 ms
Execution time: 1260.792 ms
========
The plan after fix

Nested Loop  (cost=0.86..17.43 rows=2 width=4) (actual time=0.103..0.118 rows=1 loops=1)
  Join Filter: ("*VALUES*".column1 = t2.a)
  ->  Nested Loop  (cost=0.43..16.51 rows=2 width=8) (actual time=0.062..0.073 rows=1 loops=1)
        ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=1)
        ->  Append  (cost=0.43..16.48 rows=2 width=4) (actual time=0.051..0.060 rows=1 loops=1)
              ->  Index Only Scan using i_t1_a on t1  (cost=0.43..16.45 rows=1 width=4) (actual time=0.045..0.046 rows=1 loops=1)
                    Index Cond: (a = "*VALUES*".column1)
                    Heap Fetches: 1
              ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=1)
                    Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?")
                    Rows Removed by Filter: 1
                    ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
  ->  Index Only Scan using i_t2_a on t2  (cost=0.42..0.45 rows=1 width=4) (actual time=0.026..0.027 rows=1 loops=1)
        Index Cond: (a = t1.a)
        Heap Fetches: 1
Planning time: 0.968 ms
Execution time: 0.239 ms
=========

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center