Clamping result row number of joins. - Mailing list pgsql-hackers

From:      Kyotaro HORIGUCHI
Subject:   Clamping result row number of joins.
Msg-id:    20150306.173341.188111041.horiguchi.kyotaro@lab.ntt.co.jp
Responses: Re: Clamping result row number of joins.
List:      pgsql-hackers
Hello, I had a report that a query gets weird estimated row numbers, which makes the subsequent plans wrong. A detailed explanation is given later in this mail, but the problem is that a query like the following produces a path with an apparently wrong row number.

EXPLAIN SELECT a FROM (SELECT a FROM t1 UNION ALL SELECT 0) t
  JOIN (VALUES (1)) tt(x) ON tt.x = t.a;

0> Nested Loop  (cost=0.43..8.51 rows=68732 width=4)
1>   ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
2>   ->  Append  (cost=0.43..8.48 rows=2 width=4)
3>         ->  Index Only Scan using i_t1_a on t1
4>               (cost=0.43..8.46 rows=1 width=4)
5>               Index Cond: (a = "*VALUES*".column1)
6>         ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=4)
7>               Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?")
8>               ->  Result  (cost=0.00..0.01 rows=1 width=0)

The Nested Loop at line 0 says that it emits 68732 rows even though the underlying nodes, the Values at line 1 and the Append at line 2, give 1 and 2 rows respectively as their results. The query actually returns only 1 row. The estimate is plainly wrong and causes the plans above it to go wrong.

This is caused by the Append node, which doesn't have statistics, so eqjoinsel takes 0.5 as the default selectivity for the nested loop. Even though it is difficult to get statistics for an Append node, the nested loop could get a saner row number if it didn't ignore the parameterized path selected for the scan on t1.

I suppose that a join node can safely clamp the number of its result rows to the product of the row numbers of its two underlying paths. If that is correct, the attached patch corrects the estimation like this:

> Nested Loop  (cost=0.43..16.51 rows=2 width=4)
>   ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
>   ->  Append  (cost=0.43..16.48 rows=2 width=4)
>         ->  Index Only Scan using i_t1_a on t1
>               (cost=0.43..16.45 rows=1 width=4)
>               Index Cond: (a = "*VALUES*".column1)
>         ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=4)
>               Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?")
>               ->  Result  (cost=0.00..0.01 rows=1 width=0)

Is this correct? Does it have any bad side effects? And is it applicable to 9.5?

The detailed test environment is described below. Before this fix, the top-level join hashes its inner relation because the underlying join claims it will yield 68732 rows (though it actually yields only 1). After this fix, the outer path declares that it yields 2 rows, so the top-level join becomes a nested loop and its inner path becomes an index scan.
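To make the proposal concrete, here is a minimal sketch of the clamping idea in C. It is only an illustration under my own naming, not the attached patch: clamp_join_rows() is a hypothetical helper, while Path->rows and clamp_row_est() are the existing planner field and helper it relies on.

#include "postgres.h"
#include "nodes/relation.h"		/* Path (9.x header name) */
#include "optimizer/cost.h"		/* clamp_row_est() */

/*
 * Illustrative sketch only -- not the attached patch.  A join can never
 * emit more rows than the cross product of its two input paths, so the
 * estimate coming out of the selectivity machinery can be capped at
 * outer_rows * inner_rows.
 */
static double
clamp_join_rows(double estimated_rows, Path *outer_path, Path *inner_path)
{
	double		cross_product = outer_path->rows * inner_path->rows;

	/* Cap the estimate at the cross product of the inputs. */
	if (estimated_rows > cross_product)
		estimated_rows = cross_product;

	/* Apply the usual sanity floor of at least one row. */
	return clamp_row_est(estimated_rows);
}

Applied to the plan above, the cap is 1 (Values) * 2 (Append) = 2 rows, which is what the corrected plan reports instead of 68732.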
=======
CREATE TABLE t1 (a int);
INSERT INTO t1 (SELECT (a%2)*500000 + a / 2 FROM generate_series(0, 1000000) a);
CREATE INDEX i_t1_a ON t1 (a);
CREATE TABLE t2 AS SELECT * from t1;
ANALYZE t1;
ANALYZE t2;

-- emulating the problematic situation
SET join_collapse_limit TO 1;
SET random_page_cost TO 8.0;
SET seq_page_cost TO 0.5;

EXPLAIN SELECT tt2.a FROM
  (SELECT a FROM
     (SELECT a FROM t1 UNION ALL SELECT 0) t
     JOIN (VALUES (1)) tt(x) ON tt.x = t.a) tt2
  JOIN t2 ON (tt2.a = t2.a);
========
The plan before fix

Hash Join  (cost=30832.46..36230.60 rows=68732 width=4) (actual time=1258.880..1260.519 rows=1 loops=1)
  Hash Cond: ("*VALUES*".column1 = t2.a)
  ->  Nested Loop  (cost=0.43..8.51 rows=68732 width=8) (actual time=0.071..0.104 rows=1 loops=1)
        ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=1)
        ->  Append  (cost=0.43..8.48 rows=2 width=4) (actual time=0.063..0.093 rows=1 loops=1)
              ->  Index Only Scan using i_t1_a on t1  (cost=0.43..8.46 rows=1 width=4) (actual time=0.059..0.075 rows=1 loops=1)
                    Index Cond: (a = "*VALUES*".column1)
                    Heap Fetches: 2
              ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=4) (actual time=0.010..0.010 rows=0 loops=1)
                    Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?")
                    Rows Removed by Filter: 1
                    ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
  ->  Hash  (cost=14425.01..14425.01 rows=1000001 width=4) (actual time=1250.274..1250.274 rows=1000001 loops=1)
        Buckets: 131072  Batches: 16  Memory Usage: 3227kB
        ->  Seq Scan on t2  (cost=0.00..14425.01 rows=1000001 width=4) (actual time=0.021..608.245 rows=1000001 loops=1)
Planning time: 0.319 ms
Execution time: 1260.792 ms
========
The plan after fix

Nested Loop  (cost=0.86..17.43 rows=2 width=4) (actual time=0.103..0.118 rows=1 loops=1)
  Join Filter: ("*VALUES*".column1 = t2.a)
  ->  Nested Loop  (cost=0.43..16.51 rows=2 width=8) (actual time=0.062..0.073 rows=1 loops=1)
        ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=1)
        ->  Append  (cost=0.43..16.48 rows=2 width=4) (actual time=0.051..0.060 rows=1 loops=1)
              ->  Index Only Scan using i_t1_a on t1  (cost=0.43..16.45 rows=1 width=4) (actual time=0.045..0.046 rows=1 loops=1)
                    Index Cond: (a = "*VALUES*".column1)
                    Heap Fetches: 1
              ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=1)
                    Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?")
                    Rows Removed by Filter: 1
                    ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
  ->  Index Only Scan using i_t2_a on t2  (cost=0.42..0.45 rows=1 width=4) (actual time=0.026..0.027 rows=1 loops=1)
        Index Cond: (a = t1.a)
        Heap Fetches: 1
Planning time: 0.968 ms
Execution time: 0.239 ms
=========

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center