Thread: [GENERAL] hash join performance question
Hi

testdb3=# \d csischema.dim_company;
          Table "csischema.dim_company"
     Column      |            Type             | Modifiers
-----------------+-----------------------------+-----------
 company_id      | integer                     | not null
 company_name    | character varying(100)      |
 city            | character varying(100)      |
 state           | character varying(100)      |
 postal_code     | character varying(100)      |
 country         | character varying(100)      |
 latitude        | double precision            |
 longitude       | double precision            |
 update_datetime | timestamp without time zone |
 company_source  | character varying(1)        |
Indexes:
    "dim_company_pkey" PRIMARY KEY, btree (company_id)

testdb3=# \d woc.dim_company;
             Table "woc.dim_company"
     Column      |            Type             | Modifiers
-----------------+-----------------------------+-----------
 company_id      | integer                     | not null
 company_name    | character varying(100)      |
 city            | character varying(100)      |
 state           | character varying(100)      |
 postal_code     | character varying(100)      |
 country         | character varying(100)      |
 latitude        | double precision            |
 longitude       | double precision            |
 update_datetime | timestamp without time zone |
 company_source  | character varying(1)        |
Indexes:
    "dim_company_pkey" PRIMARY KEY, btree (company_id)

testdb3=# select count(*) from csischema.dim_company;
  count
---------
 1786376
(1 row)

testdb3=# select count(*) from woc.dim_company;
 count
-------
 18980
(1 row)

woc.dim_company is a subset of csischema.dim_company, meaning all company_id values from woc.dim_company are present in csischema.dim_company. The ratio is around 1%.

SELECT a.company_id
FROM csischema.dim_company a, woc.dim_company b
WHERE a.company_id = b.company_id;

testdb3=# explain analyze SELECT a.company_id FROM csischema.dim_company a, woc.dim_company b
testdb3-# WHERE a.company_id = b.company_id;
                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=711.05..54938.35 rows=18980 width=4) (actual time=34.067..1118.603 rows=18980 loops=1)
   Hash Cond: (a.company_id = b.company_id)
   ->  Seq Scan on dim_company a  (cost=0.00..47097.82 rows=1850582 width=4) (actual time=0.013..523.249 rows=1786376 loops=1)
   ->  Hash  (cost=473.80..473.80 rows=18980 width=4) (actual time=20.203..20.203 rows=18980 loops=1)
         Buckets: 32768  Batches: 1  Memory Usage: 924kB
         ->  Seq Scan on dim_company b  (cost=0.00..473.80 rows=18980 width=4) (actual time=0.007..10.076 rows=18980 loops=1)
 Planning time: 0.511 ms
 Execution time: 1121.068 ms
(8 rows)

I was expecting at least the PK of csischema.dim_company to be used. In another DBMS that was the case: the larger table, csischema.dim_company, used the PK.

Any hints or thoughts on what I am not seeing?

Thank you
Armand
armand pirvu <armand.pirvu@gmail.com> writes:
> testdb3=# explain analyze SELECT a.company_id FROM csischema.dim_company a, woc.dim_company b
> testdb3-# WHERE a.company_id = b.company_id;
>                                                            QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=711.05..54938.35 rows=18980 width=4) (actual time=34.067..1118.603 rows=18980 loops=1)
>    Hash Cond: (a.company_id = b.company_id)
>    ->  Seq Scan on dim_company a  (cost=0.00..47097.82 rows=1850582 width=4) (actual time=0.013..523.249 rows=1786376 loops=1)
>    ->  Hash  (cost=473.80..473.80 rows=18980 width=4) (actual time=20.203..20.203 rows=18980 loops=1)
>          Buckets: 32768  Batches: 1  Memory Usage: 924kB
>          ->  Seq Scan on dim_company b  (cost=0.00..473.80 rows=18980 width=4) (actual time=0.007..10.076 rows=18980 loops=1)
>  Planning time: 0.511 ms
>  Execution time: 1121.068 ms
> (8 rows)

> I was expecting at least the PK of csischema.dim_company to be used. In another DBMS that was the case: the larger table, csischema.dim_company, used the PK.

That looks like a perfectly reasonable plan to me.  If you think it isn't,
perhaps because you're assuming that both tables are fully cached in RAM,
then you should reduce random_page_cost to teach the planner that that's
the execution scenario you're expecting.  Everything always in RAM would
correspond to random_page_cost = 1, and some rough calculations suggest
that that would reduce the estimated cost of a
nestloop-with-inner-indexscan enough to make the planner choose that way.

			regards, tom lane
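[For readers following along, a minimal sketch of trying this suggestion, assuming the same testdb3 session and tables as above; the use of SET LOCAL inside a transaction is illustrative and not taken from the original posts:]

-- Lower random_page_cost for just this transaction to model a fully cached
-- database, then compare the plan the planner chooses.
BEGIN;
SET LOCAL random_page_cost = 1;   -- default is 4; 1 says random reads cost the same as sequential reads
EXPLAIN ANALYZE
SELECT a.company_id
FROM csischema.dim_company a, woc.dim_company b
WHERE a.company_id = b.company_id;
ROLLBACK;

-- To make the setting persistent for one database rather than one transaction:
-- ALTER DATABASE testdb3 SET random_page_cost = 1;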
> On Jul 18, 2017, at 10:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> armand pirvu <armand.pirvu@gmail.com> writes:
>> testdb3=# explain analyze SELECT a.company_id FROM csischema.dim_company a, woc.dim_company b
>> testdb3-# WHERE a.company_id = b.company_id;
>>                                                            QUERY PLAN
>> -------------------------------------------------------------------------------------------------------------------------------
>>  Hash Join  (cost=711.05..54938.35 rows=18980 width=4) (actual time=34.067..1118.603 rows=18980 loops=1)
>>    Hash Cond: (a.company_id = b.company_id)
>>    ->  Seq Scan on dim_company a  (cost=0.00..47097.82 rows=1850582 width=4) (actual time=0.013..523.249 rows=1786376 loops=1)
>>    ->  Hash  (cost=473.80..473.80 rows=18980 width=4) (actual time=20.203..20.203 rows=18980 loops=1)
>>          Buckets: 32768  Batches: 1  Memory Usage: 924kB
>>          ->  Seq Scan on dim_company b  (cost=0.00..473.80 rows=18980 width=4) (actual time=0.007..10.076 rows=18980 loops=1)
>>  Planning time: 0.511 ms
>>  Execution time: 1121.068 ms
>>  (8 rows)
>
>> I was expecting at least the PK of csischema.dim_company to be used. In another DBMS that was the case: the larger table, csischema.dim_company, used the PK.
>
> That looks like a perfectly reasonable plan to me.  If you think it isn't,
> perhaps because you're assuming that both tables are fully cached in RAM,
> then you should reduce random_page_cost to teach the planner that that's
> the execution scenario you're expecting.  Everything always in RAM would
> correspond to random_page_cost = 1, and some rough calculations suggest
> that that would reduce the estimated cost of a
> nestloop-with-inner-indexscan enough to make the planner choose that way.
>
> 			regards, tom lane

Thank you Tom

I did a bit of reading about the random_page_cost value, and I understand not all optimizers are equal. But, for example, in the Ingres world:

                    K Join(col1)
                    Heap
                    Pages 57 Tups 18981
                    D696 C1139
                  /                \
   Proj-rest                        $tk1
   Sorted(col1)                     I(a)
   Pages 76 Tups 18981              B-Tree(col1)
   D25 C190                         Pages 2140 Tups 426435
                  /
   $tk2
   I(b)
   B-Tree(NU)
   Pages 98 Tups 18981

ds8(armandp):/u1/sys_admin/armandp> time sql -uirs testdb <foo.sql > /dev/null

real    0m0.37s
user    0m0.04s
sys     0m0.01s

And that is pretty constant, whether the pages are in the cache or not.

More importantly, IMHO, rather than scanning the smaller table, I just scan its PK, which is an index at the end of the day, and then join that with the larger table's PK.

Now granted, I have hash joins disabled on Ingres, so I am not sure this is a true apples-to-apples comparison, and that is what made me raise the question.

I would like to know why in Postgres the smaller table gets scanned as opposed to using its PK. After all, one column is far less expensive to traverse top to bottom than all columns.

Thank you
Armand
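[For comparison, a sketch of how one might reproduce the Ingres-style plan in Postgres for testing purposes, assuming the same tables; enable_hashjoin and enable_mergejoin are planner debugging knobs to flip in a test session, not settings to leave off in production:]

-- Push the planner toward a nested loop with an inner index scan on
-- dim_company_pkey by disabling the join methods it currently prefers,
-- then inspect the plan and timing.
SET enable_hashjoin = off;
SET enable_mergejoin = off;

EXPLAIN ANALYZE
SELECT a.company_id
FROM csischema.dim_company a, woc.dim_company b
WHERE a.company_id = b.company_id;

-- Restore the defaults afterwards.
RESET enable_hashjoin;
RESET enable_mergejoin;

-- Note on "scanning all columns": the Seq Scan on b reads whole heap pages,
-- but width=4 in the plan shows only the join key is carried up to the join.
-- An index-only scan on a PK can skip the heap entirely only when the
-- visibility map is up to date, e.g. after VACUUM on the table.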