Superfluous merge/sort - Mailing list pgsql-performance
From | Anuradha Ratnaweera |
---|---|
Subject | Superfluous merge/sort |
Date | |
Msg-id | 20030225041143.GA4886@aratnaweera.virtusa.com Whole thread Raw |
Responses |
Re: Superfluous merge/sort
|
List | pgsql-performance |
Question in brief: does the planner/optimizer take into account the foreign key constraints? If the answer is "no", please stop reading here. Here comes the details. Let me give a simple example to illustrate the situation. 1. We have two tables t1 and t2. create table t1 ( id integer primary key, dummy integer ); create table t2 ( id integer, dummy integer ); 2. We create indexes on all the non-pkey fields. create index t1_dummy_idx on t1(dummy); create index t2_id_idx on t2(id); create index t2_dummy_idx on t2(dummy); 3. We make t2(id) a foreign key of t1(id). alter table t2 add constraint t2_fkey foreign key (id) references t1(id); 4. Populate "t1" with unique "id"s from 0 to 19999 with a dummy value. copy "t1" from stdin; 0 654 1 86097 ... 19998 93716 19999 9106 \. 5. Populate "t2" with 50000 "id"s with a normal distribution. copy "t2" from stdin; 8017 98659 11825 5946 ... 8202 35994 8436 19729 \. Now we are ready to go ... - First query is to find the "dummy" values with highest frequency. => explain select dummy from t2 group by dummy order by count(*) desc limit 10; NOTICE: QUERY PLAN: Limit (cost=2303.19..2303.19 rows=10 width=4) -> Sort (cost=2303.19..2303.19 rows=5000 width=4) -> Aggregate (cost=0.00..1996.00 rows=5000 width=4) -> Group (cost=0.00..1871.00 rows=50000 width=4) -> Index Scan using t2_dummy_idx on t2 (cost=0.00..1746.00 rows=50000 width=4) EXPLAIN - Second query is esseitially the same, but we do a merge with "t1" on the foreign key (just for the sake of illustrating the point). => explain select t2.dummy from t1, t2 where t1.id = t2.id group by t2.dummy order by count(*) desc limit 10; NOTICE: QUERY PLAN: Limit (cost=7643.60..7643.60 rows=10 width=12) -> Sort (cost=7643.60..7643.60 rows=5000 width=12) -> Aggregate (cost=7086.41..7336.41 rows=5000 width=12) -> Group (cost=7086.41..7211.41 rows=50000 width=12) -> Sort (cost=7086.41..7086.41 rows=50000 width=12) -> Merge Join (cost=0.00..3184.00 rows=50000 width=12) -> Index Scan using t1_pkey on t1 (cost=0.00..638.00 rows=20000 width=4) -> Index Scan using t2_id_idx on t2 (cost=0.00..1746.00 rows=50000 width=8) EXPLAIN Does this mean that the planner/optimizer doesn't take into account the foreign key constraint? Anuradha -- Debian GNU/Linux (kernel 2.4.21-pre4) Don't worry about the world coming to an end today. It's already tomorrow in Australia. -- Charles Schulz
pgsql-performance by date: