Explain explained - Mailing list pgsql-novice
From | Markus Stocker |
---|---|
Subject | Explain explained |
Date | |
Msg-id | a9dfaf710803040653u14d98a0eic19821bc0f3d7443@mail.gmail.com Whole thread Raw |
Responses |
Re: Explain explained
Re: Explain explained numeric definition advice |
List | pgsql-novice |
Hi, I'm trying to figure out how to interpret the query plan explanation returned by postgresql. There are a couple things that I can't explain to myself, so help is appreciated. This is the query plan returned by the query I'm inspecting. Nested Loop (cost=0.00..450.39 rows=1 width=128) -> Nested Loop (cost=0.00..444.71 rows=1 width=140) -> Nested Loop (cost=0.00..443.72 rows=1 width=148) -> Nested Loop (cost=0.00..442.40 rows=1 width=80) -> Nested Loop (cost=0.00..440.28 rows=1 width=72) -> Nested Loop (cost=0.00..438.97 rows=1 width=4) -> Seq Scan on individual_name name_0 (cost=0.00..430.68 rows=1 width=4) Filter: (name = 'http://www.University0.edu'::text) -> Index Scan using object_role_assertion_b_role_idx on object_role_assertion ora_0 (cost=0.00..8.28 rows=1 width=8) Index Cond: ((name_0.id = ora_0.b) AND (ora_0.object_role = 69)) -> Index Scan using individual_name_pkey on individual_name name_1 (cost=0.00..1.30 rows=1 width=68) Index Cond: (name_1.id = ora_0.a) -> Index Scan using object_role_assertion_b_role_idx on object_role_assertion ora_1 (cost=0.00..2.10 rows=1 width=8) Index Cond: ((ora_0.a = ora_1.b) AND (ora_1.object_role = 44)) -> Index Scan using individual_name_pkey on individual_name name_2 (cost=0.00..1.30 rows=1 width=68) Index Cond: (name_2.id = ora_1.a) -> Index Scan using concept_assertion_pkey on concept_assertion ca_0 (cost=0.00..0.98 rows=1 width=4) Index Cond: ((ca_0.concept = 5) AND (ca_0.individual = ora_1.a)) -> Index Scan using concept_assertion_pkey on concept_assertion ca_1 (cost=0.00..5.67 rows=1 width=4) Index Cond: ((ca_1.concept = 20) AND (ora_0.a = ca_1.individual)) Now, the first question is, how to read this. My understanding is that the plan contains some nested loops and the first constrain is on the table individual_name by filtering the name attribute with the value http://www.University0.edu. This is done by a sequential scan. Next we have an index scan on object_role_assertion using the object_role_assertion_b_role_idx. The result set returned by the sequential and index scan is then joined with the index scan on individual_name using individual_name_pkey with the constrain name_1.id = ora_0.a ... and so on. If this sequence is correct, I'm wondering about the following: 1/ How does postgresql know that the constrain individual_name.name = 'http://www.University0.edu' matches one row (which is in fact true)? I'm aware that databases rely on statistics of data distribution but it doesn't seem possible to me to exactly know the statistics about each possible value for attributes. Further, why expects postgresql for each operation a row size of 1? 2/ Sequential scans seem to me more expensive compared to index scans. I'm wondering why the sequential scan on individual_name is the first executed in the plan. Having or not having an index on individual_name.name doesn't change anything to the query plan. Why? 3/ There is one condition in the query, i.e. concept_assertion.concept = 5 with an empty result set, i.e. selectivity 0. In fact, the last index scan on concept_assertion ca_1 in the plan is never executed (this is what 'explain analyze' tells me). I'm wondering, why this constrain is not executed first. By executing this first, we could just never execute everything else. Postgresql seems to have fine-grained statistics, why does it get the estimate for individual_name.name = 'http://www.University0.edu' exactly and is wrong on concept_assertion.concept = 5? I did execute analyze before. Thanks for your explanations, markus
pgsql-novice by date: