Re: Using multiple extended statistics for estimates - Mailing list pgsql-hackers
From | Mark Dilger |
---|---|
Subject | Re: Using multiple extended statistics for estimates |
Date | |
Msg-id | e9b8a0a2-024c-8819-ca09-9773d3719844@gmail.com Whole thread Raw |
In response to | Re: Using multiple extended statistics for estimates (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: Using multiple extended statistics for estimates
|
List | pgsql-hackers |
On 11/14/19 12:04 PM, Tomas Vondra wrote: > On Thu, Nov 14, 2019 at 10:23:44AM -0800, Mark Dilger wrote: >> >> >> On 11/14/19 7:55 AM, Tomas Vondra wrote: >>> On Wed, Nov 13, 2019 at 10:04:36AM -0800, Mark Dilger wrote: >>>> >>>> >>>> On 11/13/19 7:28 AM, Tomas Vondra wrote: >>>>> Hi, >>>>> >>>>> here's an updated patch, with some minor tweaks based on the review >>>>> and >>>>> added tests (I ended up reworking those a bit, to make them more like >>>>> the existing ones). >>>> >>>> Thanks, Tomas, for the new patch set! >>>> >>>> Attached are my review comments so far, in the form of a patch >>>> applied on top of yours. >>>> >>> >>> Thanks. >>> >>> 1) It's not clear to me why adding 'const' to the List parameters would >>> be useful? Can you explain? >> >> When I first started reviewing the functions, I didn't know if those >> lists were intended to be modified by the function. Adding 'const' >> helps document that the function does not intend to change them. >> > > Hmmm, ok. I'll think about it, but we're not really using const* in this > way very much I think - at least not in the surrounding code. > >>> 2) I think you're right we can change find_strongest_dependency to do >>> >>> /* also skip weaker dependencies when attribute count matches */ >>> if (strongest->nattributes == dependency->nattributes && >>> strongest->degree >= dependency->degree) >>> continue; >>> >>> That'll skip some additional dependencies, which seems OK. >>> >>> 3) It's not clear to me what you mean by >>> >>> * TODO: Improve this code comment. Specifically, why would we >>> * ignore that no rows will match? It seems that such a discovery >>> * would allow us to return an estimate of 0 rows, and that would >>> * be useful. >>> >>> added to dependencies_clauselist_selectivity. Are you saying we >>> should also compute selectivity estimates for individual clauses and >>> use Min() as a limit? Maybe, but that seems unrelated to the patch. >> >> I mean that the comment right above that TODO is hard to understand. >> You seem to be saying that it is good and proper to only take the >> selectivity estimate from the final clause in the list, but then go on >> to say that other clauses might prove that no rows will match. So >> that implies that by ignoring all but the last clause, we're ignoring >> such other clauses that prove no rows can match. But why would we be >> ignoring those? >> >> I am not arguing that your code is wrong. I'm just critiquing the >> hard-to-understand phrasing of that code comment. >> > > Aha, I think I understand now - thanks for the explanation. You're right > the comment is trying to explain why just taking the last clause for a > given attnum is fine. I'll try to make the comment clearer. Are you planning to submit a revised patch for this? -- Mark Dilger
pgsql-hackers by date: