Home > mailing lists

Re: Using multiple extended statistics for estimates - Mailing list pgsql-hackers

From	Mark Dilger
Subject	Re: Using multiple extended statistics for estimates
Date	November 30, 2019 23:01:31
Msg-id	e9b8a0a2-024c-8819-ca09-9773d3719844@gmail.com Whole thread Raw
In response to	Re: Using multiple extended statistics for estimates (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses	Re: Using multiple extended statistics for estimates
List	pgsql-hackers

Tree view


On 11/14/19 12:04 PM, Tomas Vondra wrote:
> On Thu, Nov 14, 2019 at 10:23:44AM -0800, Mark Dilger wrote:
>>
>>
>> On 11/14/19 7:55 AM, Tomas Vondra wrote:
>>> On Wed, Nov 13, 2019 at 10:04:36AM -0800, Mark Dilger wrote:
>>>>
>>>>
>>>> On 11/13/19 7:28 AM, Tomas Vondra wrote:
>>>>> Hi,
>>>>>
>>>>> here's an updated patch, with some minor tweaks based on the review 
>>>>> and
>>>>> added tests (I ended up reworking those a bit, to make them more like
>>>>> the existing ones).
>>>>
>>>> Thanks, Tomas, for the new patch set!
>>>>
>>>> Attached are my review comments so far, in the form of a patch 
>>>> applied on top of yours.
>>>>
>>>
>>> Thanks.
>>>
>>> 1) It's not clear to me why adding 'const' to the List parameters would
>>>   be useful? Can you explain?
>>
>> When I first started reviewing the functions, I didn't know if those 
>> lists were intended to be modified by the function.  Adding 'const' 
>> helps document that the function does not intend to change them.
>>
> 
> Hmmm, ok. I'll think about it, but we're not really using const* in this
> way very much I think - at least not in the surrounding code.
> 
>>> 2) I think you're right we can change find_strongest_dependency to do
>>>
>>>    /* also skip weaker dependencies when attribute count matches */
>>>    if (strongest->nattributes == dependency->nattributes &&
>>>        strongest->degree >= dependency->degree)
>>>        continue;
>>>
>>>   That'll skip some additional dependencies, which seems OK.
>>>
>>> 3) It's not clear to me what you mean by
>>>
>>>     * TODO: Improve this code comment.  Specifically, why would we
>>>     * ignore that no rows will match?  It seems that such a discovery
>>>     * would allow us to return an estimate of 0 rows, and that would
>>>     * be useful.
>>>
>>>   added to dependencies_clauselist_selectivity. Are you saying we
>>>   should also compute selectivity estimates for individual clauses and
>>>   use Min() as a limit? Maybe, but that seems unrelated to the patch.
>>
>> I mean that the comment right above that TODO is hard to understand. 
>> You seem to be saying that it is good and proper to only take the 
>> selectivity estimate from the final clause in the list, but then go on 
>> to say that other clauses might prove that no rows will match.  So 
>> that implies that by ignoring all but the last clause, we're ignoring 
>> such other clauses that prove no rows can match.  But why would we be 
>> ignoring those?
>>
>> I am not arguing that your code is wrong.  I'm just critiquing the 
>> hard-to-understand phrasing of that code comment.
>>
> 
> Aha, I think I understand now - thanks for the explanation. You're right
> the comment is trying to explain why just taking the last clause for a
> given attnum is fine. I'll try to make the comment clearer.

Are you planning to submit a revised patch for this?



-- 
Mark Dilger

pgsql-hackers by date:

From: David Fetter
Date: 30 November 2019, 22:23:15
Subject: Re: Make autovacuum sort tables in descending order of xid_age

From: Mark Dilger
Date: 30 November 2019, 23:22:26
Subject: Re: Should we add xid_current() or a int8->xid cast?

Re: Using multiple extended statistics for estimates - Mailing list pgsql-hackers

Previous

Next