Home > mailing lists

Re: GiST support for inet datatypes - Mailing list pgsql-hackers

From	Andreas Karlsson
Subject	Re: GiST support for inet datatypes
Date	January 31, 2014 01:58:12
Msg-id	52EB0331.5090500@proxel.se Whole thread Raw
In response to	Re: GiST support for inet datatypes (Emre Hasegeli <emre@hasegeli.com>)
Responses	Re: GiST support for inet datatypes
List	pgsql-hackers

Tree view

On 01/23/2014 11:22 PM, Emre Hasegeli wrote:
 > inet-gist
 > ---------
 > I realized that create extension btree_gist fails with the patch.
>
> ERROR:  could not make operator class "gist_inet_ops" be default for type inet
> DETAIL:  Operator class "inet_ops" already is the default.
>
> I think making the new operator class default makes more sense. Name conflict
> can be a bigger problem. Should I rename the new operator class?

I agree that the new operator class should be the default one, it is
more useful and also not buggy. No idea about the naming.

>> The only thing is that I think your index would do poorly on the case where
>> all values share a prefix before the netmask but have different values after
>> the netmask, since gist_union and gist_penalty does not care about the bits
>> after the netmask. Am I correct? Have you thought anything about this?
>
> Yes, I have tried to construct the index with ip_maxbits() instead of ip_bits()
> but I could not make it work with the cidr type. It can be done by adding
> operator as a parameter for union, penalty and picksplit functions on the
> GiST framework. I am not sure it worths the trouble. It would only help
> the basic comparison operators and it would slow down other operators because
> network length comparison before network address would not be possible for
> the intermediate nodes. I mentioned this behavior of the operator class on
> the paragraph added to the documentation.

I think this is fine since it does not seem like a very useful case to
support to me. Would be worth doing only if it is easy to do.

A separate concern: we still have not come to a decision about operators
for inet here. I do not like the fact that the operators differ between
ranges and inet. But I feel this might be out of scope for this patch.

Does any third party want to chime in with an opinion?

Current inet/cidr
<<     is contained within
<<=     is contained within or equals
 >>     contains
 >>=     contains or equals

Range types
@>     contains range
<@     range is contained by
&&     overlap (have points in common)
<<     strictly left of
 >>     strictly right of

>> inet-selfuncs
>> -------------
>>
>> Here I have to honestly admit I am not sure if I can tell if your
>> selectivity function is correct. Your explanation sounds convincing and the
>> general idea of looking at the histogram is correct. I am just have no idea
>> if the details are right.
>
> I tried to improve it in this version. I hope it is more readable now. I used
> the word inclusion instead of overlap, made some changes on the algorithm
> to make it more suitable to the other operators.
>
> Using the histogram for inclusion which is originally for basic comparison,
> is definitely not correct. It is an empirical approach. I have tried several
> versions of it, tested them with different data sets and thought it is better
> than nothing.

Thanks for the updates. The code in this version is cleaner and easier
to follow.

I am not convinced of your approach to calculating the selectivity from
the histogram. The thing I have the problem with is the clever trickery
involved with how you handle different operator types. I prefer the
clearer code of the range types with how calc_hist_selectivity_scalar is
used. Is there any reason for why that approach would not work here or
result in worse code?

I have attached a patch where I improved the language of your comment
describing the workings of the selectivity estimation. Could you please
check it so I did not change the meaning of any of it?

I see from the tests that you still are missing selectivity functions
for operators, what is your plan for this?

--
Andreas Karlsson

Attachment

inet-selfuncs-commentfix.patch

pgsql-hackers by date:

From: Vik Fearing
Date: 31 January 2014, 01:14:56
Subject: Re: [PATCH] pg_sleep(interval)

From: Peter Geoghegan
Date: 31 January 2014, 02:33:32
Subject: Re: Making strxfrm() blobs in indexes work

Re: GiST support for inet datatypes - Mailing list pgsql-hackers

Attachment

Previous

Next