Thread: SP-GiST support for inet datatypes
Attached patches add SP-GiST support to the inet datatypes. The operator class comes with a small change on the SP-GiST framework to allow fixed number of child nodes.
The index is like prefix tree except that it doesn't bother to split the addresses into parts as text is split. It also doesn't use labels to know the part after the prefix, but relies on static node numbers.
> hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn = 2914;
> SELECT 732
>
> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON routes.route && hmm.route;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=0.41..571742.27 rows=2248 width=7) (actual time=12.643..20474.813 rows=8127 loops=1)
> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.017..0.524 rows=732 loops=1)
> -> Index Only Scan using route_gist on routes (cost=0.41..552.05 rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
> Index Cond: (route && (hmm.route)::inet)
> Heap Fetches: 8127
> Planning time: 1.507 ms
> Execution time: 20475.605 ms
> (7 rows)
>
> hasegeli=# DROP INDEX route_gist;
> DROP INDEX
>
> hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
> CREATE INDEX
>
> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON routes.route && hmm.route;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=0.41..588634.27 rows=2248 width=7) (actual time=0.081..16.961 rows=8127 loops=1)
> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.022..0.079 rows=732 loops=1)
> -> Index Only Scan using route_spgist on routes (cost=0.41..575.13 rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
> Index Cond: (route && (hmm.route)::inet)
> Heap Fetches: 8127
> Planning time: 1.376 ms
> Execution time: 15.936 ms
Attachment
On Wed, Mar 2, 2016 at 11:56 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
Attached patches add SP-GiST support to the inet datatypes. The operator class comes with a small change on the SP-GiST framework to allow fixed number of child nodes.The index is like prefix tree except that it doesn't bother to split the addresses into parts as text is split. It also doesn't use labels to know the part after the prefix, but relies on static node numbers.
Thanks, Emre for interesting spgist. We are bit busy and will take a look on your patches when come to our spgist patch.
The GiST index released with version 9.4 performs really bad with real world data. SP-GiST works much better with the query posted to the performance list [1] a while ago:> hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn = 2914;> SELECT 732>> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON routes.route && hmm.route;> QUERY PLAN> ----------------------------------------------------------------------------------------------------------------------------------------> Nested Loop (cost=0.41..571742.27 rows=2248 width=7) (actual time=12.643..20474.813 rows=8127 loops=1)> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.017..0.524 rows=732 loops=1)> -> Index Only Scan using route_gist on routes (cost=0.41..552.05 rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)> Index Cond: (route && (hmm.route)::inet)> Heap Fetches: 8127> Planning time: 1.507 ms> Execution time: 20475.605 ms> (7 rows)>> hasegeli=# DROP INDEX route_gist;> DROP INDEX>> hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);> CREATE INDEX>> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON routes.route && hmm.route;> QUERY PLAN> -----------------------------------------------------------------------------------------------------------------------------------------> Nested Loop (cost=0.41..588634.27 rows=2248 width=7) (actual time=0.081..16.961 rows=8127 loops=1)> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.022..0.079 rows=732 loops=1)> -> Index Only Scan using route_spgist on routes (cost=0.41..575.13 rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)> Index Cond: (route && (hmm.route)::inet)> Heap Fetches: 8127> Planning time: 1.376 ms> Execution time: 15.936 ms
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Mar 3, 2016 at 8:51 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
On Wed, Mar 2, 2016 at 11:56 PM, Emre Hasegeli <emre@hasegeli.com> wrote:Attached patches add SP-GiST support to the inet datatypes. The operator class comes with a small change on the SP-GiST framework to allow fixed number of child nodes.The index is like prefix tree except that it doesn't bother to split the addresses into parts as text is split. It also doesn't use labels to know the part after the prefix, but relies on static node numbers.Thanks, Emre for interesting spgist. We are bit busy and will take a look on your patches when come to our spgist patch.
Emre, I checked original thread and didn't find sample data. Could you provide them for testing ?
--The GiST index released with version 9.4 performs really bad with real world data. SP-GiST works much better with the query posted to the performance list [1] a while ago:> hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn = 2914;> SELECT 732>> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON routes.route && hmm.route;> QUERY PLAN> ----------------------------------------------------------------------------------------------------------------------------------------> Nested Loop (cost=0.41..571742.27 rows=2248 width=7) (actual time=12.643..20474.813 rows=8127 loops=1)> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.017..0.524 rows=732 loops=1)> -> Index Only Scan using route_gist on routes (cost=0.41..552.05 rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)> Index Cond: (route && (hmm.route)::inet)> Heap Fetches: 8127> Planning time: 1.507 ms> Execution time: 20475.605 ms> (7 rows)>> hasegeli=# DROP INDEX route_gist;> DROP INDEX>> hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);> CREATE INDEX>> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON routes.route && hmm.route;> QUERY PLAN> -----------------------------------------------------------------------------------------------------------------------------------------> Nested Loop (cost=0.41..588634.27 rows=2248 width=7) (actual time=0.081..16.961 rows=8127 loops=1)> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.022..0.079 rows=732 loops=1)> -> Index Only Scan using route_spgist on routes (cost=0.41..575.13 rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)> Index Cond: (route && (hmm.route)::inet)> Heap Fetches: 8127> Planning time: 1.376 ms> Execution time: 15.936 ms
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
> Emre, I checked original thread and didn't find sample data. Could you provide them for testing ? I found it on the Git history: https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=true
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Thu, Mar 3, 2016 at 11:45 AM, Emre Hasegeli<span dir="ltr"><<a href="mailto:emre@hasegeli.com" target="_blank">emre@hasegeli.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><spanclass="">> Emre, I checked original thread and didn't find sample data. Couldyou provide them for testing ?<br /><br /></span>I found it on the Git history:<br /><br /><a href="https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=true" rel="noreferrer" target="_blank">https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=true</a><br /></blockquote></div><br/></div><div class="gmail_extra">Thanks !<br /><br /></div><div class="gmail_extra">spgist indexcreates 2 times faster than gist, but index size is noticeably bugger <br /><br />\di+ route_*<br /> List of relations<br /> Schema | Name | Type | Owner | Table | Size | Description<br/>--------+--------------+-------+----------+--------+--------+-------------<br /> public | route_gist |index | postgres | routes | 96 MB |<br /> public | route_spgist | index | postgres | routes | 132 MB |<br />(2 rows)<br/><br /></div><div class="gmail_extra">Spgist index tree is much better than gist - 12149 pages vs 1334760 !<br/></div><div class="gmail_extra"><br /></div><div class="gmail_extra"><br /><br />EXPLAIN (ANALYZE, buffers) SELECT routes.routeFROM routes JOIN hmm ON<br />routes.route && hmm.route;<br /> QUERY PLAN<br />----------------------------------------------------------------------------------------------------------------------------------------<br /> NestedLoop (cost=0.41..570430.27 rows=2338 width=7) (actual time=5.730..12085.747 rows=8127 loops=1)<br /> Buffers:shared hit=1334760<br /> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.013..0.528rows=732 loops=1)<br /> Buffers: shared hit=4<br /> -> Index Only Scan using route_gist onroutes (cost=0.41..550.26 rows=22900 width=7) (actual time=2.491..16.503 rows=11 loops=732)<br /> Index Cond:(route && (hmm.route)::inet)<br /> Heap Fetches: 8127<br /> Buffers: shared hit=1334756<br /> Planningtime: 0.827 ms<br /> Execution time: 12086.513 ms<br />(10 rows)<br /><br />EXPLAIN (ANALYZE, buffers) SELECTroutes.route FROM routes JOIN hmm ON<br />routes.route && hmm.route;<br /> QUERY PLAN<br />-----------------------------------------------------------------------------------------------------------------------------------------<br /> NestedLoop (cost=0.41..588634.27 rows=2338 width=7) (actual time=0.043..12.150 rows=8127 loops=1)<br /> Buffers: sharedhit=12149<br /> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.013..0.075 rows=732 loops=1)<br/> Buffers: shared hit=4<br /> -> Index Only Scan using route_spgist on routes (cost=0.41..575.13rows=22900 width=7) (actual time=0.011..0.015 rows=11 loops=732)<br /> Index Cond: (route &&(hmm.route)::inet)<br /> Heap Fetches: 8127<br /> Buffers: shared hit=12145<br /> Planning time:0.779 ms<br /> Execution time: 12.603 ms<br />(10 rows)<br /></div><div class="gmail_extra"><br /><br /></div><div class="gmail_extra"><br/><br /></div></div>
On Tue, Mar 8, 2016 at 11:17 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
On Thu, Mar 3, 2016 at 11:45 AM, Emre Hasegeli <emre@hasegeli.com> wrote:> Emre, I checked original thread and didn't find sample data. Could you provide them for testing ?
I found it on the Git history:
https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=trueThanks !spgist index creates 2 times faster than gist, but index size is noticeably bugger
\di+ route_*
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+--------------+-------+----------+--------+--------+-------------
public | route_gist | index | postgres | routes | 96 MB |
public | route_spgist | index | postgres | routes | 132 MB |
(2 rows)Spgist index tree is much better than gist - 12149 pages vs 1334760 !
I also noticed, that spgist is much faster than gist for other inet operators. I'd like to see in 9.6.
EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
routes.route && hmm.route;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.41..570430.27 rows=2338 width=7) (actual time=5.730..12085.747 rows=8127 loops=1)
Buffers: shared hit=1334760
-> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.013..0.528 rows=732 loops=1)
Buffers: shared hit=4
-> Index Only Scan using route_gist on routes (cost=0.41..550.26 rows=22900 width=7) (actual time=2.491..16.503 rows=11 loops=732)
Index Cond: (route && (hmm.route)::inet)
Heap Fetches: 8127
Buffers: shared hit=1334756
Planning time: 0.827 ms
Execution time: 12086.513 ms
(10 rows)
EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
routes.route && hmm.route;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.41..588634.27 rows=2338 width=7) (actual time=0.043..12.150 rows=8127 loops=1)
Buffers: shared hit=12149
-> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual time=0.013..0.075 rows=732 loops=1)
Buffers: shared hit=4
-> Index Only Scan using route_spgist on routes (cost=0.41..575.13 rows=22900 width=7) (actual time=0.011..0.015 rows=11 loops=732)
Index Cond: (route && (hmm.route)::inet)
Heap Fetches: 8127
Buffers: shared hit=12145
Planning time: 0.779 ms
Execution time: 12.603 ms
(10 rows)
>> Spgist index tree is much better than gist - 12149 pages vs 1334760 ! I assume this is the reason why it is bigger. IP addresses are very well suited to SP-GiST. They naturally do not overlap. > I also noticed, that spgist is much faster than gist for other inet > operators. I'd like to see in 9.6. Unfortunately, I missed the deadline of the last commitfest. It is on the next one: https://commitfest.postgresql.org/10/571/
Emre Hasegeli <emre@hasegeli.com> writes: > Attached patches add SP-GiST support to the inet datatypes. I started to look at this patch. The reported speedup is pretty nice, but ... > The operator > class comes with a small change on the SP-GiST framework to allow fixed > number of child nodes. ... this part of the patch breaks the existing API for SP-GiST opclasses. That is a hard sell. It may only matter for one existing opclass in core, but unless we have reason to think nobody is using any custom SP-GiST opclasses, that is not a pleasant thing to do. How important is it really for this opclass? Several of the existing opclasses use fixed numbers of child nodes, so why does this need something they don't? regards, tom lane
> ... this part of the patch breaks the existing API for SP-GiST opclasses. > That is a hard sell. It may only matter for one existing opclass in core, > but unless we have reason to think nobody is using any custom SP-GiST > opclasses, that is not a pleasant thing to do. How important is it really > for this opclass? Several of the existing opclasses use fixed numbers of > child nodes, so why does this need something they don't? This addition is required to implement the operator class cleanly. We could store the id of the child node as a "label", and add nodes when labels are missing, but this complicates all operator class functions. Other designs are also possible using the labels, but a simple design with fixed number of nodes worked best for me. Currently, SP-GiST framework supports fixed number of child nodes, if the index is growing by page splits, not by prefix splits. This is not a fair API. I assume it is designed by only having the prefix tree in mind. The change makes SP-GiST more reusable. SP-GiST is not widely adopted in my experience. Breaking this part of the API would affect prefix tree implementations. I don't think there are much of them. Supporting the new interface is easy for them. We can try to support the old interface by side of the new one, but this wouldn't be very clean either. I thought we were breaking the C interface in similar ways on major releases.
Emre Hasegeli <emre@hasegeli.com> writes: >> ... Several of the existing opclasses use fixed numbers of >> child nodes, so why does this need something they don't? > Currently, SP-GiST framework supports fixed number of child nodes, if > the index is growing by page splits, not by prefix splits. This is > not a fair API. Hm, I see your point: the spgSplitTuple action only supports forming a new upper tuple with a single, labeled child node. That is pretty bogus. You could imagine doing repeated spgAddNode actions to enlarge the new upper tuple; but that's ridiculously inefficient, and it's also unclear how you'd avoid confusing searches that descend through a partially-split upper tuple (either concurrently, or after a crash that prevented finishing the split). > SP-GiST is not widely adopted in my experience. Breaking this part of > the API would affect prefix tree implementations. I don't think there > are much of them. Supporting the new interface is easy for them. We > can try to support the old interface by side of the new one, but this > wouldn't be very clean either. We could imagine defining a "spgSplitTupleMulti" action alongside the existing one, but I agree it's a bit of a wart --- nobody would have designed it that way in a green field. On the other hand, index AM-to- opclass APIs are things we don't like to break. We went out of our way to make past GiST and GIN API changes upward-compatible. Oleg, Teodor, do you have any idea how many people are using custom SPGiST opclasses that might be affected by an API break here? regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Emre Hasegeli <emre@hasegeli.com> writes:>> Attached patches add SP-GiST support to the inet datatypes. Tom> I started to look at this patch. The reported speedup is prettyTom> nice, but ... The builtin gist support for inet seems quite surprisingly slow; ip4r beats it into the ground without difficulty. (I'd be curious how well this sp-gist version stacks up against ip4r.) -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> I started to look at this patch. The reported speedup is pretty > Tom> nice, but ... > The builtin gist support for inet seems quite surprisingly slow; ip4r > beats it into the ground without difficulty. (I'd be curious how well > this sp-gist version stacks up against ip4r.) Yeah ... what I find interesting about this patch is that the opclass's data splitting rules don't seem all that much different from what the existing gist opclass knows how to do. I suspect that the speed differential arises because the gist opclass's penalty and picksplit algorithms are somehow extremely stupid, which hints that maybe they could be made better. In particular, IIRC the test cases that we looked at when making the gist opclass tended to have only one or a few netmask lengths, whereas in this data set the netmasks are all over the place. I theorize that that's what's breaking the gist algorithm. Not sure exactly how to make it better though. This example also seems to be showing up some lack of intelligence in SPGiST itself, in that there are an awful lot of pages with a lot of empty space on them. Not sure why. regards, tom lane
I've pushed this patch with mostly (not entirely) cosmetic adjustments. I still think it'd be worth looking into why the produced index is larger than the GiST equivalent, and for that matter exactly why the GiST equivalent is so much slower to search. regards, tom lane
I wrote: > I've pushed this patch with mostly (not entirely) cosmetic adjustments. > I still think it'd be worth looking into why the produced index is larger > than the GiST equivalent, and for that matter exactly why the GiST > equivalent is so much slower to search. I spent some time poking at this, and noticed that although the picksplit rule is guaranteed to produce a non-degenerate split unless the inputs are all identical, it's entirely capable of producing a very bad split. Here's some instrumentation printout showing the numbers of items assigned to the four subnodes, for the test case we've been working with: 58 inet picksplit (227 tuples): 0 0 127 100 65 inet picksplit (227 tuples): 0 0 223 4 69 inet picksplit (225 tuples):2 0 126 97 69 inet picksplit (227 tuples): 1 0 226 0 72 inet picksplit (227 tuples): 0 0 224 3 72 inet picksplit(227 tuples): 1 0 132 94 82 inet picksplit (226 tuples): 1 0 127 98 86 inet picksplit (227 tuples): 0 0 13097 95 inet picksplit (227 tuples): 0 0 2 225 99 inet picksplit (227 tuples): 0 0 225 2 117 inet picksplit (227tuples): 2 0 126 99 118 inet picksplit (227 tuples): 0 0 128 99 137 inet picksplit (227 tuples): 0 0 226 1 151inet picksplit (227 tuples): 0 0 1 226 270 inet picksplit (227 tuples): 1 0 127 99 499 inet picksplit (122 tuples):0 0 64 58 (This is from "sort | uniq -c" output, so the first column is the number of occurrences of identical split counts.) Aside from the disturbing frequency of 100-to-1 split ratios, it also looks like the inclusion of the masklen bit is hardly pulling its weight, though that might be a artifact of this data set. The bad splits seem to be a direct contributor to the index's relatively poor space efficiency; they lead to SPGiST having to move nearly all of a long leaf chain to another page, and then soon having to split it again, resulting in another mostly-empty page, lather rinse repeat. They can't be very helpful for search speed either. I think that it'd be worth investigating changing to a split rule that uses the next two or three or four bits of the address, not just the single next bit, to index into more than four subnodes. It's pretty clear how to adjust the insertion functions to support that, and a crude hack in that line suggested that the index does get noticeably smaller. However, I didn't quite grok how to adjust the search (consistent) functions. Obviously we could apply the same rules in a loop, considering each successive address bit, but there may be a better way. Even though I already committed the code, we're a very long way from v10 release, so I see no reason not to consider incompatible changes in the way this opclass works. I also noticed that the index build wastes some space because SPGIST remembers only one partly-full page within each of its page categories. A prototype patch to enlarge the size of that cache helped a good bit too. I'll clean that up and post it later, but I was hoping you'd work on improving this opclass's split rules. regards, tom lane
> Aside from the disturbing frequency of 100-to-1 split ratios, it also > looks like the inclusion of the masklen bit is hardly pulling its weight, > though that might be a artifact of this data set. I was expecting this. Including masklen bit to decision was something we could easily do. It doesn't make the index more complicated, even more simpler. I think it would be useful, when host addresses with masklen are indexed. IRRExplorer dataset is just networks. > I think that it'd be worth investigating changing to a split rule that > uses the next two or three or four bits of the address, not just the > single next bit, to index into more than four subnodes. It's pretty clear > how to adjust the insertion functions to support that, and a crude hack in > that line suggested that the index does get noticeably smaller. However, > I didn't quite grok how to adjust the search (consistent) functions. > Obviously we could apply the same rules in a loop, considering each > successive address bit, but there may be a better way. I have experimented with such designs before posting this patch, but couldn't make any of them work. It gets very complicated when more than one bit is used. When only 2 bits are used, there would be 12 child nodes: * network bits 00 * network bits 01 * network bits 10 * network bits 11 * network bit 0 and host bit 0 * network bit 0 and host bit 1 * network bit 1 and host bit 0 * network bit 1 and host bit 1 * host bits 00 * host bits 01 * host bits 10 * host bits 11 Another design would be a prefix tree. We could split by using a byte, and store that byte as label. Then there wouldn't be static number of nodes. We would need to iterate trough the labels and re-execute the condition on all of them. Surely this would perform better for some working sets, but I don't think on all them. I will experiment with this, if I get the chance. I belive the current index is useful as it is. The wasted space depends on the distribution of the keys: > # create table a3 as select (trunc(random() * 256)::text || '.' || trunc(random() * 8)::text || '.0.0')::inet from generate_series(1,1000000) as i; > SELECT 1000000 > # create table a4 as select (trunc(random() * 256)::text || '.' || trunc(random() * 16)::text || '.0.0')::inet from generate_series(1,1000000) as i; > SELECT 1000000 > # create table a5 as select (trunc(random() * 256)::text || '.' || trunc(random() * 32)::text || '.0.0')::inet from generate_series(1,1000000) as i; > SELECT 1000000 > # create table a6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1,1000000) as i; > SELECT 1000000 > # create index on a3 using spgist (inet); > CREATE INDEX > # create index on a4 using spgist (inet); > CREATE INDEX > # create index on a5 using spgist (inet); > CREATE INDEX > # create index on a6 using spgist (inet); > CREATE INDEX > # \di+ > List of relations > Schema | Name | Type | Owner | Table | Size | Description > -------+-------------+-------+----------+-------+-------+------------- > public | a3_inet_idx | index | hasegeli | a3 | 42 MB | > public | a4_inet_idx | index | hasegeli | a4 | 46 MB | > public | a5_inet_idx | index | hasegeli | a5 | 55 MB | > public | a6_inet_idx | index | hasegeli | a6 | 56 MB | > (4 rows) It also gets better with the number of keys: > # create table a6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1,1000000) as i; > SELECT 1000000 > # create table b6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1,2000000) as i; > SELECT 2000000 > # create table c6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1,4000000) as i; > SELECT 4000000 > # create table d6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1,8000000) as i; > SELECT 8000000 > # create index on a6 using spgist (inet); > CREATE INDEX > # create index on b6 using spgist (inet); > CREATE INDEX > # create index on c6 using spgist (inet); > CREATE INDEX > # create index on d6 using spgist (inet); > CREATE INDEX > # \di+ > List of relations > Schema | Name | Type | Owner | Table | Size | Description > -------+-------------+-------+----------+-------+--------+------------- > public | a6_inet_idx | index | hasegeli | a6 | 56 MB | > public | b6_inet_idx | index | hasegeli | b6 | 109 MB | > public | c6_inet_idx | index | hasegeli | c6 | 181 MB | > public | d6_inet_idx | index | hasegeli | a6 | 336 MB | > (4 rows) Thank you for committing this.