Re: Patch: pg_trgm: gin index scan performance for similarity search - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: Patch: pg_trgm: gin index scan performance for similarity search
Date	December 24, 2015 18:06:36
Msg-id	CAPpHfduvmuQRzmKUWG-i0EgAw=NhDH=3PfDQ6jdnpsxcSx0GvA@mail.gmail.com
In response to	Patch: pg_trgm: gin index scan performance for similarity search (Fornaroli Christophe <cfornaro@gmail.com>)
List	pgsql-hackers

Hi, Christophe!

On Thu, Dec 24, 2015 at 6:28 PM, Fornaroli Christophe <cfornaro@gmail.com> wrote:

This code uses this upper bound for the similarity: ntrue / (nkeys - ntrue). But if there are ntrue trigrams in common, we know that the indexed string is at least ntrue trigrams long. We can then use a more aggressive upper bound: ntrue / (ntrue + nkeys - ntrue), which simplifies to ntrue / nkeys. Attached is a patch that changes this.

Good catch, thank you! The estimate in pg_trgm was not optimal.

I think it would be good to add a comment that explicitly states why we use this upper bound.
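
A minimal sketch of the reasoning such a comment might capture, written in Python for illustration only (it is not the pg_trgm C code). It assumes similarity is computed as common / (len(query) + len(indexed) - common), that nkeys is the number of distinct query trigrams, and that ntrue is how many of them are present in the indexed string. The helper names and the simplified single-word trigram extraction are hypothetical. Because the indexed string has at least ntrue trigrams, the denominator is at least nkeys, so ntrue / nkeys bounds the similarity from above and is never looser than ntrue / (nkeys - ntrue).

```python
import math


def trigrams(word: str) -> set:
    """Simplified trigram extraction for a single word (assumption:
    two leading spaces and one trailing space of padding, lowercased)."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(query: set, indexed: set) -> float:
    """Number of shared trigrams divided by the size of the union."""
    common = len(query & indexed)
    union = len(query) + len(indexed) - common
    return common / union if union else 0.0


def old_upper_bound(ntrue: int, nkeys: int) -> float:
    """ntrue / (nkeys - ntrue): only assumes the indexed string has
    zero or more trigrams. Treated as unbounded when ntrue == nkeys."""
    if ntrue == nkeys:
        return math.inf
    return ntrue / (nkeys - ntrue)


def new_upper_bound(ntrue: int, nkeys: int) -> float:
    """ntrue / nkeys: uses the fact that the indexed string has at
    least ntrue trigrams, so the union has at least nkeys elements."""
    return ntrue / nkeys


# Illustrative comparison on one pair of words.
query = trigrams("postgres")
indexed = trigrams("postgresql")
ntrue = len(query & indexed)
nkeys = len(query)
print("similarity:", similarity(query, indexed))
print("old bound: ", old_upper_bound(ntrue, nkeys))
print("new bound: ", new_upper_bound(ntrue, nkeys))
```

In this example the old bound is far above 1, so it would let the item through for any threshold, while the new bound stays below 1 and can reject the item early when the threshold is high enough.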

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
