Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: WIP: BRIN multi-range indexes |
Date | |
Msg-id | d4aa7fa0-d06d-6584-9234-8c1696924dde@enterprisedb.com |
In response to | Re: WIP: BRIN multi-range indexes (John Naylor <john.naylor@enterprisedb.com>) |
Responses |
Re: WIP: BRIN multi-range indexes
|
List | pgsql-hackers |
On 1/26/21 7:52 PM, John Naylor wrote:
> On Fri, Jan 22, 2021 at 10:59 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> > On 1/23/21 12:27 AM, John Naylor wrote:
> >
> > > Still, it would be great if multi-minmax can be a drop-in
> > > replacement. I know there was a sticking point of a distance
> > > function not being available on all types, but I wonder if that
> > > can be remedied or worked around somehow.
> > >
> >
> > Hmm. I think Alvaro also mentioned he'd like to use this as a drop-in
> > replacement for minmax (essentially, using these opclasses as the
> > default ones, with the option to switch back to plain minmax). I'm not
> > convinced we should do that, though. Imagine you have minmax indexes in
> > your existing DB, it's working perfectly fine, and then we come and just
> > silently change that during dump/restore. Is there some past example
> > when we did something similar and it turned out to be OK?
>
> I was assuming pg_dump can be taught to insert explicit opclasses for
> minmax indexes, so that upgrade would not cause surprises. If that's
> true, only new indexes would have the different default opclass.
>

Maybe, I suppose we could do that. But I always found such changes
happening silently in the background a bit suspicious, because it may
be quite confusing. I certainly wouldn't expect such a difference
between creating a new index and an index created by dump/restore. Did
we do such changes in the past? That might be a precedent, but I don't
recall any example ...

> > As for the distance functions, I'm pretty sure there are data types
> > without "natural" distance - like most strings, for example. We could
> > probably invent something, but the question is how much we can rely on
> > it working well enough in practice.
> >
> > Of course, is minmax even the right index type for such data types?
> > Strings are usually "labels" and not queried using range queries,
> > although sometimes people encode stuff as strings (but then it's very
> > unlikely we'll define the distance definition well). So maybe for those
> > types a hash / bloom would be a better fit anyway.
>
> Right.
>
> > But I do have an idea - maybe we can do without distances, in those
> > cases. Essentially, the primary issue of minmax indexes is outliers, so
> > what if we simply sort the values, keep one range in the middle and as
> > many single points on each tail?
>
> That's an interesting idea. I think it would be a nice bonus to try to
> do something along these lines. On the other hand, I'm not the one
> volunteering to do the work, and the patch is useful as is.
>

IMO it's a fairly small amount of code, so I'll take a stab at it in
the next version of the patch.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
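
For illustration only, here is a minimal sketch of the distance-free idea mentioned in the message: sort the values, keep a single range in the middle, and store a few individual points on each tail so that outliers do not widen the range. This is not code from the patch. The function names, the `max_points_per_tail` parameter, and the tuple layout of the summary are all hypothetical, and the sketch only assumes that the values can be ordered.

```python
def summarize_without_distance(values, max_points_per_tail):
    """Build a summary as (low_points, middle_range, high_points).

    Hypothetical illustration: only ordering is used, never a distance.
    """
    ordered = sorted(set(values))
    k = max_points_per_tail
    # Few enough distinct values: keep them all as exact points, no range.
    if len(ordered) <= 2 * k:
        return ordered, None, []
    low_points = ordered[:k]
    high_points = ordered[len(ordered) - k:]
    middle = ordered[k:len(ordered) - k]
    return low_points, (middle[0], middle[-1]), high_points


def may_contain(summary, key):
    """Return False only when the summary proves the key is absent."""
    low_points, middle_range, high_points = summary
    if key in low_points or key in high_points:
        return True
    if middle_range is None:
        return False
    return middle_range[0] <= key <= middle_range[1]


# Two outliers (1 and 9999) around a dense cluster of values.
summary = summarize_without_distance(
    [1, 1000, 1001, 1002, 1003, 1004, 9999], max_points_per_tail=1
)
print(summary)                     # ([1], (1000, 1004), [9999])
print(may_contain(summary, 500))   # False; a plain min/max of [1, 9999] could not rule this out
print(may_contain(summary, 1002))  # True
```

With one point per tail, the two outliers are stored exactly and the middle range stays tight, which is the effect the message describes. How many points to keep per tail is left open in the message and is an arbitrary choice here.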