From: Jeremy Schneider
Subject: Re: Proposal: Global Index
Msg-id: 8cd6b125-2f3f-01bc-f41b-5c8343685a21@amazon.com
In response to: Re: Proposal: Global Index (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On 10/30/19 10:27, Andres Freund wrote:
> On 2019-10-30 13:05:57 -0400, Tom Lane wrote:
>> Peter Geoghegan <pg@bowt.ie> writes:
>>> On Wed, Oct 30, 2019 at 9:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Well, the *effects* of the feature seem desirable, but that doesn't
>>>> mean that we want an implementation that actually has a shared index.
>>>> As soon as you do that, you've thrown away most of the benefits of
>>>> having a partitioned data structure in the first place.
>>
>>> Right, but that's only the case for the global index. Global indexes
>>> are useful when used judiciously.
>>
>> But ... why bother with partitioning then? To me, the main reasons
>> why you might want a partitioned table are
>
> Quite commonly there's a lot of *other* indexes, often on a lot wider
> data than the primary key, that don't need to be global. And whereas in
> a lot of cases the primary key in a partitioned table has pretty good
> locality (and thus will be mostly buffered IO), other indexes will often
> not have that property (i.e. not have much correlation with table
> position).

I asked around a little bit and got some interesting responses. Thought
I'd pass two of them along.

One person worked on a payments network (150,000+ installed readers).
The transaction table was date-partitioned (one partition per day) on
insert timestamp, but lookups and updates were typically by the unique
transaction id. It was an Oracle database; they kept 180 daily
partitions with several million rows per day. Transactions did not
arrive in order: they could be delayed when some part of the network was
slow (they opted to allow the $2 charge rather than reject sales), and
again when the cash transaction records were uploaded.

Step one of their PG conversion created a read replica on PG 9.6, and
the cost of doing the individual index lookups across 180 partitions
(and 180 indexes) was very high. So they stored the max and min txn id
per partition and would generate a query listing all the dates that a
txn id could have been in, so that only a small number of partition
indexes would be accessed. They wanted a global index on txn id for
performance, not for uniqueness: the id was generated on the reader with
guid-like semantics. Roughly, the workaround looked like the sketch
below.
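For concreteness, here's a minimal sketch of that pruning trick. All
table and column names are hypothetical (my reconstruction of the
approach, not their actual schema):

    -- One row per daily partition, maintained by the load process,
    -- recording the range of txn ids that landed in that partition.
    CREATE TABLE txn_id_ranges (
        part_date   date PRIMARY KEY,
        min_txn_id  text NOT NULL,
        max_txn_id  text NOT NULL
    );

    -- Step 1: find the handful of dates whose recorded id range could
    -- contain the target id (ids didn't arrive in order, so several
    -- partitions may match).
    SELECT part_date
    FROM txn_id_ranges
    WHERE 'c0ffee-1234' BETWEEN min_txn_id AND max_txn_id;

    -- Step 2: generate the real lookup with those dates inlined as
    -- literals, so the planner prunes to a few partitions instead of
    -- probing all 180 partition indexes.
    SELECT *
    FROM transactions
    WHERE txn_id = 'c0ffee-1234'
      AND part_date IN ('2019-10-02', '2019-10-03');  -- from step 1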
A second person worked on several large-scale systems. He relayed that
in some cases where they used Oracle global indexes on partitioned
tables, they ended up reversing that decision as things scaled, because
of restrictive locking during partition maintenance (this is the exact
issue Tom points out). So even on a database _with_ the option of using
a global index, they've sometimes opted for "workaround" design patterns
instead:

* To solve uniqueness, manage serialization at the application level.
Isolate operations (e.g. using a queue) and use that to make sure that
two sessions don't try to insert the same record at the same time. From
the RDBMS's point of view, this looks like a separate, smaller table
that is being used to manage work activity.

* To solve the additional IO for a global table scan ... we often don't
need to, because the load in this pattern is not typically highly
concurrent. If we are looking for higher concurrency, we can usually add
a hack/workaround that filters on a partition key to provide "pretty
good" pruning. The net result is that you get 2-3x the IO due to the
lack of a global index (the same workaround as in the first story
above).

Quote: "So ... I don't actually like the idea of introducing this.
Unless someone can solve the ugly challenges we have had [around
partition maintenance operations]."

I actually don't think those challenges are so unsolvable. I think that
global indexes will be irrelevant to most workloads, but I'm not
convinced they won't be useful to a few people with specific workloads
and large amounts of data in PostgreSQL, where the benefits outweigh the
costs. I definitely agree that care needs to be taken around index
maintenance operations if there's an effort here.

>> * ability to cheaply add and remove partitions, primarily so that
>> you can cheaply do things like "delete the oldest month's data".
>
> You can still do that to some degree with a global index. Imagine
> e.g. keeping a 'partition id' as a sort-of column in the global
> index. That allows you to drop the partition, without having to
> immediately rebuild the index, by checking the partition id against the
> live partitions during lookup. So sure, you're wasting space for a bit
> in the global index, but it'll also be space that is likely to be
> fairly efficiently reclaimed the next time vacuum touches the index.
> And if not, the global index can be rebuilt concurrently without
> blocking writes.

Another idea might be to leverage PostgreSQL's partial indexes. If the
index is created "where date>2020" and you're dropping a partition from
2019, then you can entirely ignore the index. Not a panacea for every
index maintenance operation, but for the super-common case of dropping
the oldest partition you can now:

1) create a new index concurrently "where dt>2020"
2) drop the old index
3) drop the 2019 partition

It doesn't solve world hunger, but there's lots of benefit in such a
simple hack. A sketch of the sequence is below.
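Since global indexes don't exist in PostgreSQL today, this is only a
sketch of how the dance might look if a (hypothetical) global index
supported a predicate; the table and index names are made up:

    -- 1) Build the replacement index, excluding the partition that is
    --    about to be dropped. CONCURRENTLY keeps writers unblocked.
    CREATE INDEX CONCURRENTLY txns_id_2020 ON transactions (txn_id)
        WHERE dt > '2020-01-01';

    -- 2) Retire the old index, whose predicate still covered 2019.
    DROP INDEX CONCURRENTLY txns_id_2019;

    -- 3) Drop the oldest partition. The new index never referenced it,
    --    so no global-index cleanup or rebuild is needed.
    DROP TABLE transactions_2019;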
>> * ability to scale past our limits on the physical size of one table
>> --- both the hard BlockNumber-based limit, and the performance
>> constraints of e.g. vacuuming a very large table.
>
> For that to be a problem for a global index the global index (which will
> often be something like two int4 or int8 columns) itself would need to
> be above the block number based limit - which doesn't seem that close.
>
> WRT vacuuming - based on my observations the table itself isn't a
> performance problem for vacuuming all that commonly anymore, it's the
> associated index scans. So yea, that's a problem.

I'm sure zheap will make all our dreams come true, right? :D

-Jeremy

--
Jeremy Schneider
Database Engineer
Amazon Web Services