Re: matview niceties: pick any two of these three - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: matview niceties: pick any two of these three |
Date | |
Msg-id | 1367606094.67129.YahooMailNeo@web162901.mail.bf1.yahoo.com Whole thread Raw |
In response to | Re: matview niceties: pick any two of these three (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: matview niceties: pick any two of these three
|
List | pgsql-hackers |
Tom Lane <tgl@sss.pgh.pa.us> wrote: > The current matview design gets around this problem by requiring > that transition between scannable and unscannable states involve > a complete table rewrite, and thus the transactionality issue can > be hidden behind a transactional update of the matview's > pg_class.relfilenode field. IMO, that is obviously a dead-end > design, because we are going to want scannability status updates > associated with partial updates of the matview's contents. I don't think the change between populated and non-populated needs to change without a new heap being swapped in. To do incremental maintenance, one must be starting with a populated heap. What will change without a new heap is the contents of the matview and whether a populated matview is "fresh" enough to be scanned for some specific usage. There's a lot of bikeshedding left to be done on what options we need there and how they will be stored; but I don't see how that threatens the ability to use a zero-length heap to indicate that a matview has not been populated. Scannability and the populated state only happen to be the same thing for the moment, but improving the scannability test does not require a new mechanism for determining whether the matview has been populated. A new "populated" mechanism is, we all agree, desirable; but it's not necessary to improve scannability testing. We'll get into all kinds of design trouble if we conflate these two separate concepts. > Kevin's summary is leaving out one key desirable property: > > (4) ability to change scannability state without a full table > rewrite. > > Putting the state into pg_class would preserve that property. I could probably add a hundred other desirable properties for a materialized view implementation; I was trying to list the ones which were still up-for-grabs for the initial 9.3 implementation. If we want to make progress on catching up to other major databases on this, it will take progress in many releases; and every one of those will lack something that someone wants. OK, a lot of things that a lot of people want. But if we can't commit something that is less than complete or less than perfect to make incremental change, we're not going to be getting anywhere very fast. I'm not sure why you would want to make a non-populated matview scannable, anyway; which seems like the only reason to have (4) without more refined maintenance techniques. I plan to submit a patch to the first 9.4 CF for differential updates (generating the new contents in a temporary heap and transactionally applying the changes to the existing matview heap) as a REFRESH option. I'm working on estimates for the amount of work needed to support delta-driven incremental changes to matview contents based on the declaration of the matview for various types of matviews. On a preliminary basis, I think that we can have fast incremental updates in 9.4 without any imperative coding for all non-recursive views, although including support in 9.4 for incremental updates for views with window functions is still iffy, and some desirable optimizations for aggregates might not make it. I'm really very sure that incremental updates for recursive matviews will not be a 9.4 feature, since that is best handled by the "Delete and Rederive" (DRed) algorithm, while incremental maintenance for non-recursive matviews is best handled with the "counting" algorithm, and it seems unlikely that we'll get both into one major release. If I thought that the current hack for tracking the populated state would be permanent, or that it would be so hard to replace in a later release that it would impede further development, I would be the first one to rip it out. I haven't been able to see such a risk, and nobody has adequately explained where they think the risk lies. The "crash during extension of the first page" argument doesn't hold up unless you conflate the populated state with the scannable state, and it seems likely that the scannable will not only be separate, but able to differ for among backend processes. If adding unlogged materialized views requires reworking support for unlogged relations in general in a way that puts the data into the catalog, and that becomes my issue to solve, I can deal with that -- but that time would come out of time I would otherwise be spending working on supporting incremental update of certain classes of materialized views, and progress there will be slower than I've been hoping. It's all about trade-offs. If the unlogged recovery changes wind up on my plate, for example, it might mean that the next release won't be able to have incremental updates for matviews with a NOT EXISTS clause, since support for that looks to be roughly the same amount of work. I will certainly go along with the consensus, but from my perspective, living with the current hack for determining whether a matview has been populated is worth the benefit of having unlogged matviews in 9.3. I don't expect that hack to be viable for more that one or two releases; I just think that it's a practical matter of working with the tools we have *now* for unlogged relations to try to create a new type of unlogged relation. I'm all for better tools and will be happy to use them once they exist. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: