Re: Materialized views WIP patch - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Materialized views WIP patch |
Date | |
Msg-id | CA+U5nMKSrF=j54fAns4pYQsrszK7L=Vkbqt2xxVT5pc738qegw@mail.gmail.com Whole thread Raw |
In response to | Re: Materialized views WIP patch (Kevin Grittner <kgrittn@ymail.com>) |
Responses |
Re: Materialized views WIP patch
|
List | pgsql-hackers |
On 5 March 2013 12:15, Kevin Grittner <kgrittn@ymail.com> wrote: > I don't think I disagree with any of what Simon says other than his > feelings about the planning cost. Imagine that there are ten MVs > that might apply to a complex query, but some of them are mutually > exclusive, so there are a large number of permutations of MVs which > could be used to replace parts of the original query. And maybe > some of base tables have indexes which could reduce execution cost > which aren't present in some or all of the MVs. And each MV has a > number of indexes. The combinatorial explosion of possible plans > would make it hard to constrain plan time without resorting to some > crude rules about what to choose. That's not an unsolvable > problem, but I see it as a pretty big problem. If we are proposing that we shouldn't try to optimise because its not usefully solvable, then I would disagree. If we are saying that more plans are possible with MVs, then I'd say, yes there *could* be - that's the one of the purposes. That represents more options for optimisation and we should be happy, not sad about that. Yes, we would need some thought to how those potential optimisations can be applied without additional planning cost, but I see that as a long term task, not a problem. The question is will the potential for additional planning time actually materialise into a planning problem? (See below). > I have no doubt that someone could take a big data warehouse and > add one or two MVs and show a dramatic improvement in the run time > of a query. Almost as big as if the query were rewritten to usee > the MV directly. It would make for a very nice presentation, and > as long as they are used sparingly this could be a useful tool for > a data warehouse environment which is playing with alternative ways > to optimize slow queries which pass a lot of data. In other > environments, I feel that it gets a lot harder to show a big win. Are there realistically going to be more options to consider? In practice, no, because in most cases we won't be considering both MVs and indexes. Splatting MVs on randomly won't show any more improvement than splatting indexes on randomly helps. Specific optimisations help in specific cases only. And of course, adding extra data structures that have no value certainly does increase planning time. Presumably we'd need some way of seeing how frequently MVs were picked, so we could drop unused MVs just like we can indexes. * Indexes are great at speeding up various kinds of search queries. If you don't have any queries like that, they help very little. * MVs help in specific and restricted use cases, but can otherwise be thought of as a new kind of index structure. MVs help with joins and aggregations, so if you don't do much of that, you'll see no benefit. That knowledge also allows us to develop heuristics for sane optimisation. If the MV has a GROUP BY in it, and a query doesn't, then it probably won't help much to improve query times. If it involves a join you aren't using, then that won't help either. My first guess would be that we don't even bother looking for MV plans unless it has an aggregated result, or a large scale join. We do something similar when we look for plans that use indexes when we have appropriate quals - no quals, no indexes. As a result, I don't see MVs increasing planning times for requests that would make little or no use of them. There will be more planning time on queries that could make use of them and that is good because we really care about that. Sure, a badly written planner might cost more time than it saves. All of this work requires investment from someone with the time and/or experience to make a good go at it. I'm not pushing Tom towards it, nor anyone else, but I do want to see the door kept open for someone to do this when possible (i.e. not GSoC). -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: