Re: Inheritance planner CPU and memory usage change since 9.3.2 - Mailing list pgsql-hackers
From | Dean Rasheed |
---|---|
Subject | Re: Inheritance planner CPU and memory usage change since 9.3.2 |
Date | |
Msg-id | CAEZATCVePyt1Xd5tZ5sW7VGMqquuRgufvyiqO0JkpHgBrvP8jQ@mail.gmail.com Whole thread Raw |
In response to | Re: Inheritance planner CPU and memory usage change since 9.3.2 (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Inheritance planner CPU and memory usage change since 9.3.2
|
List | pgsql-hackers |
On 18 June 2015 at 14:48, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jun 17, 2015 at 9:56 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Jun 17, 2015 at 9:32 AM, Thomas Munro >> <thomas.munro@enterprisedb.com> wrote: >>> We saw a rather extreme performance problem in a cluster upgraded from >>> 9.1 to 9.3. It uses a largish number of child tables (partitions) and >>> many columns. Planning a simple UPDATE via the base table started >>> using a very large amount of memory and CPU time. >>> >>> My colleague Rushabh Lathia tracked the performance change down to >>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c03ad5602f529787968fa3201b35c119bbc6d782 >>> . >>> >>> The call to copyObject in the loop introduced here seems to be >>> problematic (copying append_rel_list for every element in >>> append_rel_list unconditionally), though we're still trying to figure >>> it out. Attached is a simple repro script, with variables to tweak. >>> >>> Quite a few others have posted about this sort of thing and been >>> politely reminded of the 100 table caveat [1][2] which is fair enough, >>> but the situations seems to have got dramatically worse for some users >>> after an upgrade. >> >> Yes. The copyObject() call introduced by this commit seems to have >> complexity O(T^2*C) where T is the number of child tables and C is the >> number of columns per child. That's because the List that is being >> copied is a list of AppendRelInfo nodes, each of which has a >> translated_vars member that is a list of every Var in one table, and >> we copy that list once per child. >> >> It appears that in a lot of cases this work is unnecessary. The >> second (long) for loop in inheritance_planner copies root->rowMarks >> and root->append_rel_list so as to be able to apply ChangeVarNodes() >> to the result, but we only actually do that if the rte is of type >> RTE_SUBQUERY or if it has security quals. In the common case where we >> reach inheritance_planner not because of UNION ALL but just because >> the table has a bunch of inheritance children (none of which have RLS >> policies applied), we copy everything and then modify none of it, >> using up startling large amounts of memory in ways that pre-9.2 >> versions did not. > > The attached patch helps. It does two things: > > 1. It arranges for inheritance_planner to throw away the memory > consumed by the subroot's rowMarks and append_rel_list after the call > to grouping_planner for that subroot returns. This prevents the > explosive growth of memory usage in all cases I've tested so far, but > planning is still really slow. > > 2. It arranges not to deep-copy append_rel_list when the root's > append_rel_list doesn't need to be modified for the subroot. This > makes planning much much faster in simple cases, like a simple update > on a table with many partitions. But if you do attach a FROM clause > containing a subquery to such an update, then this optimization > doesn't kick in any more and things are still very slow (though still > memory-bounded, due to part 1). > > I feel I might be missing a trick here. It seems unlikely to me that > we actually need the entire append_rel_list for every subquery; and we > almost certainly don't need to modify every element of the > append_rel_list for every subquery. Even the ones that no > ChangeVarNodes() call mutates still get deep-copied. > Yeah, you could probably pre-compute the indexes of the RTEs that need to copied, outside of the big loop, and store them in a bitmapset. Then, instead of copying the entire list of rowmarks/append_rel_infos each time, you could just copy the ones that referred to those RTE indexes (and only if the bitmapset was non-empty, which is the equivalent of your second optimisation). However, for AppendRelInfos, ChangeVarNodes() descends into the Vars in the translated_vars list, so short-cutting the copying of the AppendRelInfo isn't obviously safe. But, looking more closely, does ChangeVarNodes actually need to examine translated_vars (the fall-through case) when child_relid isn't the old rt_index? If not, that could be a big saving in cases like this. Regards, Dean
pgsql-hackers by date: