Re: parallel joins, and better parallel explain - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: parallel joins, and better parallel explain
Date:
Msg-id: CA+TgmoYo=V7Wyhr_M4GdFFRp=pa4etm5W62M9trcA2ESVTLAPA@mail.gmail.com
In response to: Re: parallel joins, and better parallel explain (Dilip Kumar <dilipbalaut@gmail.com>)
Responses: Re: parallel joins, and better parallel explain
List: pgsql-hackers
On Mon, Jan 4, 2016 at 8:52 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> One strange behaviour, after increasing number of processor for VM,
> max_parallel_degree=0; is also performing better.

So, you went from 6 vCPUs to 8? In general, adding more CPUs means that there is less contention for CPU time, but if you already had 6 CPUs and nothing else running, I don't know why the backend running the query would have had a problem getting a whole CPU to itself. If you previously had only 1 or 2 CPUs, there might have been some competition for CPU time with background processes, but if you had 6, I don't know why the max_parallel_degree=0 case got faster with 8.

Anyway, I humbly suggest that this query isn't the right place to put our attention. There's no reason why we can't improve things further in the future, and if it turns out that lots of people have problems with the cost estimates on multi-batch parallel hash joins, then we can revise the cost model. We wouldn't treat a single query where a non-parallel multi-batch hash join runs slower than the costing would suggest as a reason to revise the cost model for that case, and I don't think this patch should be held to a higher standard. In this particular case, you can easily make the problem go away by tuning configuration parameters, which seems like an acceptable answer for people who run into this, unless it becomes clear that this particular problem is widespread and can't be solved without configuration changes that introduce other issues at the same time.

Keep in mind that, right now, the patch is doing just about the simplest thing possible, and that's working pretty well. Anything we change at this point is going to add more complexity than what I've got right now, and more than we've got in the corresponding non-parallel case. That's OK, but I think it's appropriate that we only do that if we're pretty sure those changes are going to be an improvement. And I think, by and large, we don't have enough perspective on this to know that at this point. Until this gets some wider testing, which probably isn't going to happen very much until this gets committed, it's hard to say which problems are just things we're artificially creating in the lab and which ones are going to be annoyances in the real world.

Barring strenuous objections or the discovery of more serious problems with this than have turned up so far, I'm inclined to go ahead and commit it fairly soon, so that it attracts some more eyeballs while there's still a little time left in the development cycle to do something about whatever the systematic problems turn out to be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
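
As an illustration of the configuration tuning mentioned above, either of the following settings would sidestep the multi-batch parallel hash join case in a session. This is a minimal sketch; the work_mem value is hypothetical, not a recommendation, and max_parallel_degree is the GUC named in the thread:

    -- Give the hash join enough memory that its hash table fits in a
    -- single batch, so the multi-batch costing question never arises
    -- (hash joins spill to multiple batches when the table exceeds
    -- work_mem; 256MB is an illustrative value):
    SET work_mem = '256MB';

    -- Or run the affected query without parallelism for this session:
    SET max_parallel_degree = 0;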