Re: Identity projection - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: Identity projection |
Date | |
Msg-id | 20120914.151506.198925875.horiguchi.kyotaro@lab.ntt.co.jp |
In response to | Re: Identity projection (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Identity projection
Re: Identity projection |
List | pgsql-hackers |
Hello, thank you for the suggestions.

> > This patch reduces run time of such queries by 45% when result
> > record has 30 columns and seems to have no harm for performance.
>
> This patch seems quite unsafe to me: it's not generally okay for
> a plan node to return a slot that doesn't belong to it, because of
> tuple-lifespan issues.  It's possible that Result in particular
> could do that safely, but if so we ought to hack nodeResult.c for
> it, not the generic projection machinery.

Hmm. Concerning tuple lifespan, almost every type of node that may do
projection already has a code path for the case with no projInfo. This
patch eventually makes nodeResult do the same thing. If those are
special cases and the operation cannot be done generally, I should
follow the real (or hidden among the code lines?) lifespan rules...

From another point of view, execution nodes may hold tuple slots that
were palloc'ed in their projInfos just after receiving a result tuple
from their children. These are finally freed at the next projection on
the same node, or in ExecEndNode() after processing of the entire
current query has finished. The contents of a slot should remain valid
until the next projection in an upper node or until the result tuple is
sent. The execution tree spreads out toward the bottom, no node in it
can be executed under different ancestors, and there is no
multi-threaded execution.

That is the picture as I see it, and I suppose these facts (are they
correct?) are enough to ensure the tuple lifespan.

As for the genericity of 'identity projection', perhaps you're right.
I encapsulated the logic in ExecProject, but it is usable from only a
few kinds of nodes. I'll revert the modification to ExecProject and do
identity projection in each node that can do it.

> Something I'd been considering in connection with another example
> is teaching the planner not to generate a Result node in the first
> place, if the node is just doing an identity projection.  There
> are a couple of ways that could be done --- one is to get setrefs.c
> to remove the node on-the-fly, similar to what it does with
> SubqueryScan.  But it might be better just to check whether the
> node is actually needed before creating it in the first place.

I completely agree with the last sentence regarding the Result node.
As I described in my previous message, it was a bit hard to find a way
to do that. I'll look for one with more effort.

> Another point here is that the projection code already special-cases
> simple projections, so it's a bit hard to believe that it's as slow as
> you suggest above.  I wonder if your test case is confusing that
> optimization somehow.

In this case the whole table is in memory, the query is very simple,
and the number of columns is relatively large. That is because I
intended to improve retrieval of a large part of a partitioned table
with many columns. The executor shuttles up and down a shallow tree in
which every level does almost nothing, except that the Result node does
pfree/palloc and a direct mapping of up to 30 columns, which seems
rather heavy relative to the whole execution. I could find no sign that
the optimization failed in such a simple execution tree...

The effect of course becomes smaller for fewer columns, for more
complex queries, or for queries on tables that spill out of memory onto
disk.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
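For readers following the thread, a minimal standalone sketch of what "identity projection" means may help. It is a toy model, not PostgreSQL code: `ToyTarget` and `toy_is_identity_projection` are hypothetical names, and a real target list holds expression trees rather than a flag and a column number. The sketch assumes an identity projection is one whose target list is exactly the input columns 1..n, in order, as bare column references.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for one target-list entry.
 * This is NOT PostgreSQL's TargetEntry; it only records whether the
 * output column is a bare reference to an input column, and which one.
 */
typedef struct ToyTarget
{
    bool is_plain_var;  /* true if the entry is a bare column reference */
    int  input_attno;   /* 1-based input column; used only if is_plain_var */
} ToyTarget;

/*
 * Return true when the target list reproduces input columns 1..n in
 * order, with nothing computed, reordered, added, or dropped.
 */
bool
toy_is_identity_projection(const ToyTarget *tlist, size_t ntargets,
                           size_t input_natts)
{
    assert(tlist != NULL || ntargets == 0);

    /* A different column count means columns were added or dropped. */
    if (ntargets != input_natts)
        return false;

    for (size_t i = 0; i < ntargets; i++)
    {
        /* Any computed expression rules out an identity projection. */
        if (!tlist[i].is_plain_var)
            return false;

        /* Output column i + 1 must come from input column i + 1. */
        if (tlist[i].input_attno != (int) (i + 1))
            return false;
    }
    return true;
}
```

When a check of this kind passes, a node could in principle hand its child's tuple upward unchanged instead of copying every column, subject to the tuple-lifespan concerns raised in the message above.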