Re: Identity projection - Mailing list pgsql-hackers

From	Kyotaro HORIGUCHI
Subject	Re: Identity projection
Date	September 14, 2012 06:15:20
Msg-id	20120914.151506.198925875.horiguchi.kyotaro@lab.ntt.co.jp
In response to	Re: Identity projection (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Identity projection Re: Identity projection
List	pgsql-hackers

Hello, thank you for the suggestions.

> > This patch reduces run time of such queries by 45% when result
> > record has 30 columns and seems to have no harm for performance.
> 
> This patch seems quite unsafe to me: it's not generally okay for
> a plan node to return a slot that doesn't belong to it, because of
> tuple-lifespan issues.  It's possible that Result in particular
> could do that safely, but if so we ought to hack nodeResult.c for
> it, not the generic projection machinery.

Hmm.. 

Concerning tuple lifespan, almost every type of node that may
do projection has a code path for the no-projInfo case. This patch
for nodeResult eventually does the same thing. If those are special
cases and the operation cannot be done in general, I should
follow the real (or hidden among the code lines?) lifespan
rules...

From another point of view, execution nodes may hold tuple
slots that were palloc'ed in their projInfos just after receiving
a result tuple from their children. They are finally freed at the
next projection on the same node, or in ExecEndNode() after
processing of the entire current query has finished. The contents
of the slots should remain valid until the next projection in the
upper nodes or until the result tuple is sent. The execution tree
branches only downward, so no node in it can be executed under
different ancestors, and there is no multi-threaded execution.

The above is how I picture it. I suppose these facts (are they
correct?) are enough to guarantee the tuple lifespan.


And concerning the generality of 'identity projection': perhaps
you're right. I encapsulated the logic into ExecProject, but it is
usable from only a few kinds of nodes. I'll revert the modification
to ExecProject and do the identity projection in each node that
can do it.
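
To make clear what I mean by 'identity projection', here is a
minimal sketch in plain C. It is not PostgreSQL code: ToyTargetEntry
and toy_is_identity_projection are hypothetical names used only for
illustration, and the sketch assumes a projection can be described as
a list of entries that are either bare column references or something
more complex. The projection is an identity when output column i is
exactly input column i and no column is added or dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for one output column of a
 * projection. is_simple_var is true when the column is a bare
 * reference to an input column; src_attno is that input column's
 * 1-based position.
 */
typedef struct ToyTargetEntry
{
    bool is_simple_var;
    int  src_attno;
} ToyTargetEntry;

/*
 * Return true when the projection would copy the input tuple
 * unchanged: same number of columns, each one a bare reference
 * to the input column at the same position.
 */
static bool
toy_is_identity_projection(const ToyTargetEntry *tlist,
                           size_t tlist_len,
                           size_t input_natts)
{
    size_t i;

    assert(tlist != NULL || tlist_len == 0);

    if (tlist_len != input_natts)
        return false;           /* columns added or dropped */

    for (i = 0; i < tlist_len; i++)
    {
        if (!tlist[i].is_simple_var)
            return false;       /* computed expression */
        if (tlist[i].src_attno != (int) (i + 1))
            return false;       /* columns reordered */
    }
    return true;
}
```

A node that can prove this once at initialization time could skip
the per-row projection, which is the per-node approach described
above.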

> Something I'd been considering in connection with another example
> is teaching the planner not to generate a Result node in the first
> place, if the node is just doing an identity projection.  There
> are a couple of ways that could be done --- one is to get setrefs.c
> to remove the node on-the-fly, similar to what it does with
> SubqueryScan.  But it might be better just to check whether the
> node is actually needed before creating it in the first place.

I completely agree with the last sentence regarding the Result
node. As I described in the previous message, it was a bit hard
to find a way to do that. I'll look for one with more effort.

> Another point here is that the projection code already special-cases
> simple projections, so it's a bit hard to believe that it's as slow as
> you suggest above.  I wonder if your test case is confusing that
> optimization somehow.

The whole table is in memory, the query is very simple, and the
number of columns is relatively large in this case. This is
because I intended to improve retrieval of a large part of a
partitioned table with many columns.

In this case, the executor shuttles up and down a shallow tree
in which every level does almost nothing, but the Result node does
pfree/palloc and a direct mapping of up to 30 columns, which seems
rather heavy within this execution as a whole. I could find no sign
that the optimization fails in such a simple execution tree...
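
The difference between the two paths can be sketched roughly as
follows. This is a toy model in plain C, not the executor code:
ToySlot, toy_project_copy and toy_project_identity are hypothetical
names, and a slot is assumed to be just a fixed array of 30 values.
The copying path touches every column for every row, while the
identity path only returns a pointer. The price is that the returned
slot still belongs to the child, so its contents change when the
child produces its next tuple.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NATTS 30            /* column count used in my test */

/* Hypothetical, simplified tuple slot: one value per column. */
typedef struct ToySlot
{
    long values[TOY_NATTS];
} ToySlot;

/*
 * Regular path: map every column of the child's slot into the
 * node's own slot, once per row.
 */
static ToySlot *
toy_project_copy(const ToySlot *child, ToySlot *own)
{
    size_t i;

    assert(child != NULL && own != NULL);
    for (i = 0; i < TOY_NATTS; i++)
        own->values[i] = child->values[i];
    return own;
}

/*
 * Identity shortcut: hand the child's slot upward untouched.
 * The result is owned by the child and is only valid until the
 * child is asked for its next tuple.
 */
static ToySlot *
toy_project_identity(ToySlot *child)
{
    assert(child != NULL);
    return child;
}
```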

And the effect of course becomes smaller for fewer columns, for
more complex queries, or for queries on tables that spill out of
memory onto disk.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
