Home > mailing lists

Re: Hybrid Hash/Nested Loop joins and caching results from subplans - Mailing list pgsql-hackers

From	Andy Fan
Subject	Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date	November 8, 2020 14:52:43
Msg-id	CAKU4AWpoQ_c9Rj2sTVPMGxAb+CAAe4z5ejcrL-89QirCxk=k+A@mail.gmail.com Whole thread Raw
In response to	Re: Hybrid Hash/Nested Loop joins and caching results from subplans (David Rowley <dgrowleyml@gmail.com>)
Responses	Re: Hybrid Hash/Nested Loop joins and caching results from subplans
List	pgsql-hackers

Tree view

On Mon, Nov 2, 2020 at 3:44 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 20 Oct 2020 at 22:30, David Rowley <dgrowleyml@gmail.com> wrote:
>
> So far benchmarking shows there's still a regression from the v8
> version of the patch. This is using count(*). An earlier test [1] did
> show speedups when we needed to deform tuples returned by the nested
> loop node. I've not yet repeated that test again. I was disappointed
> to see v9 slower than v8 after having spent about 3 days rewriting the
> patch

I did some further tests this time with some tuple deforming. Again,
it does seem that v9 is slower than v8.

I run your test case on v8 and v9, I can produce a stable difference between them.

v8:
statement latencies in milliseconds:
1603.611 select count(*) from hundredk hk inner join lookup l on hk.thousand = l.a;

v9:

statement latencies in milliseconds:
1772.287 select count(*) from hundredk hk inner join lookup l on hk.thousand = l.a;

then I did a perf on the 2 version, Is it possible that you called tts_minimal_clear twice in

the v9 version? Both ExecClearTuple and ExecStoreMinimalTuple called tts_minimal_clear

on the same slot.

With the following changes:

diff --git a/src/backend/executor/execMRUTupleCache.c b/src/backend/executor/execMRUTupleCache.c
index 3553dc26cb..b82d8e98b8 100644
--- a/src/backend/executor/execMRUTupleCache.c
+++ b/src/backend/executor/execMRUTupleCache.c
@@ -203,10 +203,9 @@ prepare_probe_slot(MRUTupleCache *mrucache, MRUCacheKey *key)
TupleTableSlot *tslot = mrucache->tableslot;
int numKeys = mrucache->nkeys;

- ExecClearTuple(pslot);
-
if (key == NULL)
{
+ ExecClearTuple(pslot);
/* Set the probeslot's values based on the current parameter values */
for (int i = 0; i < numKeys; i++)
pslot->tts_values[i] = ExecEvalExpr(mrucache->param_exprs[i],
@@ -641,7 +640,7 @@ ExecMRUTupleCacheFetch(MRUTupleCache *mrucache)
{
mrucache->state = MRUCACHE_FETCH_NEXT_TUPLE;

- ExecClearTuple(mrucache->cachefoundslot);
+ // ExecClearTuple(mrucache->cachefoundslot);
slot = mrucache->cachefoundslot;
ExecStoreMinimalTuple(mrucache->last_tuple->mintuple, slot, false);
return slot;
@@ -740,7 +739,7 @@ ExecMRUTupleCacheFetch(MRUTupleCache *mrucache)
return NULL;
}

- ExecClearTuple(mrucache->cachefoundslot);
+ // ExecClearTuple(mrucache->cachefoundslot);
slot = mrucache->cachefoundslot;
ExecStoreMinimalTuple(mrucache->last_tuple->mintuple, slot, false);
return slot;

v9 has the following result:

1608.048 select count(*) from hundredk hk inner join lookup l on hk.thousand = l.a;

Graphs attached

Looking at profiles, I don't really see any obvious reason as to why
this is. I'm very much inclined to just pursue the v8 patch (separate
Result Cache node) and just drop the v9 idea altogether.

David

Best Regards

Andy Fan

pgsql-hackers by date:

From: Andy Fan
Date: 08 November 2020, 13:26:51
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Marina Polyakova
Date: 08 November 2020, 14:59:19
Subject: Re: pgbench stopped supporting large number of client connections on Windows

Re: Hybrid Hash/Nested Loop joins and caching results from subplans - Mailing list pgsql-hackers

Previous

Next