Currently, an executor node can hold onto large data structures until
ExecEndNode(). For complex plans, that means we can use a lot more
memory than we need to. For instance, see the SQL at the end of this
email.
(Note: this work is not intended to implement any kind of query-level
work_mem enforcement. It reduces the maximum number of times that
work_mem is used in certain kinds of complex plans, which may help with
that problem, but enforcing that is not my goal right now.)
The attached proof-of-concept patch (0001) makes the Hash node call
ExecShutdownNode() on the subplan when it's exhausted, and extends
ExecShutdownHash() to release memory. I added some code to measure the
peak memory usage (0002), and it went from about 200MB to about 150MB.
The savings roughly correspond to the memory used by the first Hash
node.
I'm not 100% sure that ExecShutdownNode() is the right place to do
this, but it seems better than introducing yet another executor API.
Thoughts?
We'd need to figure out what to do about rescan. One option is to just
say that if EXEC_FLAG_REWIND is used, then it would never free the
resources eagerly. But if it's under a memoize, then it's unlikely to
be called with the same parameters again, so whatever state it already
has might be useless anyway.
Also, are there any major challenges making this work with parallel
query?
Regards,
Jeff Davis
Example:
CREATE TABLE t1(id1 int8 primary key);
CREATE TABLE t2(id2 int8 primary key, id1 int8);
CREATE TABLE t3(id3 int8, id2 int8, n NUMERIC);
INSERT INTO t1 SELECT g FROM generate_series(1, 1000000) g;
INSERT INTO t2 SELECT g+1000000000, g
FROM generate_series(1, 1000000) g;
INSERT INTO t3 SELECT g+2000000000, g+1000000000, 3.14159
FROM generate_series(1, 1000000) g;
INSERT INTO t3 SELECT
random(1,1000000) + 3000000000,
random(1,1000000) + 3000000000,
-1
FROM generate_series(1, 10000000) g;
VACUUM ANALYZE;
SET work_mem = '1GB';
SET from_collapse_limit = 1;
SET enable_sort = off;
SET enable_nestloop = off;
SET max_parallel_workers = 0;
SET max_parallel_workers_per_gather = 0;
EXPLAIN (COSTS OFF)
SELECT id1, COUNT(*) FROM
t3,
(SELECT t1.id1, t2.id2 FROM t1, t2 WHERE t2.id1 = t1.id1) s
WHERE t3.id2 = s.id2
GROUP BY id1;
QUERY PLAN
--------------------------------------------------
HashAggregate
Group Key: t1.id1
-> Hash Join
Hash Cond: (t3.id2 = t2.id2)
-> Seq Scan on t3
-> Hash
-> Hash Join
Hash Cond: (t2.id1 = t1.id1)
-> Seq Scan on t2
-> Hash
-> Seq Scan on t1