Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers
| From | Kyotaro HORIGUCHI |
|---|---|
| Subject | Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem |
| Date | |
| Msg-id | 20180208.144435.79983345.horiguchi.kyotaro@lab.ntt.co.jp |
| In response to | Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
| Responses | Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem |
| List | pgsql-hackers |
Hello, I looked at this a bit closer.

Upthread[1], Robert mentioned the exponentially increasing size of additional segments:

>> Hmm, I had imagined making all of the segments the same size rather
>> than having the size grow exponentially. The whole point of this is
>> to save memory, and even in the worst case you don't end up with that
>> many segments as long as you pick a reasonable base size (e.g. 64MB).
>
> Wastage is bound by a fraction of the total required RAM, that is,
> it's proportional to the amount of required RAM, not the amount
> allocated. So it should still be fine, and the exponential strategy
> should improve lookup performance considerably.

It seems that you are misreading him. (In any case, I'm not sure what you meant by the above; memory that has not been allocated yet is not wasted.)

The final number of dead tuples in a heap scan cannot be determined until the scan ends. If a new dead tuple forces allocation of a new, say, 512MB segment and the scan ends just after that, the wastage is almost the whole of that segment.

On the other hand, I don't think the exponential strategy makes things considerably better. The bsearch iterations in lazy_tid_reaped() are split between the segment search and the in-segment tid search; intuitively, increasing the segment size mostly just moves iterations from the former to the latter. (A standalone sketch of this two-level lookup is appended after the attached script below.)

I made a calculation[2]. With maintenance_work_mem of 4096MB, the exponential strategy gives 6 segments and an expected number of bsearch iterations of about 20.8. With 64MB fixed-size segments we get 64 segments (which is not that many), and the expected number of iterations is 20.4. (I suppose the small increase for the exponential case comes from the imbalance in segment sizes.)

In addition, as Robert mentioned, the possible maximum memory wastage is about 2GB with the exponential strategy versus 64MB with the fixed-size strategy.

Seeing these numbers, I am not inclined to take the exponential strategy.

[1] https://www.postgresql.org/message-id/CAGTBQpbZX5S4QrnB6YP-2Nk+A9bxbaVktzKwsGvMeov3MTgdiQ@mail.gmail.com
[2] See the attached perl script. I hope it is correct.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

```perl
#! /usr/bin/perl

# Estimate the expected number of bsearch iterations in lazy_tid_reaped()
# for exponentially growing vs. fixed-size dead-tuple segments.
# Note: log() is the natural log, so the "iter" values are relative costs
# rather than literal comparison counts; the comparison between the two
# strategies is unaffected.

$maxmem = 1024 * 4;    # maintenance_work_mem, in MB

#=====
print "exponential sized strategy\n";
$ss = 64;
$ts = 0;
$sumiteritem = 0;
for ($i = 1; $ts < $maxmem; $i++)
{
    $ss = $ss * 2;
    if ($ts + $ss > $maxmem)
    {
        $ss = $maxmem - $ts;    # clip the last segment to the memory limit
    }
    $ts += $ss;
    $ntups = $ts * 1024 * 1024 / 6;         # 6 bytes per ItemPointerData
    $ntupinseg = $ss * 1024 * 1024 / 6;
    $npages = $ntups / 291;                 # MaxHeapTuplesPerPage on 8kB pages
    $tsize = $npages * 8192.0 / 1024 / 1024 / 1024;
    $sumiteritem += log($ntupinseg) * $ntupinseg;    # weight by share of all tuples
    printf("#%d : segsize=%dMB total=%dMB, (tuples = %ld, min tsize=%.1fGB), iterseg(%d)=%f, iteritem(%d) = %f, expectediter=%f\n",
           $i, $ss, $ts, $ntups, $tsize,
           $i, log($i), $ntupinseg, log($ntupinseg),
           log($i) + $sumiteritem / $ntups);
}

print "\n\nfixed sized strategy\n";
$ss = 64;
$ts = 0;
for ($i = 1; $ts < $maxmem; $i++)
{
    $ts += $ss;
    $ntups = $ts * 1024 * 1024 / 6;
    $ntupinseg = $ss * 1024 * 1024 / 6;
    $npages = $ntups / 291;
    $tsize = $npages * 8192.0 / 1024 / 1024 / 1024;
    printf("#%d : segsize=%dMB total=%dMB, (tuples = %ld, min tsize=%.1fGB), iterseg(%d)=%f, iteritem(%d) = %f, expectediter=%f\n",
           $i, $ss, $ts, $ntups, $tsize,
           $i, log($i), $ntupinseg, log($ntupinseg),
           log($i) + log($ntupinseg));
}
```
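For what it's worth, here is a standalone sketch of the two-level lookup that the calculation above models. It is not taken from the patch; the names (`Tid`, `Segment`, `tid_is_dead`, etc.) and the minimal TID type are made up for illustration. A lookup does one binary search over the segments' first TIDs and then one binary search inside the chosen segment, so it costs roughly log2(number of segments) plus log2(tuples in that segment).

```c
/*
 * Illustrative sketch only: a two-level binary search over per-segment
 * sorted TID arrays. Types and names are hypothetical, not the patch's.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    unsigned int   block;       /* heap block number */
    unsigned short offset;      /* line pointer offset */
} Tid;

typedef struct
{
    Tid    *tids;               /* dead-tuple TIDs of this segment, sorted */
    size_t  ntids;
} Segment;

static int
tid_cmp(const Tid *a, const Tid *b)
{
    if (a->block != b->block)
        return a->block < b->block ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

/* Level 1: find the last segment whose first TID is <= the target. */
static int
find_segment(const Segment *segs, int nsegs, const Tid *tid)
{
    int lo = 0, hi = nsegs - 1, result = -1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (tid_cmp(&segs[mid].tids[0], tid) <= 0)
        {
            result = mid;
            lo = mid + 1;
        }
        else
            hi = mid - 1;
    }
    return result;
}

/* Level 2: binary search within the chosen segment. */
static bool
tid_is_dead(const Segment *segs, int nsegs, const Tid *tid)
{
    int    seg = find_segment(segs, nsegs, tid);
    size_t lo, hi;

    if (seg < 0 || segs[seg].ntids == 0)
        return false;

    lo = 0;
    hi = segs[seg].ntids;       /* half-open interval [lo, hi) */
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        int    c = tid_cmp(&segs[seg].tids[mid], tid);

        if (c == 0)
            return true;
        if (c < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}

int
main(void)
{
    /* Two tiny segments, just to exercise the lookup path. */
    Tid     s0[] = {{1, 1}, {1, 3}, {2, 5}};
    Tid     s1[] = {{10, 2}, {11, 7}};
    Segment segs[] = {{s0, 3}, {s1, 2}};
    Tid     probe = {11, 7};

    printf("dead? %d\n", tid_is_dead(segs, 2, &probe));
    return 0;
}
```

With fixed 64MB segments the first term is the same log2(64) for every lookup and the second term is identical across segments; with exponentially growing segments most tuples sit in the largest segments, which is why the script weights the in-segment term by each segment's share of all tuples.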