Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers
| From | Kyotaro HORIGUCHI |
|---|---|
| Subject | Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem |
| Date | |
| Msg-id | 20180208.144435.79983345.horiguchi.kyotaro@lab.ntt.co.jp |
| In response to | Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
| Responses | Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem |
| List | pgsql-hackers |
Hello, I looked at this a bit closer.

Upthread[1], Robert mentioned the exponentially increasing size of additional segments:

>> Hmm, I had imagined making all of the segments the same size rather
>> than having the size grow exponentially. The whole point of this is
>> to save memory, and even in the worst case you don't end up with that
>> many segments as long as you pick a reasonable base size (e.g. 64MB).
>
> Wastage is bound by a fraction of the total required RAM, that is,
> it's proportional to the amount of required RAM, not the amount
> allocated. So it should still be fine, and the exponential strategy
> should improve lookup performance considerably.

It seems that you are misreading him. (In any case, I'm not sure what you meant by the above; memory that has not been allocated yet is not wasted.)

The final number of dead tuples in a heap scan cannot be determined until the scan ends. If a new dead tuple forces allocation of a new, say, 512MB segment and the scan ends just after that, the wastage is almost the whole of that segment.

On the other hand, I don't think the exponential strategy makes things considerably better. The bsearch iterations in lazy_tid_reaped() are split between the segment search and the in-segment tid search; intuitively, increasing the segment size mostly just moves iterations from the former to the latter. (A standalone sketch of this two-level lookup is appended after the attached script below.)

I made a calculation[2]. With maintenance_work_mem of 4096MB, the exponential strategy gives 6 segments and an expected number of bsearch iterations of about 20.8. With 64MB fixed-size segments we get 64 segments (which is not that many), and the expected number of iterations is 20.4. (I suppose the small increase for the exponential case comes from the imbalance in segment sizes.)

In addition, as Robert mentioned, the possible maximum memory wastage is about 2GB with the exponential strategy versus 64MB with the fixed-size strategy.

Seeing these numbers, I am not inclined to take the exponential strategy.

[1] https://www.postgresql.org/message-id/CAGTBQpbZX5S4QrnB6YP-2Nk+A9bxbaVktzKwsGvMeov3MTgdiQ@mail.gmail.com
[2] See the attached perl script. I hope it is correct.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

```perl
#! /usr/bin/perl

# Estimate the expected number of bsearch iterations in lazy_tid_reaped()
# for exponentially growing vs. fixed-size dead-tuple segments.
# Note: log() is the natural log, so the "iter" values are relative costs
# rather than literal comparison counts; the comparison between the two
# strategies is unaffected.

$maxmem = 1024 * 4;    # maintenance_work_mem, in MB

#=====
print "exponential sized strategy\n";
$ss = 64;
$ts = 0;
$sumiteritem = 0;
for ($i = 1; $ts < $maxmem; $i++)
{
    $ss = $ss * 2;
    if ($ts + $ss > $maxmem)
    {
        $ss = $maxmem - $ts;    # clip the last segment to the memory limit
    }
    $ts += $ss;
    $ntups = $ts * 1024 * 1024 / 6;         # 6 bytes per ItemPointerData
    $ntupinseg = $ss * 1024 * 1024 / 6;
    $npages = $ntups / 291;                 # MaxHeapTuplesPerPage on 8kB pages
    $tsize = $npages * 8192.0 / 1024 / 1024 / 1024;
    $sumiteritem += log($ntupinseg) * $ntupinseg;    # weight by share of all tuples
    printf("#%d : segsize=%dMB total=%dMB, (tuples = %ld, min tsize=%.1fGB), iterseg(%d)=%f, iteritem(%d) = %f, expectediter=%f\n",
           $i, $ss, $ts, $ntups, $tsize,
           $i, log($i), $ntupinseg, log($ntupinseg),
           log($i) + $sumiteritem / $ntups);
}

print "\n\nfixed sized strategy\n";
$ss = 64;
$ts = 0;
for ($i = 1; $ts < $maxmem; $i++)
{
    $ts += $ss;
    $ntups = $ts * 1024 * 1024 / 6;
    $ntupinseg = $ss * 1024 * 1024 / 6;
    $npages = $ntups / 291;
    $tsize = $npages * 8192.0 / 1024 / 1024 / 1024;
    printf("#%d : segsize=%dMB total=%dMB, (tuples = %ld, min tsize=%.1fGB), iterseg(%d)=%f, iteritem(%d) = %f, expectediter=%f\n",
           $i, $ss, $ts, $ntups, $tsize,
           $i, log($i), $ntupinseg, log($ntupinseg),
           log($i) + log($ntupinseg));
}
```
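For what it's worth, here is a standalone sketch of the two-level lookup that the calculation above models. It is not taken from the patch; the names (`Tid`, `Segment`, `tid_is_dead`, etc.) and the minimal TID type are made up for illustration. A lookup does one binary search over the segments' first TIDs and then one binary search inside the chosen segment, so it costs roughly log2(number of segments) plus log2(tuples in that segment).

```c
/*
 * Illustrative sketch only: a two-level binary search over per-segment
 * sorted TID arrays. Types and names are hypothetical, not the patch's.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    unsigned int   block;       /* heap block number */
    unsigned short offset;      /* line pointer offset */
} Tid;

typedef struct
{
    Tid    *tids;               /* dead-tuple TIDs of this segment, sorted */
    size_t  ntids;
} Segment;

static int
tid_cmp(const Tid *a, const Tid *b)
{
    if (a->block != b->block)
        return a->block < b->block ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

/* Level 1: find the last segment whose first TID is <= the target. */
static int
find_segment(const Segment *segs, int nsegs, const Tid *tid)
{
    int lo = 0, hi = nsegs - 1, result = -1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (tid_cmp(&segs[mid].tids[0], tid) <= 0)
        {
            result = mid;
            lo = mid + 1;
        }
        else
            hi = mid - 1;
    }
    return result;
}

/* Level 2: binary search within the chosen segment. */
static bool
tid_is_dead(const Segment *segs, int nsegs, const Tid *tid)
{
    int    seg = find_segment(segs, nsegs, tid);
    size_t lo, hi;

    if (seg < 0 || segs[seg].ntids == 0)
        return false;

    lo = 0;
    hi = segs[seg].ntids;       /* half-open interval [lo, hi) */
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        int    c = tid_cmp(&segs[seg].tids[mid], tid);

        if (c == 0)
            return true;
        if (c < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}

int
main(void)
{
    /* Two tiny segments, just to exercise the lookup path. */
    Tid     s0[] = {{1, 1}, {1, 3}, {2, 5}};
    Tid     s1[] = {{10, 2}, {11, 7}};
    Segment segs[] = {{s0, 3}, {s1, 2}};
    Tid     probe = {11, 7};

    printf("dead? %d\n", tid_is_dead(segs, 2, &probe));
    return 0;
}
```

With fixed 64MB segments the first term is the same log2(64) for every lookup and the second term is identical across segments; with exponentially growing segments most tuples sit in the largest segments, which is why the script weights the in-segment term by each segment's share of all tuples.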