Re: Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers
From: Pavan Deolasee
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Msg-id: CABOikdN1O7PEUK+TFAAHxQGG3HTuGmXY5B7mcW=G8MgEHrojpw@mail.gmail.com
In response to: Re: Vacuum: allow usage of more than 1GB of work mem (Claudio Freire <klaussfreire@gmail.com>)
Responses: Re: Vacuum: allow usage of more than 1GB of work mem
           Re: Vacuum: allow usage of more than 1GB of work mem
List: pgsql-hackers
On Fri, Sep 16, 2016 at 12:24 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, Sep 15, 2016 at 3:48 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> For example, we always allocate the TID array as large as we can fit into
>> m_w_m, but maybe we don't need to wait with switching to the bitmap until
>> filling the whole array - we could wait as long as the bitmap fits into the
>> remaining part of the array, build it there and then copy it to the
>> beginning (and use the bitmap from that point).
>
> The bitmap can be created like that, but grow from the end of the
> segment backwards.
>
> So the scan can proceed until the bitmap fills the whole segment
> (filling backwards), no copy required.
But I actually wonder if we are over-engineering things and overestimating the cost of memmove etc. How about this simpler approach:
1. Divide the table into fixed-size chunks, as Robert suggested. Say 1GB each.
2. Allocate a pointer array with one entry per chunk. For a 1TB table, that's about 8192 bytes (1024 chunks x 8-byte pointers).
3. Allocate a bitmap that can hold MaxHeapTuplesPerPage bits for every page in the chunk. For an 8KB block size and a 1GB chunk, that's about 4.6MB. Note: I'm suggesting a bitmap here because a fixed-size TID array provisioned for the worst case would cost us 200MB+, whereas the bitmap is just 4.6MB.
4. Collect dead tuples for that 1GB chunk in the bitmap. Also collect stats so that we know the most compact representation.
5. At the end of the 1GB scan, if no dead tuples were found, set the chunk pointer to NULL, move to the next chunk and restart from step 4. If dead tuples were found, then check:
 a. Whether the bitmap can be compressed further by using fewer bits per page. If so, allocate a new, smaller bitmap and compress into it.
 b. Whether a TID array would be a more compact representation. If so, allocate a TID array of the right size and convert the bitmap into an array.
 c. Set the chunk pointer to whichever representation we choose (of course, add headers etc. so we can interpret the representation later).
6. Continue until we consume all of m_w_m or reach the end of the table. If we consume all of m_w_m, do a round of index cleanup and restart the heap scan. (A rough sketch of the chunk bookkeeping follows below.)
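
To make the bookkeeping above concrete, here is a rough, standalone sketch of the chunk directory. All names here (ChunkRepr, DeadTuplesChunk, alloc_chunk_directory) are made up for illustration, not from any existing patch, and the sizes assume 8KB heap pages:

#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE          ((uint64_t) 1 << 30)        /* 1GB chunk */
#define BLOCK_SIZE          8192                        /* assumed 8KB heap pages */
#define BLOCKS_PER_CHUNK    (CHUNK_SIZE / BLOCK_SIZE)   /* 131072 blocks = 2^17 */

/* Which representation a scanned chunk ended up with (step 5 above). */
typedef enum ChunkRepr
{
    CHUNK_BITMAP,       /* ~291 bits per page (MaxHeapTuplesPerPage), ~4.6MB per 1GB chunk */
    CHUNK_TID_ARRAY     /* packed TID array, chosen when it is more compact */
} ChunkRepr;

/* Header stored in front of whichever representation we keep for a chunk. */
typedef struct DeadTuplesChunk
{
    ChunkRepr   repr;       /* how to interpret the bytes that follow */
    uint32_t    ntuples;    /* dead tuples recorded in this chunk */
    /* representation-specific data follows the header */
} DeadTuplesChunk;

/*
 * One pointer per 1GB chunk, left NULL if the chunk had no dead tuples.
 * A 1TB table has 1024 chunks, i.e. ~8192 bytes of pointers.
 */
static DeadTuplesChunk **
alloc_chunk_directory(uint64_t table_size)
{
    uint64_t    nchunks = (table_size + CHUNK_SIZE - 1) / CHUNK_SIZE;

    return calloc(nchunks, sizeof(DeadTuplesChunk *));
}
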
I also realised that we can compact the TID array in step (b) above: we only need 17 bits for the block number (we already know which 1GB chunk it belongs to), and since usable offsets fit in 13 bits, the TID array needs only 4 bytes per TID instead of 6.
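
A minimal sketch of that 4-byte packing (the names pack_tid etc. are hypothetical; it assumes 2^17 blocks of 8KB per 1GB chunk, leaving 15 bits for the offset, more than enough for the 13 usable bits):

#include <stdint.h>

#define CHUNK_BLOCK_BITS    17      /* 2^17 8KB blocks per 1GB chunk */
#define TID_OFFSET_BITS     (32 - CHUNK_BLOCK_BITS)
#define TID_OFFSET_MASK     ((1u << TID_OFFSET_BITS) - 1)

typedef uint32_t PackedTid;         /* 4 bytes instead of the usual 6 */

static inline PackedTid
pack_tid(uint32_t block_in_chunk, uint32_t offnum)
{
    return (block_in_chunk << TID_OFFSET_BITS) | (offnum & TID_OFFSET_MASK);
}

static inline uint32_t
packed_tid_block(PackedTid tid)
{
    return tid >> TID_OFFSET_BITS;
}

static inline uint32_t
packed_tid_offset(PackedTid tid)
{
    return tid & TID_OFFSET_MASK;
}

Since the packed values preserve ordering within a chunk, a sorted array of them can still be binary-searched during index cleanup once the chunk is identified from the block number.
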
Many people are working on implementing different ideas, and I can volunteer to write a patch along these lines unless someone else wants to do that.
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services