Re: Question about lazy_space_alloc() / linux over-commit - Mailing list pgsql-hackers
From: Jim Nasby
Subject: Re: Question about lazy_space_alloc() / linux over-commit
Msg-id: 54FE1AC6.8080502@BlueTreble.com
In response to: Re: Question about lazy_space_alloc() / linux over-commit (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses: Re: Question about lazy_space_alloc() / linux over-commit
           Re: Question about lazy_space_alloc() / linux over-commit
List: pgsql-hackers
On 3/9/15 12:28 PM, Alvaro Herrera wrote:
> Robert Haas wrote:
>> On Sat, Mar 7, 2015 at 5:49 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>>> On 2015-03-05 15:28:12 -0600, Jim Nasby wrote:
>>>> I was thinking the simpler route of just repalloc'ing... the memcpy would
>>>> suck, but much less so than the extra index pass. 64M gets us 11M tuples,
>>>> which probably isn't very common.
>>>
>>> That has the chance of considerably increasing the peak memory usage
>>> though, as you obviously need both the old and new allocation during the
>>> repalloc().
>>>
>>> And in contrast to the unused memory at the tail of the array, which
>>> will usually not be actually allocated by the OS at all, this is memory
>>> that's actually read/written respectively.
>>
>> Yeah, I'm not sure why everybody wants to repalloc() that instead of
>> making several separate allocations as needed. That would avoid
>> increasing peak memory usage, and would avoid any risk of needing to
>> copy the whole array. Also, you could grow in smaller chunks, like
>> 64MB at a time instead of 1GB or more at a time. Doubling an
>> allocation that's already 1GB or more gets big in a hurry.
>
> Yeah, a chunk list rather than a single chunk seemed a good idea to me
> too.

That will be significantly more code than a simple repalloc, but as long
as people are OK with that I can go that route. (A rough sketch of the
chunk-list idea is appended below.)

> Also, I think the idea of starting with an allocation assuming some
> small number of dead tuples per page made sense -- and by the time that
> space has run out, you have a better estimate of actual density of dead
> tuples, so you can do the second allocation based on that new estimate
> (but perhaps clamp it at say 1 GB, just in case you just scanned a
> portion of the table with an unusually high percentage of dead tuples.)

I like the idea of using a fresh estimate of dead-tuple density when we
need more space. We would also clamp this at maintenance_work_mem, not a
fixed 1GB.

Speaking of which... people have mentioned allowing > 1GB of dead tuples,
which means allowing maintenance_work_mem > MAX_KILOBYTES. The comment for
that says:

/* upper limit for GUC variables measured in kilobytes of memory */
/* note that various places assume the byte size fits in a "long" variable */

So I'm not sure how well that will work. I think that needs to be a
separate patch.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
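
For illustration, here is a minimal standalone sketch of the chunk-list
approach discussed above: fixed-size chunks of dead-tuple TIDs strung on a
singly linked list, so growing the collection never repallocs or copies the
existing entries. All names here (DeadTidList, DeadTidChunk, dead_tid_record,
CHUNK_TIDS) are hypothetical and not taken from any actual patch; plain
malloc/free stand in for palloc/pfree, and a small DeadTid struct stands in
for PostgreSQL's ItemPointerData.

/*
 * Hypothetical, standalone sketch of the "chunk list" idea: instead of one
 * big repalloc'd array of dead-tuple TIDs, grow a linked list of fixed-size
 * chunks, so no copy (and no doubled peak memory) is ever needed.
 *
 * None of these names come from an actual patch; malloc/free stand in for
 * palloc/pfree, and DeadTid stands in for PostgreSQL's ItemPointerData.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

typedef struct DeadTid              /* stand-in for ItemPointerData */
{
    uint32_t    blkno;
    uint16_t    offnum;
} DeadTid;

/*
 * TIDs per chunk.  Real code would likely size each new chunk from the
 * running dead-tuple density estimate, capped by maintenance_work_mem,
 * rather than using a compile-time constant.
 */
#define CHUNK_TIDS  (64 * 1024)

typedef struct DeadTidChunk
{
    struct DeadTidChunk *next;
    int         ntids;              /* used entries in tids[] */
    DeadTid     tids[CHUNK_TIDS];
} DeadTidChunk;

typedef struct DeadTidList
{
    DeadTidChunk *head;
    DeadTidChunk *tail;
    long        total;              /* total TIDs across all chunks */
} DeadTidList;

static void
dead_tid_record(DeadTidList *list, uint32_t blkno, uint16_t offnum)
{
    DeadTidChunk *chunk = list->tail;

    if (chunk == NULL || chunk->ntids >= CHUNK_TIDS)
    {
        /* Current chunk is full: allocate another one; never copy. */
        chunk = malloc(sizeof(DeadTidChunk));
        chunk->next = NULL;
        chunk->ntids = 0;
        if (list->tail != NULL)
            list->tail->next = chunk;
        else
            list->head = chunk;
        list->tail = chunk;
    }
    chunk->tids[chunk->ntids].blkno = blkno;
    chunk->tids[chunk->ntids].offnum = offnum;
    chunk->ntids++;
    list->total++;
}

int
main(void)
{
    DeadTidList list = {NULL, NULL, 0};

    /* Pretend the heap scan found 20 dead tuples on each of 10,000 pages. */
    for (uint32_t blk = 0; blk < 10000; blk++)
        for (uint16_t off = 1; off <= 20; off++)
            dead_tid_record(&list, blk, off);

    int         nchunks = 0;
    for (DeadTidChunk *c = list.head; c != NULL; c = c->next)
        nchunks++;
    printf("%ld dead TIDs in %d chunks\n", list.total, nchunks);

    while (list.head != NULL)
    {
        DeadTidChunk *next = list.head->next;
        free(list.head);
        list.head = next;
    }
    return 0;
}

The trade-off, and presumably part of why this is "significantly more code"
than a repalloc, is that the index-cleanup pass can no longer do one binary
search over a single flat array; a lookup (e.g. in something like
lazy_tid_reaped()) would first have to locate the right chunk before
searching within it.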