Thread: reduce size of fmgr_builtins array

reduce size of fmgr_builtins array

From

John Naylor

Date:

23 December 2019, 14:52:18

Hi all,

Currently, we include the function name string in each FmgrBuiltin
struct, whose size is 24 bytes on 64 bit platforms. As far as I can
tell, the name is usually unused, so the attached (WIP, untested)
patch stores it separately, reducing this struct to 16 bytes.

We can go one step further and allocate the names as a single
character string, reducing the binary size. It doesn't help much to
store offsets, since there are ~40k characters, requiring 32-bit
offsets. If we instead compute the offset on the fly from stored name
lengths, we can use 8-bit values, saving 19kB of space in the binary
over using string pointers.

--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v1-fmgr-func-names.patch

Re: reduce size of fmgr_builtins array

From

John Naylor

Date:

01 January 2020, 23:15:47

I wrote:

> Currently, we include the function name string in each FmgrBuiltin
> struct, whose size is 24 bytes on 64 bit platforms. As far as I can
> tell, the name is usually unused, so the attached (WIP, untested)
> patch stores it separately, reducing this struct to 16 bytes.
>
> We can go one step further and allocate the names as a single
> character string, reducing the binary size. It doesn't help much to
> store offsets, since there are ~40k characters, requiring 32-bit
> offsets. If we instead compute the offset on the fly from stored name
> lengths, we can use 8-bit values, saving 19kB of space in the binary
> over using string pointers.

I tested with the attached C function to make sure
fmgr_internal_function() still returned the correct answer. I assume
this is not a performance critical function, but I still wanted to see
if there was a visible performance regression. I get this when calling
fmgr_internal_function() 100k times:

master: 833ms
patch: 886ms

The point of the patch is to increase the likelihood of
fmgr_isbuiltin() finding the fmgr_builtins[] element in L1 cache. It
seems harder to put a number on that for a realistic workload, but
reducing the array size by 1/3 couldn't hurt. I'll go ahead and add
this to the commitfest.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

drive_builtin_by_name.c

Re: reduce size of fmgr_builtins array

From

Heikki Linnakangas

Date:

07 January 2020, 13:08:46

On 02/01/2020 01:15, John Naylor wrote:
> I wrote:
> 
>> Currently, we include the function name string in each FmgrBuiltin
>> struct, whose size is 24 bytes on 64 bit platforms. As far as I can
>> tell, the name is usually unused, so the attached (WIP, untested)
>> patch stores it separately, reducing this struct to 16 bytes.
>>
>> We can go one step further and allocate the names as a single
>> character string, reducing the binary size. It doesn't help much to
>> store offsets, since there are ~40k characters, requiring 32-bit
>> offsets. If we instead compute the offset on the fly from stored name
>> lengths, we can use 8-bit values, saving 19kB of space in the binary
>> over using string pointers.
> 
> I tested with the attached C function to make sure
> fmgr_internal_function() still returned the correct answer. I assume
> this is not a performance critical function, but I still wanted to see
> if there was a visible performance regression. I get this when calling
> fmgr_internal_function() 100k times:
> 
> master: 833ms
> patch: 886ms

Hmm. I was actually expecting this to slightly speed up 
fmgr_internal_function(), now that all the names fit in a smaller amount 
of cache. I guess there are more branches or a data dependency or 
something now. I'm not too worried about that though. If it mattered, we 
should switch to binary searching the array.

> The point of the patch is to increase the likelihood of
> fmgr_isbuiltin() finding the fmgr_builtins[] element in L1 cache. It
> seems harder to put a number on that for a realistic workload, but
> reducing the array size by 1/3 couldn't hurt.

Yeah. Nevertheless, it would be nice to be able to demonstrate the 
benefit in some test, at least. It feels hard to justify committing a 
performance patch if we can't show the benefit. Otherwise, we should 
just try to keep it as simple as possible, to optimize for readability.

A similar approach was actually discussed a couple of years back: 
https://www.postgresql.org/message-id/bd13812c-c4ae-3788-5b28-5633beed2929%40iki.fi. 
The conclusion then was that it's not worth the trouble or the code 
complication. So I think this patch is Rejected, unless you can come up 
with a test case that concretely shows the benefit.

- Heikki

Re: reduce size of fmgr_builtins array

From

John Naylor

Date:

18 January 2020, 04:52:26

On Tue, Jan 7, 2020 at 9:08 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Yeah. Nevertheless, it would be nice to be able to demonstrate the
> benefit in some test, at least. It feels hard to justify committing a
> performance patch if we can't show the benefit. Otherwise, we should
> just try to keep it as simple as possible, to optimize for readability.
>
> A similar approach was actually discussed a couple of years back:
> https://www.postgresql.org/message-id/bd13812c-c4ae-3788-5b28-5633beed2929%40iki.fi.
> The conclusion then was that it's not worth the trouble or the code
> complication. So I think this patch is Rejected, unless you can come up
> with a test case that concretely shows the benefit.

Thanks for reviewing! As expected, a microbenchmark didn't show a
difference. I could try profiling in some workload, but I don't think
the benefit would be worth the effort involved. I've marked it
rejected.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services