Thread: Use static inline functions for Float <-> Datum conversions

Use static inline functions for Float <-> Datum conversions

From

Heikki Linnakangas

Date:

31 August 2016, 10:48:37

Hi,

Now that we are OK with static inline functions, we can save some cycles
from floating-point functions, by turning Float4GetDatum,
Float8GetDatum, and DatumGetFloat8 into static inlines. They are only a
few instructions, but couldn't be implemented as macros before, because
they need a local union-variable for the conversion.
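
As a rough illustration of the union-based conversion described above, here is a minimal standalone sketch. It is not the attached patch: SketchDatum, sketch_float8_get_datum and sketch_datum_get_float8 are hypothetical stand-ins, and the sketch assumes a build where a Datum is 64 bits wide and float8 is passed by value.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's Datum; assumes a 64-bit,
 * pass-by-value configuration. */
typedef uint64_t SketchDatum;

static_assert(sizeof(double) == sizeof(SketchDatum),
              "sketch assumes double and Datum have the same size");

/* Reinterpret the bits of a double as a Datum through a local union.
 * A macro cannot declare the local variable this needs, which is why a
 * static inline function is used instead. */
static inline SketchDatum
sketch_float8_get_datum(double x)
{
    union
    {
        double      value;
        uint64_t    bits;
    }           u;

    u.value = x;
    return u.bits;
}

/* Inverse conversion: reinterpret the Datum's bits as a double. */
static inline double
sketch_datum_get_float8(SketchDatum d)
{
    union
    {
        uint64_t    bits;
        double      value;
    }           u;

    u.bits = d;
    return u.value;
}
```

Because both helpers are static inline, a caller that includes the header can have the conversion folded into its own code instead of paying for a function call.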

That can add up to significant speedups with float-heavy queries. For
example:

create table floats as select g::float8 as a, g::float8 as b, g::float8
as c from generate_series(1, 1000000) g;

select sum(a+b+c+1) from floats;

The sum query is about 4% faster on my laptop with this patch.

- Heikki

Attachment

0001-Use-static-inline-functions-for-float-Datum-conversi.patch

Re: Use static inline functions for Float <-> Datum conversions

From

Tom Lane

Date:

31 August 2016, 11:38:49

Heikki Linnakangas <hlinnaka@iki.fi> writes:
> Now that we are OK with static inline functions, we can save some cycles 
> from floating-point functions, by turning Float4GetDatum, 
> Float8GetDatum, and DatumGetFloat8 into static inlines.

Looks good to me.

I wonder whether there is a compiler-dependent way of avoiding the union
trick ... or maybe gcc is already smart enough that it doesn't matter?
        regards, tom lane

Re: Use static inline functions for Float <-> Datum conversions

From

Heikki Linnakangas

Date:

31 August 2016, 12:22:37

On 08/31/2016 02:38 PM, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> Now that we are OK with static inline functions, we can save some cycles
>> from floating-point functions, by turning Float4GetDatum,
>> Float8GetDatum, and DatumGetFloat8 into static inlines.
>
> Looks good to me.

Ok, will push.

> I wonder whether there is a compiler-dependent way of avoiding the union
> trick ... or maybe gcc is already smart enough that it doesn't matter?

It seems to compile into a single instruction, so it can't get any 
better from a performance point of view.

float8pl:
.LFB79:
    .loc 1 871 0
    .cfi_startproc
.LVL297:
.LBB959:
.LBB960:
    .loc 2 733 0
    movsd    40(%rdi), %xmm2
.LBE960:
.LBE959:
.LBB961:
.LBB962:
    movsd    32(%rdi), %xmm1
...

A union is probably what language pedants would prefer anyway, and 
anything else would be more of a trick.
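
For comparison, a minimal sketch of the union conversion next to one alternative, copying the bytes with memcpy. The memcpy variant is not something proposed in this thread, and bits_via_union and bits_via_memcpy are hypothetical names. The sketch assumes a 64-bit IEEE 754 double.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <string.h>   /* memcpy */

static_assert(sizeof(double) == sizeof(uint64_t),
              "sketch assumes a 64-bit double");

/* Union-based conversion: write one member, read the other. */
static inline uint64_t
bits_via_union(double x)
{
    union
    {
        double      value;
        uint64_t    bits;
    }           u;

    u.value = x;
    return u.bits;
}

/* memcpy-based conversion: copy the object representation byte for byte. */
static inline uint64_t
bits_via_memcpy(double x)
{
    uint64_t    bits;

    memcpy(&bits, &x, sizeof(bits));
    return bits;
}
```

Both functions return the same bit pattern. Optimizing compilers commonly reduce a small fixed-size memcpy like this to a plain register move, so the two forms would be expected to perform alike, though that depends on the compiler and flags.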

- Heikki

Re: Use static inline functions for Float <-> Datum conversions

From

Tom Lane

Date:

31 August 2016, 13:44:31

Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 08/31/2016 02:38 PM, Tom Lane wrote:
>> I wonder whether there is a compiler-dependent way of avoiding the union
>> trick ... or maybe gcc is already smart enough that it doesn't matter?

> It seems to compile into a single instruction, so it can't get any 
> better from a performance point of view.

Yeah, confirmed here.  On my not-real-new gcc (version 4.4.7, which
ships with RHEL6), these test functions:

Datum
compare_int8(PG_FUNCTION_ARGS)
{
    int64        x = PG_GETARG_INT64(0);
    int64        y = PG_GETARG_INT64(1);

    PG_RETURN_BOOL(x < y);
}

Datum
compare_float8(PG_FUNCTION_ARGS)
{
    double        x = PG_GETARG_FLOAT8(0);
    double        y = PG_GETARG_FLOAT8(1);

    PG_RETURN_BOOL(x < y);
}

compile into this (at -O2):

compare_int8:
    .cfi_startproc
    movq    40(%rdi), %rax
    cmpq    %rax, 32(%rdi)
    setl    %al
    movzbl    %al, %eax
    ret
    .cfi_endproc

compare_float8:
    .cfi_startproc
    movsd    40(%rdi), %xmm0
    xorl    %eax, %eax
    ucomisd    32(%rdi), %xmm0
    seta    %al
    ret
    .cfi_endproc

(Not sure why the compiler does the widening of the comparison result
differently, but it doesn't look like it matters.)  Before this patch,
that looked like:

compare_float8:
    .cfi_startproc
    pushq    %rbx
    .cfi_def_cfa_offset 16
    .cfi_offset 3, -16
    movq    %rdi, %rbx
    subq    $16, %rsp
    .cfi_def_cfa_offset 32
    movq    32(%rdi), %rdi
    call    DatumGetFloat8
    movq    40(%rbx), %rdi
    movsd    %xmm0, 8(%rsp)
    call    DatumGetFloat8
    xorl    %eax, %eax
    ucomisd    8(%rsp), %xmm0
    seta    %al
    addq    $16, %rsp
    .cfi_def_cfa_offset 16
    popq    %rbx
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc

Nice.
        regards, tom lane